c# - How to match the 2nd occurrence of a string with regex?
I have a URL like this:
http://www.abc.com/h/x/y
and I want to parse "x/y" from it using regex. I am using the following regex:
h/(?<group>[\s\S]*?)\s*?/
but it matches only "x", while I want "x/y". I can find the 2nd occurrence of '/' using a programming language, but I want to parse this with regex only.
Please help.
The final regular expression depends on which URLs you consider valid to parse, e.g. is h constant, or can it change as well?
I'd use this:
http://(?:[a-z\d\-]+\.)*[a-z\d]+/h/(.*)
- The first part (http://), matching the protocol, is rather obvious.
- The non-capturing group ((?:[a-z\d\-]+\.)*) with the * quantifier will match any (sub)domains under the TLD, including the trailing dot (if any). If an IP is given, it will contain the first part of the IP.
- [a-z\d]+ will match the TLD or, for intranet stuff, the domain name (like localhost). In case an IP is given, it will contain the last byte.
- The actual capture group ((.*)) will capture everything following /h/.
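Since the question is tagged c#, here is a minimal sketch of how the pattern above could be applied with System.Text.RegularExpressions. The class name UrlTail and the method name Extract are made up for illustration, and RegexOptions.IgnoreCase is an assumption added because the character classes only list lowercase letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlTail
{
    // The pattern from the answer, unchanged. IgnoreCase is an added assumption
    // so that hosts such as WWW.ABC.COM are accepted as well.
    private static readonly Regex Pattern = new Regex(
        @"http://(?:[a-z\d\-]+\.)*[a-z\d]+/h/(.*)",
        RegexOptions.IgnoreCase);

    // Returns everything after "/h/" (capture group 1),
    // or null when the URL does not match the pattern.
    public static string Extract(string url)
    {
        Match m = Pattern.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

With the URL from the question, UrlTail.Extract("http://www.abc.com/h/x/y") returns "x/y", because (.*) is greedy and runs to the end of the line instead of stopping at the next '/'.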
This implementation has 2 downsides:
- In its current state, IPv6 addresses aren't supported, nor are port numbers or other protocols. These require minimal adjustments that I'm sure you can figure out on your own.
- This will still parse invalid URLs, such as http://--some-weird.--.com/h/1/2/3.
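One possible shape of those adjustments is sketched below. It is only an illustration, not the answerer's own pattern: the scheme part, the bracketed IPv6 alternative, the optional port and the ^/$ anchors are all assumptions, as are the names UrlTailExtended and Extract. The second downside still applies, since the host part is not validated any more strictly than before.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlTailExtended
{
    // Hypothetical extension of the original pattern, built piece by piece.
    private static readonly Regex Pattern = new Regex(
        @"^[a-z][a-z\d+.\-]*://" +                         // any scheme (http, https, ftp, ...)
        @"(?:\[[\da-f:]+\]|(?:[a-z\d\-]+\.)*[a-z\d]+)" +   // bracketed IPv6 literal, or the original host part
        @"(?::\d+)?" +                                     // optional port number
        @"/h/(.*)$",                                       // capture everything after /h/
        RegexOptions.IgnoreCase);

    // Returns the text after "/h/", or null when the URL does not match.
    public static string Extract(string url)
    {
        Match m = Pattern.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Under these assumptions, both "https://example.com:8080/h/x/y" and "http://[::1]:80/h/x/y" yield "x/y". The IPv6 branch only checks for hex digits and colons inside the brackets, so it accepts malformed addresses too.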