c# - How to match the 2nd occurrence of a string with regex?


I have a URL like this:

http://www.abc.com/h/x/y 

and I want to parse "x/y" out of it using a regex. I am using the following regex:

h/(?<group>[\s\S]*?)\s*?/ 

but it matches only "x", whereas I want "x/y". I could find the 2nd occurrence of '/' using the programming language and parse it that way, but I want to do this with regex only.

Please help.

The final regular expression depends on which valid URLs you'd like to parse, e.g. is h constant, or can it change as well?

I'd use this:

http://(?:[a-z\d\-]+\.)*[a-z\d]+/h/(.*) 
  • The first part (http://), matching the protocol, is rather obvious.
  • The non-capturing group ((?:[a-z\d\-]+\.)*) with the * quantifier will match all (sub)domains under the TLD, including the last . (if any). If an IP is given, it will contain the first part of the IP.
  • [a-z\d]+ will match the TLD or, for intranet stuff, the domain name (like localhost). In case an IP is given, it will contain the last byte.
  • The actual capture group ((.*)) will capture everything following /h/.
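
As a minimal sketch of how this pattern could be used from C#, assuming the /h/ segment is constant: the class name UrlTail and the method name Extract are made up for illustration, and returning null for a non-matching URL is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlTail
{
    // Pattern taken from the answer above; "/h/" is assumed to be constant.
    private static readonly Regex Pattern =
        new Regex(@"http://(?:[a-z\d\-]+\.)*[a-z\d]+/h/(.*)");

    // Hypothetical helper: returns everything after "/h/",
    // or null when the URL does not match the pattern.
    public static string Extract(string url)
    {
        Match m = Pattern.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Extract("http://www.abc.com/h/x/y")); // prints x/y
    }
}
```

Because (.*) is greedy and is the last element of the pattern, it runs to the end of the line, so any further slashes after /h/ end up in the captured group.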

This implementation has 2 downsides:

  • In its current state, IPv6 IPs aren't supported, nor are port numbers or other protocols. These require minimal adjustments that I'm sure you can figure out on your own.
  • This will still parse invalid URLs, such as http://--some-weird.--.com/h/1/2/3.
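
The two downsides can be checked with a short test run. This is only a sketch: the class name DownsideDemo, the helper Matches, and the sample URLs with a port and with https are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DownsideDemo
{
    // Same pattern as in the answer above.
    private const string Pattern = @"http://(?:[a-z\d\-]+\.)*[a-z\d]+/h/(.*)";

    // Hypothetical helper: true when the pattern matches somewhere in the URL.
    public static bool Matches(string url) => Regex.IsMatch(url, Pattern);

    public static void Main()
    {
        string[] urls =
        {
            "http://www.abc.com/h/x/y",           // True
            "http://www.abc.com:8080/h/x/y",      // False: port numbers are not handled
            "https://www.abc.com/h/x/y",          // False: only http:// is handled
            "http://--some-weird.--.com/h/1/2/3", // True, although the host is invalid
        };

        foreach (string url in urls)
        {
            Console.WriteLine($"{Matches(url)}  {url}");
        }
    }
}
```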
