javascript - Confused by a URL-parsing regexp


I came across a URL-parsing regular expression in a proxy PAC file.

Its function is to match URLs that belong to the domain wikimapia.org (by the way, that is only my guess).

^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org 

I split it up, and these are the parts that confuse me:

    ^
    [\w\-]+        // a protocol name containing '-'?
    :
    \/+            // why not use '\/\/'? Aren't protocol names followed by '://'?
    (?!\/)         // what is the function of this part?
    (?:[^\/]+\.)?  // is the non-capturing group ?: necessary here, or is it an optimization?
    wikimapia
    \.
    org

I hope someone can explain these points.

According to the RFC, a URL can contain - in the scheme (protocol), and non-IP-based protocols can have more than two /'s. For http it should be ://.
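To make the scheme and slash part concrete, here is a minimal JavaScript sketch that runs the regex from the question against a few URLs. The sample URLs, including the made-up my-proto scheme, are assumptions chosen for illustration and do not come from the PAC file.

```javascript
// The regex from the question, unchanged.
const wikimapiaRe = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;

// Sample URLs are assumptions chosen for illustration.
const samples = [
  "http://wikimapia.org/",            // plain host
  "https://www.wikimapia.org/path",   // subdomain
  "my-proto:///wikimapia.org",        // hypothetical scheme with '-', three slashes
  "http://example.com/wikimapia.org", // domain name appears only in the path
  "http://notwikimapia.org/",         // different domain with the same suffix
];

// No 'g' flag is used, so test() keeps no state between calls.
const results = samples.map((url) => wikimapiaRe.test(url));
samples.forEach((url, i) => console.log(results[i], url));
```

The first three URLs match and the last two do not: [\w\-]+ accepts the hyphenated scheme, and \/+ accepts any number of slashes after the colon.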

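The (?!\/) (negative lookahead) asserts that whatever comes after the string of /'s is not a /. It does not serve any purpose here: regex engines are greedy, so \/+ consumes as many /'s as it can, and there should not be any / characters left after it. Furthermore, the next character must either be a non-/ character in the optional (?:[^\/]+\.)? portion or, if that portion is not matched, the w in wikimapia.org. Even if the engine backtracks and gives a / back, that / cannot match either alternative, so the lookahead serves no purpose.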
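A quick way to check this claim empirically is to run the pattern with and without the lookahead over the same inputs and compare. This is only a sketch over a handful of assumed sample URLs, not a proof, but it covers the case where extra slashes could be left over.

```javascript
// Same pattern, with and without the negative lookahead.
const withLookahead = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;
const withoutLookahead = /^[\w\-]+:\/+(?:[^\/]+\.)?wikimapia\.org/;

// Sample inputs are assumptions chosen to exercise the slash handling.
const urls = [
  "http://wikimapia.org",
  "http:////wikimapia.org",    // four slashes, all consumed by \/+
  "http://www.wikimapia.org",
  "http:///a/wikimapia.org",   // domain name appears after a path segment
  "http://a/b.wikimapia.org",
  "http:wikimapia.org",        // no slash at all
];

// Collect every input on which the two patterns give different answers.
const disagreements = urls.filter(
  (url) => withLookahead.test(url) !== withoutLookahead.test(url)
);
console.log(disagreements.length === 0 ? "identical results" : disagreements);
```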

Unless you are referencing capture groups, making a group non-capturing has no real impact on performance. It is still a good thing though: having that habit makes things easier if you are using back-references.
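The visible difference between the two forms is in the match result, not in what matches. A small sketch, using an assumed sample URL, shows that the capturing variant adds an extra numbered entry that back-references and indexes then have to account for:

```javascript
// Non-capturing group, as in the question.
const nonCapturing = /^[\w\-]+:\/+(?:[^\/]+\.)?wikimapia\.org/;
// Same pattern with a capturing group instead.
const capturing = /^[\w\-]+:\/+([^\/]+\.)?wikimapia\.org/;

const url = "http://www.wikimapia.org/"; // sample URL, an assumption

const a = nonCapturing.exec(url);
const b = capturing.exec(url);

console.log(a.length, a[0]); // only the full match is stored
console.log(b.length, b[1]); // full match plus the captured subdomain part
```

Both patterns accept the same URLs; the non-capturing form simply keeps group numbering free for the groups you actually want to refer to.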

