javascript - Confused by a URL-parsing regexp
I came across a URL-parsing regular expression in a proxy PAC file.
Its function (I'm guessing, by the way) is to match URLs that belong to the wikimapia domain.
^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org
I split it up, and the following parts confuse me:
^
[\w\-]+          // can a protocol name contain '-'?
:
\/+              // why not use '\/\/'? Aren't protocol names followed by '://'?
(?!\/)           // what is the function of this part?
(?:[^\/]+\.)?    // is the non-capturing group ?: necessary here, or is it an optimization?
wikimapia
\.
org
I hope someone can explain these points.
According to the RFC, a URL scheme (protocol) can contain '-', and non-IP-based protocols can have more than two /'s. HTTP should always be ://.
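As a rough illustration of the scheme and slash parts, here is a minimal sketch that runs the pattern from the question against a few URLs. The sample URLs (including the hyphenated `my-scheme`) are made up for this example and are not taken from the PAC file.

```javascript
// The pattern from the question, unchanged.
const pattern = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;

// Hypothetical sample URLs, chosen only to exercise each part of the pattern.
const samples = [
  "http://wikimapia.org/",            // plain host, two slashes
  "https://www.wikimapia.org/map",    // subdomain handled by (?:[^\/]+\.)?
  "my-scheme:///wikimapia.org",       // hyphen in the scheme, three slashes
  "http://example.org/wikimapia.org", // wikimapia.org appears only in the path
];

const results = samples.map((url) => pattern.test(url));
console.log(results); // [ true, true, true, false ]
```

The last URL is rejected because `[^\/]+` cannot cross the slash that separates the host from the path.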
The (?!\/) part is a negative lookahead: it asserts that whatever comes after the run of /'s is not a /. It does not serve a purpose here. Regex quantifiers are greedy, so \/+ consumes as many / as it can, and there shouldn't be any / characters left. Furthermore, the next character must be either a non-/ character in the optional (?:[^\/]+\.)? portion or, if that portion is not matched, the w in wikimapia.org. Either way, the lookahead serves no purpose.
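One way to sanity-check this claim is to compare the original pattern with a copy that has the lookahead removed. This is only a sketch over a handful of made-up inputs, not a proof; the variable names are hypothetical.

```javascript
// Original pattern, and the same pattern with the (?!\/) lookahead removed.
const withLookahead = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;
const withoutLookahead = /^[\w\-]+:\/+(?:[^\/]+\.)?wikimapia\.org/;

// Hypothetical inputs, including some with extra slashes.
const lookaheadSamples = [
  "http://wikimapia.org",
  "http:///wikimapia.org",
  "http://maps.wikimapia.org/x",
  "http://example.org/wikimapia.org",
  "http:////",
  "ftp://a/b.wikimapia.org",
];

// Collect every input on which the two patterns give different answers.
const disagreements = lookaheadSamples.filter(
  (url) => withLookahead.test(url) !== withoutLookahead.test(url)
);
console.log(disagreements.length); // 0
```

For these inputs the two patterns agree everywhere, which is consistent with the lookahead being redundant.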
Unless you are referencing capture groups, making a group non-capturing has no real impact on performance. It is still a good thing to do, though: having the habit makes life easier once you start using back-references.
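The visible difference between the two kinds of group is in the match result, not in what gets matched. A small sketch, assuming a made-up URL and a capturing variant of the pattern written only for comparison:

```javascript
// The question's pattern (non-capturing group) and a capturing variant.
const nonCapturing = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;
const capturing = /^[\w\-]+:\/+(?!\/)([^\/]+\.)?wikimapia\.org/;

const sampleUrl = "http://www.wikimapia.org/"; // hypothetical input

const plainMatch = sampleUrl.match(nonCapturing);
const groupedMatch = sampleUrl.match(capturing);

// Both match the same text; only the capturing version records the subdomain.
console.log(plainMatch[0]);       // "http://www.wikimapia.org"
console.log(plainMatch.length);   // 1
console.log(groupedMatch.length); // 2
console.log(groupedMatch[1]);     // "www."
```

With (?:...) the group numbering of any other groups stays unchanged, which is what makes back-references easier to keep track of.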