javascript - Confused by a URL-parsing regexp
I came across a URL-parsing regular expression in a proxy PAC file.
Its function (my guess, by the way) is to match URLs that belong to the wikimapia domain.
^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org
I split it up, and the following parts confuse me:
^
[\w\-]+          // protocol name containing '-'?
:
\/+              // why not use '\/\/'? don't protocol names end with '://'?
(?!\/)           // what is the function of this part?
(?:[^\/]+\.)?    // is the non-capturing group ?: necessary here, or is it an optimization?
wikimapia
\.
org
I hope someone can explain these points.
According to the RFC, a URL can contain - in the scheme (protocol), and non-IP-based protocols can have more than two /'s. http should be followed by ://.
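To see these two points in practice, here is a small sketch that runs the pattern from the question against a few URLs. The sample URLs are assumptions made up for illustration; they are not taken from the PAC file.

```javascript
// Pattern from the question, unchanged.
const pattern = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;

// Hypothetical sample URLs, chosen only for illustration.
const samples = [
  "http://wikimapia.org/",            // plain scheme, two slashes
  "https://www.wikimapia.org/map",    // optional subdomain part
  "view-source:///wikimapia.org",     // hyphenated scheme, three slashes
  "http://example.org/wikimapia.org", // different host, name only in the path
];

// One boolean per sample: does the pattern match?
const results = samples.map((url) => pattern.test(url));
console.log(results);
```

The first three samples match and the last one does not, which shows that [\w\-]+ accepts a hyphenated scheme and \/+ accepts any number of slashes after the colon.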
The (?!\/) (negative lookahead) asserts that whatever comes after the string of /'s is not a /. It does not serve a purpose here: regex engines are greedy, so \/+ consumes as many /'s as it can, and there shouldn't be any / characters left. Furthermore, even if the engine backtracks, the next character must either be a non-/ character in the optional (?:[^\/]+\.)? portion or, if that portion is not matched, the w in wikimapia.org. Either way it cannot be a /, so the lookahead serves no purpose.
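As a quick sanity check of that argument, the sketch below compares the original pattern with a copy that has the lookahead removed. The name withoutLookahead and the sample URLs are assumptions introduced for this illustration, and agreement on a handful of inputs is only supporting evidence, not a proof.

```javascript
// Original pattern from the question.
const withLookahead = /^[\w\-]+:\/+(?!\/)(?:[^\/]+\.)?wikimapia\.org/;
// Same pattern with the (?!\/) lookahead removed (hypothetical variant).
const withoutLookahead = /^[\w\-]+:\/+(?:[^\/]+\.)?wikimapia\.org/;

// Hypothetical inputs, including odd slash counts and near misses.
const urls = [
  "http://wikimapia.org",
  "http:/wikimapia.org",
  "http:///wikimapia.org",
  "http:wikimapia.org",
  "ftp://maps.wikimapia.org/x",
  "http://example.com//wikimapia.org",
  "http://a/b.wikimapia.org",
];

// Collect every URL on which the two patterns give different answers.
const disagreements = urls.filter(
  (url) => withLookahead.test(url) !== withoutLookahead.test(url)
);
console.log(disagreements.length);
```

On these inputs the two patterns always agree, which is consistent with the lookahead being redundant.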
Unless you are referencing capture groups, making the group non-capturing has no impact on performance. It is still a good thing to do though, and having the habit makes things easier if you are using back-references.
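The visible difference between the two kinds of group is in the match result, not in what gets matched. A minimal sketch, assuming a made-up URL and a hypothetical capturing variant of the pattern (with the redundant lookahead left out for brevity):

```javascript
// Non-capturing group, as in the question (lookahead omitted).
const nonCapturing = /^[\w\-]+:\/+(?:[^\/]+\.)?wikimapia\.org/;
// Hypothetical variant where the same group captures.
const capturing = /^[\w\-]+:\/+([^\/]+\.)?wikimapia\.org/;

const url = "http://www.wikimapia.org/"; // sample URL, assumed for illustration

const a = url.match(nonCapturing);
const b = url.match(capturing);

console.log(a[0] === b[0]);      // both match the same text
console.log(a.length, b.length); // the capturing version has one extra slot
console.log(b[1]);               // the captured subdomain prefix
```

With the non-capturing form, any other groups in a larger pattern keep their numbers, so back-references such as \1 do not shift when an extra grouping is added.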