I want to create a regex that matches URLs starting with http://, https:// or //, or URLs whose extension is anything other than html, htm, php or php3. The query string is optional.
Let's say that I want to find these:
http://example.com
/example.mp3
/example.mp3?q=example
http://example.com/example.mp3
#example
And to reject these:
example
/example
/example/
/example.htm
/example.htm?q=example
/example.mp3/example // the .mp3 must be the final extension to be accepted
/example#example
I already tried this: /(^(http:\/\/|https:\/\/|\/\/|#)|(.*)((.*)\.^(?!html|htm|php|php3)$)(\?.*)?$)/igm
but it didn't work.
If the opposite (swapping the accepted and rejected lists) is easier to do, that would be just as welcome; I can change the function that handles the regex.
It seems you may use
^(?:#.+|(?:https?:/)?/[^?#\n]*\.(?!(?:html?|php3?)\b)\w+(?:\?.*)?)$
Pattern details:
^ - start of string
(?:#.+ - either a # followed by any 1+ chars
| - or
(?:https?:/)?/[^?#\n]*\.(?!(?:html?|php3?)\b)\w+(?:\?.*)?) - the second alternative, which consists of:
(?:https?:/)?/ - an optional http:/ or https:/ and then /
[^?#\n]* - 0+ chars other than ?, # and a newline
\. - a dot
(?!(?:html?|php3?)\b)\w+ - 1 or more letters/digits/underscores that are not equal to htm, html, php or php3
(?:\?.*)? - an optional ? followed by any 0+ chars (the final ) closes the outer group)
$ - end of string
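As a quick sanity check, the pattern can be run against the two lists from the question. This is only a minimal sketch: it assumes JavaScript (the question's attempt uses a /.../igm literal), escapes each / as \/ because of the literal syntax, and keeps only the i flag. The names linkPattern, shouldMatch, shouldReject and isWanted are made up for illustration.

```javascript
// Pattern from the answer, written as a JavaScript regex literal.
// Each "/" is escaped as "\/" because "/" delimits the literal.
// Only the i flag is used: no g flag, so test() keeps no state between calls.
const linkPattern = /^(?:#.+|(?:https?:\/)?\/[^?#\n]*\.(?!(?:html?|php3?)\b)\w+(?:\?.*)?)$/i;

// Inputs the question wants to find.
const shouldMatch = [
  "http://example.com",
  "/example.mp3",
  "/example.mp3?q=example",
  "http://example.com/example.mp3",
  "#example",
];

// Inputs the question wants to reject.
const shouldReject = [
  "example",
  "/example",
  "/example/",
  "/example.htm",
  "/example.htm?q=example",
  "/example.mp3/example",
  "/example#example",
];

// Hypothetical helper: true when the whole string matches the pattern.
function isWanted(url) {
  return linkPattern.test(url);
}

// Print the verdict for every sample input.
for (const url of [...shouldMatch, ...shouldReject]) {
  console.log(isWanted(url) ? "match " : "reject", url);
}
```

Note that the second alternative always requires a dot followed by word characters, so a URL such as http://localhost, which has a scheme but no dot, would not match this pattern.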