I have this regex for matching URLs, but it also matches some invalid URLs:
$regexUrl = "((https?|ftp)\:\/\/)?"; // SCHEME
$regexUrl .= "([a-zA-Z0-9+!*(),;?&=\$_.-]+(\:[a-zA-Z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass
$regexUrl .= "([a-zA-Z0-9-.]*)\.([a-zA-Z]{2,3})"; // Host or IP
$regexUrl .= "(\:[0-9]{2,5})?"; // Port
$regexUrl .= "(\/([a-zA-Z0-9+\$_-]\.?)+)*\/?"; // Path
$regexUrl .= "(\?[a-zA-Z+&\$_.-][a-zA-Z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
$regexUrl .= "(#[a-zA-Z_.-][a-zA-Z0-9+\$_.-]*)?"; // Anchor
For instance, "http://...XYZ" is also matched by the regex above, but it is an invalid URL.
Any help would be appreciated.
$valid = parse_url($url);
In your Host or IP line, change * to + and remove the . from the first []:
$regexUrl .= "([a-zA-Z0-9-]+)\.([a-zA-Z]{2,3})"; // Host or IP
The effect of this is to require (with +) at least one character from the first [] and to not permit a . among them, since the . is handled (and required) by the \. which follows the first group.
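To see the difference, here is a minimal sketch that builds the pattern twice, once with the original host line and once with the corrected one, and tests both against the URL from the question. The helper names buildUrlRegex and matchesUrl are made up for illustration, and the / delimiters and the ^ and $ anchors are assumptions, since the question does not show how the pattern is passed to preg_match.

```php
<?php
// Hypothetical helper: assembles the question's pattern around a given host sub-pattern.
function buildUrlRegex(string $hostPattern): string
{
    $regexUrl  = "((https?|ftp)\:\/\/)?"; // Scheme
    $regexUrl .= "([a-zA-Z0-9+!*(),;?&=\$_.-]+(\:[a-zA-Z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and pass
    $regexUrl .= $hostPattern; // Host
    $regexUrl .= "(\:[0-9]{2,5})?"; // Port
    $regexUrl .= "(\/([a-zA-Z0-9+\$_-]\.?)+)*\/?"; // Path
    $regexUrl .= "(\?[a-zA-Z+&\$_.-][a-zA-Z0-9;:@&%=+\/\$_.-]*)?"; // GET query
    $regexUrl .= "(#[a-zA-Z_.-][a-zA-Z0-9+\$_.-]*)?"; // Anchor

    // Assumption: the whole string must match, so anchor with ^ and $.
    return '/^' . $regexUrl . '$/';
}

// Hypothetical helper: true only if the whole URL matches the pattern.
function matchesUrl(string $pattern, string $url): bool
{
    return preg_match($pattern, $url) === 1;
}

$original  = buildUrlRegex("([a-zA-Z0-9-.]*)\.([a-zA-Z]{2,3})"); // host line from the question
$corrected = buildUrlRegex("([a-zA-Z0-9-]+)\.([a-zA-Z]{2,3})");  // host line from the answer

var_dump(matchesUrl($original, "http://...XYZ"));       // bool(true): the reported problem
var_dump(matchesUrl($corrected, "http://...XYZ"));      // bool(false)
var_dump(matchesUrl($corrected, "http://example.com")); // bool(true)
```

Note that when the pattern is anchored like this, the corrected host line allows only one label before the top-level domain, so a host with a subdomain such as www.example.com would no longer match.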