I want to create a PHP script that crawls a website and extracts certain links from a page. For example, given the link http://example.com, my crawler should open that page in the background and find all links matching http://example.com/[any name]/reviews. I tried a regex, but it is not working. Can anybody help me?
<?php
$url="https://clutch.co/it-services";
$contents =file_get_contents($url);
$pattern = "https://clutch.co/profile/".'/^[a-zA-Z ]*$/'."#review";
$pattern = preg_quote($pattern, '/');
if(preg_match_all($pattern, $contents, $matches)){
echo "Found matches:\n";
foreach ($matches[0] as $urls) {
echo $urls;
}
}
else{
echo "No matches found";
}
?>
The regex pattern has some syntax issues: the delimiters (/) need to be outside of the pattern, and any delimiters and special characters (such as .) inside the pattern ("https://") need to be escaped ("https:\/\/").
So the pattern should be:
/https:\/\/clutch\.co\/profile\/[a-zA-Z ]*#review/
A regex fiddle: https://regex101.com/r/OEUQOU/1
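As a minimal sketch of how the corrected pattern could be used, the snippet below runs it against a small, made-up HTML string instead of the live page, so the sample markup and the profile names in it are assumptions for illustration only. It also drops the preg_quote() call, because quoting the whole pattern escapes the character class and turns it into literal text. Note that [a-zA-Z ]* only matches letters and spaces; if the real profile names contain digits or hyphens, the class would need to be widened.

```php
<?php
// Hypothetical sample markup standing in for file_get_contents($url).
$contents = '<a href="https://clutch.co/profile/acme#review">Acme</a>'
          . '<a href="https://clutch.co/profile/acme#summary">Acme summary</a>'
          . '<a href="https://clutch.co/profile/globex#review">Globex</a>';

// The delimiters wrap the whole pattern; inner slashes and the dot are escaped.
// No preg_quote() here: it would escape the character class as well.
$pattern = '/https:\/\/clutch\.co\/profile\/[a-zA-Z ]*#review/';

$urls = [];
if (preg_match_all($pattern, $contents, $matches)) {
    // Drop duplicate links and reindex the result.
    $urls = array_values(array_unique($matches[0]));
    echo "Found matches:\n";
    foreach ($urls as $foundUrl) {
        echo $foundUrl, "\n";
    }
} else {
    echo "No matches found\n";
}
```

To run this against the real page, replace the sample string with the result of file_get_contents($url) and check that the call did not return false before matching.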