I want to create a crawler using PHP script

Question

I want to write a PHP script that extracts links from a web page. For example, given the link http://example.com, my crawler should open that page in the background and find all links matching http://example.com/[any name]/reviews. I tried a regex, but it is not working. Can anybody help?

<?php
$url = "https://clutch.co/it-services";
$contents = file_get_contents($url);
$pattern = "https://clutch.co/profile/".'/^[a-zA-Z ]*$/'."#review";
$pattern = preg_quote($pattern, '/');
if (preg_match_all($pattern, $contents, $matches)) {
    echo "Found matches:\n";
    foreach ($matches[0] as $urls) {
        echo $urls;
    }
}
else {
    echo "No matches found";
}
?>

Answer 1

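The regex pattern has some syntax issues:

The delimiters / need to wrap the whole pattern rather than sit in the middle of it, and any delimiter or special character ( . ) that appears inside the pattern ("https://") needs to be escaped ("https:\\/\\/"). The preg_quote() call should also be removed: it escapes every regex metacharacter, including the brackets and * of the character class, so the pattern would only match that literal text.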

So the pattern should be:

/https:\/\/clutch\.co\/profile\/[a-zA-Z ]*#review/

A regex fiddle: https://regex101.com/r/OEUQOU/1
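
To show how the corrected pattern fits into a complete script, here is a minimal sketch. It makes a few assumptions that are not in the original post: the helper name findReviewLinks is hypothetical, ~ is used as the delimiter so the slashes in the URL need no escaping, the character class is widened to allow digits and hyphens in the profile name, and a hard-coded $html string stands in for the result of file_get_contents($url) so the example runs without network access.

```php
<?php
// Hypothetical helper (not from the original post): returns the unique
// profile review links found in the given HTML string.
function findReviewLinks(string $html): array
{
    // '~' as delimiter, so '/' does not need escaping; '.' is still escaped.
    // Assumption: profile names may contain letters, digits, spaces, hyphens.
    $pattern = '~https://clutch\.co/profile/[a-zA-Z0-9 -]*#review~';

    // preg_match_all() returns false on error, otherwise the match count.
    if (preg_match_all($pattern, $html, $matches) === false) {
        return [];
    }

    // $matches[0] holds the full matches; drop duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Sample markup standing in for file_get_contents($url).
$html = '<a href="https://clutch.co/profile/acme-labs#review">Acme</a>'
      . '<a href="https://clutch.co/profile/foo#review">Foo</a>'
      . '<a href="https://clutch.co/it-services">Other</a>';

$links = findReviewLinks($html);

if ($links) {
    echo "Found matches:\n";
    foreach ($links as $link) {
        echo $link, "\n";
    }
} else {
    echo "No matches found\n";
}
```

For anything beyond a quick check, parsing the page with DOMDocument and filtering the href attributes is usually more robust than running a regex over raw HTML, since links can be relative or formatted in ways the pattern does not anticipate.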
