
I want to create a crawler using a PHP script

I want to create a PHP script for a website. I just want to extract the links from a given page. For example, given the link http://example.com, my crawler should open that page in the background and find all the links matching http://example.com/[any name]/reviews. I tried a regex, but it is not working. Can anybody help me?

<?php
$url="https://clutch.co/it-services";
$contents =file_get_contents($url);
$pattern = "https://clutch.co/profile/".'/^[a-zA-Z ]*$/'."#review";
$pattern = preg_quote($pattern, '/');
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   foreach ($matches[0] as $urls) {
    echo $urls;
  }
}
else{
   echo "No matches found";
}
?>

The regex pattern has some syntax issues:

The delimiters / need to be outside of the pattern, and any delimiters and special characters ( . ) inside the pattern ("https://") need to be escaped ("https:\\/\\/").

So the pattern should be:

/https:\/\/clutch\.co\/profile\/[a-zA-Z ]*#review/
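As a minimal sketch of how this pattern could be plugged back into the question's script: the helper name findReviewLinks is hypothetical, and the $contents string is made-up sample markup standing in for the result of file_get_contents($url), so the snippet runs without a network request.

```php
<?php
// Hypothetical helper: returns every profile review link found in $html.
function findReviewLinks(string $html): array
{
    // Delimiters sit outside the pattern; inner slashes and the dot are escaped.
    $pattern = '/https:\/\/clutch\.co\/profile\/[a-zA-Z ]*#review/';

    // preg_match_all() returns the number of matches (0 if none) or false on error.
    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }

    // Drop duplicate links and reindex the array.
    return array_values(array_unique($matches[0]));
}

// Made-up sample markup (an assumption), standing in for file_get_contents($url).
$contents = '<a href="https://clutch.co/profile/Acme Corp#review">Acme</a>'
          . '<a href="https://clutch.co/profile/Foo#review">Foo</a>'
          . '<a href="https://clutch.co/about">About</a>';

$links = findReviewLinks($contents);
foreach ($links as $link) {
    echo $link, "\n";
}
```

Note that the character class [a-zA-Z ] only accepts letters and spaces, so a profile name containing digits or hyphens would need a wider class.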

A regex fiddle: https://regex101.com/r/OEUQOU/1
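If escaping by hand feels error-prone, one possible alternative is to let preg_quote() escape only the literal part of the URL and to append the character class afterwards. In the question's code, preg_quote() was applied to the whole string, which also escapes [, ], ^, * and $, so the character class could no longer act as one. The sketch below additionally assumes ~ as the delimiter, so the slashes in the URL need no escaping. The variable names are illustrative only.

```php
<?php
// Literal prefix of the links we are looking for.
$base = 'https://clutch.co/profile/';

// Escape only the literal prefix; the second argument also escapes the delimiter.
$quotedBase = preg_quote($base, '~');

// Append the unescaped character class and suffix, with ~ delimiters on the outside.
$pattern = '~' . $quotedBase . '[a-zA-Z ]*#review~';

// Made-up test strings (assumptions, not taken from the real site).
$isProfile = preg_match($pattern, 'https://clutch.co/profile/Foo#review');
$isOther   = preg_match($pattern, 'https://clutch.co/about');

var_dump($isProfile === 1, $isOther === 0);
```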

