
PHP Crawling Methods

Can anyone tell me what method would be used to crawl in PHP if cURL cannot do the crawling?

If you are referring to creating a spider that crawls entire remote sites the way utilities like wget can, I don't believe cURL has that capability. cURL would be used to make the requests and download each page, but you have to write the logic in your PHP script to parse the content of each page, extract the links, and build the list of URLs to crawl. cURL doesn't do that part for you.
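As a concrete illustration of the part cURL leaves to you, here is a minimal sketch of that parsing step using PHP's built-in DOMDocument. The function name extract_links and the naive URL resolution are illustrative assumptions, not part of cURL or any library API:

    <?php
    // Sketch: collect the href values of all <a> tags in a fetched page.
    // $html is the page body (e.g. the return value of curl_exec) and
    // $base is the URL it was fetched from; both names are illustrative.
    function extract_links(string $html, string $base): array
    {
        $doc = new DOMDocument();
        @$doc->loadHTML($html); // suppress warnings from imperfect real-world HTML
        $links = [];
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            if ($href === '' || $href[0] === '#') {
                continue; // skip empty and fragment-only links
            }
            // Naive resolution: keep absolute URLs, prefix everything else.
            if (!preg_match('#^https?://#i', $href)) {
                $href = rtrim($base, '/') . '/' . ltrim($href, '/');
            }
            $links[] = $href;
        }
        return array_unique($links);
    }

A real spider would also respect robots.txt, stay on the target host, and resolve relative URLs properly (e.g. with parse_url), but the shape of the logic is the same.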

cURL can:

  1. Follow redirects (set as an option)
  2. Store the fetched content (via curl_exec())
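Here is a minimal sketch of those two capabilities wrapped in a helper; fetch() is an illustrative name, not a cURL function:

    <?php
    // Sketch: download one page, following redirects, and return the body.
    function fetch(string $url)
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // 1. follow redirects
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // 2. curl_exec returns the content
        $html = curl_exec($ch);
        curl_close($ch);
        return $html; // page body on success, false on failure
    }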

That's all you need to crawl. Use the common patterns; the examples at http://ua2.php.net/manual/en/function.curl-exec.php show typical curl_exec() usage.
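Putting the pieces together, a tiny breadth-first crawl loop might look like the sketch below. fetch() and extract_links() are the hypothetical helpers sketched above, and the seed URL and page cap are arbitrary:

    <?php
    $queue   = ['https://example.com/']; // illustrative seed URL
    $visited = [];

    while ($queue && count($visited) < 50) { // cap pages so the sketch terminates
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already crawled
        }
        $visited[$url] = true;
        $html = fetch($url); // cURL request (see above)
        if ($html === false) {
            continue; // skip pages that failed to download
        }
        foreach (extract_links($html, $url) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link; // schedule unseen URLs
            }
        }
    }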

