
PHP Crawling Methods

Can anyone tell me what method would be used to crawl in PHP if cURL cannot do the crawling?

If you are referring to creating a spider that crawls entire remote sites the way utilities like wget can, I don't believe cURL has that capability. cURL would be used to make the requests and download each page, but you have to write the logic in your PHP script to parse the content of each page, extract the links, and build the list of URLs to crawl. cURL doesn't do that part for you.
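As a concrete illustration of the part cURL leaves to you, here is a minimal sketch of that parsing step using PHP's built-in DOMDocument. The function name extract_links and the naive URL resolution are illustrative assumptions, not part of cURL or any library API:

    <?php
    // Sketch: collect the href values of all <a> tags in a fetched page.
    // $html is the page body (e.g. the return value of curl_exec) and
    // $base is the URL it was fetched from; both names are illustrative.
    function extract_links(string $html, string $base): array
    {
        $doc = new DOMDocument();
        @$doc->loadHTML($html); // suppress warnings from imperfect real-world HTML
        $links = [];
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            if ($href === '' || $href[0] === '#') {
                continue; // skip empty and fragment-only links
            }
            // Naive resolution: keep absolute URLs, prefix everything else.
            if (!preg_match('#^https?://#i', $href)) {
                $href = rtrim($base, '/') . '/' . ltrim($href, '/');
            }
            $links[] = $href;
        }
        return array_unique($links);
    }

A real spider would also respect robots.txt, stay on the target host, and resolve relative URLs properly (e.g. with parse_url), but the shape of the logic is the same.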

cURL can:

  1. Follow redirects (set as an option)
  2. Store the fetched content (via curl_exec())
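Here is a minimal sketch of those two capabilities wrapped in a helper; fetch() is an illustrative name, not a cURL function:

    <?php
    // Sketch: download one page, following redirects, and return the body.
    function fetch(string $url)
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // 1. follow redirects
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // 2. curl_exec returns the content
        $html = curl_exec($ch);
        curl_close($ch);
        return $html; // page body on success, false on failure
    }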

That's all you need to crawl. Use the common patterns; the examples at http://ua2.php.net/manual/en/function.curl-exec.php show typical curl_exec() usage.
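Putting the pieces together, a tiny breadth-first crawl loop might look like the sketch below. fetch() and extract_links() are the hypothetical helpers sketched above, and the seed URL and page cap are arbitrary:

    <?php
    $queue   = ['https://example.com/']; // illustrative seed URL
    $visited = [];

    while ($queue && count($visited) < 50) { // cap pages so the sketch terminates
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already crawled
        }
        $visited[$url] = true;
        $html = fetch($url); // cURL request (see above)
        if ($html === false) {
            continue; // skip pages that failed to download
        }
        foreach (extract_links($html, $url) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link; // schedule unseen URLs
            }
        }
    }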

