
How can I speed up CURL tasks?

I'm using CURL to fetch some data from user accounts. First it logs in, then redirects to another URL where the data resides.

My stats showed that it took an average of 14 seconds to fetch some data spread over 5 pages. I would like to speed things up; my questions are:

Is it possible to see how long each step takes? Do you know how I could speed up/enhance CURL?

Thanks
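On the first part of the question: libcurl can report how long each phase of a transfer took, via `curl_getinfo()` after the request completes. A minimal sketch (the URL is a placeholder for your login page):

```php
<?php
// Placeholder URL standing in for the login page.
$ch = curl_init('https://example.com/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow the post-login redirect
curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // keep the sketch bounded
curl_exec($ch);

// Per-phase timings, in seconds, measured from the start of the transfer:
printf("DNS lookup:     %.3f\n", curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME));
printf("TCP connect:    %.3f\n", curl_getinfo($ch, CURLINFO_CONNECT_TIME));
printf("First byte:     %.3f\n", curl_getinfo($ch, CURLINFO_STARTTRANSFER_TIME));
printf("Redirect time:  %.3f\n", curl_getinfo($ch, CURLINFO_REDIRECT_TIME));
printf("Total:          %.3f\n", curl_getinfo($ch, CURLINFO_TOTAL_TIME));
curl_close($ch);
```

Comparing these numbers across your 5 pages will tell you whether the time goes into DNS, connecting, server processing, or the download itself.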

To make the task 'feel' faster, don't run it as part of a web request; run it in the background as a periodic task (cron job).

Cache the response on disk or in a database.

You can use ParallelCurl by Pete Warden. The source is available at http://github.com/petewarden/ParallelCurl . The module allows you to run multiple curl URL fetches in parallel in PHP.
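If you'd rather not pull in a dependency, PHP's built-in curl_multi API does the same kind of parallel fetching. A sketch, with placeholder URLs standing in for the 5 data pages:

```php
<?php
// Placeholder URLs for illustration.
$urls = ['https://example.com/page1', 'https://example.com/page2'];

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers concurrently until every one has finished.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running && $status == CURLM_OK);

$results = [];
foreach ($handles as $url => $ch) {
    $results[$url] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```

Since the 5 pages no longer download one after another, the wall-clock time approaches that of the slowest single page rather than the sum of all of them.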

You can't make the process of retrieving a page from a server any faster.

You can make the pages smaller, so they download quicker. You can beef up the processing power on the servers, or the connection between your server and the server the pages are on.

If you are consuming a service, what format is the data in? If it is XML, for example, maybe it is too verbose and this is costing lots of extra kilobytes.
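One cheap way to shrink what travels over the wire, without touching the server, is to ask for a compressed response. Setting `CURLOPT_ENCODING` to an empty string makes libcurl advertise every encoding it supports (gzip, deflate, ...) and decompress the body transparently; verbose XML typically compresses very well. A sketch with a placeholder URL:

```php
<?php
// Placeholder URL for illustration.
$ch = curl_init('https://example.com/data.xml');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// Empty string = send an Accept-Encoding header listing all encodings
// libcurl was built with, and decode the response automatically.
curl_setopt($ch, CURLOPT_ENCODING, '');
$body = curl_exec($ch);
curl_close($ch);
```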

Speed up curl with this option:

curl_setopt($curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

Regards
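In the same spirit of trimming connection-setup overhead: if the 5 pages live on the same host, reusing a single curl handle lets libcurl keep the TCP (and TLS) connection open and the DNS result cached, so you pay the lookup/connect cost once instead of five times. A sketch with placeholder URLs:

```php
<?php
// Placeholder URLs for illustration.
$pages = ['https://example.com/a', 'https://example.com/b'];

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

foreach ($pages as $url) {
    // Same handle each iteration: the connection is reused
    // instead of being torn down and rebuilt per page.
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    // ... process $body ...
}
curl_close($ch);
```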

Split the task into 3 files:

  1. A file that retrieves the page list, serving as your main script to put on crontab (main.php)
  2. A file that parses an actual page (parse.php)
  3. A shell script that drives your 2nd script.

Then, in your 1st file, do something like this:

<?php
// Retrieve the page list using CURL, save it to a file
// (say, pagelist.txt) and return that file's absolute path.
$pagelist = get_page_list();

exec("/bin/bash /your/3rdscript.sh < $pagelist");
?>

And here's your 3rd file:

#!/bin/bash

while read line
do
    /path/to/php /path/to/your/2ndscript.php -f "$line" &
done

Please note that in the 3rd script (the shell script) I use & (an ampersand). This tells the shell to run that particular process in the background.

In your 2nd script, you can use something like this:

<?php

$pageurl = $argv[2]; // $argv[1] is the "-f" flag passed by the shell script
// do your curl fetch of page $pageurl here

Using the steps above, you can speed things up by fetching several pages at once.


 