
Why is my PHP cURL script timing out?

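I have the code below. It takes URLs and proxies entered in textarea fields, uses cURL to fetch the page source, extracts certain links from the page, and inserts them into a database. It worked with a single URL, but after I added proxies and two loops to handle multiple URLs and proxies, it stopped working. Now it just times out with no error message and says the file was not found. I get my proxies from proxy-list.org. Any pointers would be appreciated.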

<html>
<body>

<?
$urls=explode("\n", $_POST['url']);
$proxies=explode("\n", $_POST['proxy']);

$allurls=count($urls);
$allproxies=count($proxies);

for ( $counter = 0; $counter <= $allurls; $counter++) {
for ( $count = 0; $count <= $allproxies; $count++) {

 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL,$urls[$counter]);
 curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
 curl_setopt($ch, CURLOPT_PROXY,$proxies[$count]);
 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_CUSTOMREQUEST,'GET');
 curl_setopt ($ch, CURLOPT_HEADER, 1); 
 curl_exec ($ch); 
 $curl_scraped_page=curl_exec($ch); 

//use the new tool box
require "ToolBoxA4.php";

//call the new function parseA1
$arrOut = parseA1 ($curl_scraped_page);

//the output is an array with 3 items:  $arrOut[0] is RHS, $arrOut[1] is TOP, $arrOut[2] is NAT
//to look at the RHS

//$arrLookAt = explode(",", $arrOut[0]);
//print_r ($arrLookAt);
//echo "<br><hr><br>";
//foreach ($arrLookAt as $value){
//       echo $value;
//       echo "<br>";
//}

$FileName = abs(rand(0,1000000000000));
$FileHandle = fopen($FileName, 'w') or die("can't open file");
fwrite($FileHandle, $curl_scraped_page);

//$dom = new DOMDocument();
//@$dom->loadHTML($curl_scraped_page);
//$xpath = new DOMXPath($doc);
//$hrefs = $xpath->query('//a[@href][@id]');

$hostname="****";
$username="****";
$password="****";
$dbname="****";
$usertable="****";

$con=mysql_connect($hostname,$username, $password) or die ("<html><script language='JavaScript'>alert('Unable to connect to database! Please try again later.'),history.go(-1)</script></html>");
mysql_select_db($dbname ,$con);

//function storeLink($url) {
//  $query = "INSERT INTO **** (time, ad1, ad2) VALUES ('$FileName','$url', '$gathered_from')";
//  mysql_query($query) or die('Error, insert query failed');
//}
//for ($i = 0; $i < $hrefs->length; $i++) {
//  $href = $hrefs->item($i);
//  $url = $href->getAttribute('href');
//  storeLink($url);
//
//}

//function storeLink($top, $right) {
//$query = "INSERT INTO happyturtle (time, ad1, ad2) VALUES ('$FileName','$top', '$right')";
//mysql_query($query) or die('Error, insert query failed');

$right = explode(",", $arrOut[0]);
$top = explode(",", $arrOut[1]);

for ( $countforme = 0; $countforme <= 5; $countforme++) {

$topnow=$top[$countforme];

$query = "INSERT INTO **** (time, ad1) VALUES ('$FileName','$topnow')";
mysql_query($query) or die('Error, insert query failed');

}

for ( $countforme = 0; $countforme <= 15; $countforme++) {

$rightnow = $right[$countforme];


$query = "INSERT INTO **** (time, ad1) VALUES ('$FileName','$rightnow')";
mysql_query($query) or die('Error, insert query failed');

}


mysql_close($con);

fclose($FileHandle);

curl_close($ch);

//echo $FileName; 

//echo "<br/>";

}
}

?>

</body>
</html>

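Your code fetches each URL sequentially, so it may take a long time to run. One possible solution is to use the cURL "multi" interface, which allows multiple requests to run at the same time: http://www.php.net/manual/en/function.curl-multi-exec.php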
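As a rough illustration of the "multi" interface mentioned above, the sketch below fetches a list of URLs concurrently. The helper name `fetchAll()` and the 15-second default are assumptions made for this example, not part of the original script; a proxy could be set per handle with `CURLOPT_PROXY`, as the question's code does.

```php
<?php
// Fetch several URLs concurrently with the cURL "multi" interface.
// fetchAll() is a hypothetical helper name; the timeout default is arbitrary.
function fetchAll(array $urls, int $timeoutSeconds = 15): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait for socket activity instead of busy-looping.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Per-transfer result codes are only available through curl_multi_info_read().
    $failed = [];
    while ($info = curl_multi_info_read($multi)) {
        if ($info['result'] !== CURLE_OK) {
            $failed[] = $info['handle'];
        }
    }

    // Map each input key to the response body, or false if the transfer failed.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = in_array($ch, $failed, true)
            ? false
            : curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

With this shape, the total run time is roughly that of the slowest single request rather than the sum of all of them, for example `$pages = fetchAll(array_map('trim', $urls));`.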

If this is essentially a batch process, another option is to increase the PHP timeout on the server you are using. See http://php.net/manual/en/function.set-time-limit.php for details.
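A minimal sketch of raising the limit from inside the script follows. The value of 300 seconds is an arbitrary example, not a recommendation, and some hosts may not allow the limit to be changed at runtime.

```php
<?php
// Raise the execution time limit for this run only.
// 300 seconds is an example value; passing 0 removes the limit entirely.
$limitChanged = set_time_limit(300);

if (!$limitChanged) {
    // The host refused the change, so the server's default limit still applies.
    error_log('Could not change the PHP execution time limit.');
}
```

Each call to `set_time_limit()` restarts the timeout counter from zero, so calling it at the top of each loop iteration gives every URL/proxy pair its own time budget.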

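One point I would make is that public proxy servers (such as the ones from proxy-list.org) can be very slow to respond. Because you are requesting through several of them, the script will always take at least as long as the slowest proxy needs to respond, which may be longer than your server's PHP timeout setting.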
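One way to keep a slow proxy from stalling the whole run is to cap each request with cURL's own timeouts. The sketch below assumes that approach; the helper names `proxiedRequestOptions()` and `fetchViaProxy()` and the default limits are illustrative, not taken from the original script.

```php
<?php
// Build cURL options that cap how long a single proxied request may take.
// The function name and default limits are assumptions for this example.
function proxiedRequestOptions(
    string $url,
    string $proxy,
    int $connectTimeout = 5,
    int $totalTimeout = 20
): array {
    return [
        // trim() removes whitespace such as the "\r" that can remain
        // after splitting textarea input on "\n".
        CURLOPT_URL            => trim($url),
        CURLOPT_PROXY          => trim($proxy),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => $connectTimeout, // seconds allowed to connect
        CURLOPT_TIMEOUT        => $totalTimeout,   // seconds allowed for the whole transfer
    ];
}

// Fetch one URL through one proxy; returns the body, or false on failure.
function fetchViaProxy(string $url, string $proxy)
{
    $ch = curl_init();
    curl_setopt_array($ch, proxiedRequestOptions($url, $proxy));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason so a dead proxy does not fail silently.
        error_log('cURL failed via ' . trim($proxy) . ': ' . curl_error($ch));
    }
    return $body;
}
```

With limits like these, a dead or overloaded proxy costs at most the configured number of seconds, and the logged `curl_error()` text shows which proxy was responsible.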
