PHP cURL multi handling causing random connection issues between servers?

I have a website that tracks individual players' data for an online game. Every day at the same time, a cron job runs that uses cURL to fetch each player's data from the game company's server (each player requires their own page to fetch). Previously I was looping through the players, making one cURL request at a time and storing the data. While this was a slow process, everything was working fine for weeks (doing anywhere from 500-1,000 players every day).

As we gained more players the cron started to take too long to run, so I rewrote it using ParallelCurl (cURL multi handling) about a week ago. It was set to open no more than 10 connections at a time and was running perfectly, doing about 3,000 pages in 3-4 minutes. I never noticed anything wrong until a day or two later, when I was randomly unable to connect to their servers (returning an HTTP code of 0). I thought I was permanently banned/blocked until about 1-2 hours later, when I could suddenly connect again. The block occurred several hours after the cron had run for the day - the only requests being made at the time were the occasional single file requests (which have been working fine and left untouched for months).

The past few days have all been like this. The cron runs fine, then sometime later (a few hours) I can't get a connection for an hour or two. Today I updated the cron to only open 5 connections at a time - everything worked fine until 5-6 hours later, when I couldn't connect for 2 hours.

I've done a ton of googling and can't seem to find anything useful. I'd guess that possibly a firewall is blocking my connection, but I'm really in over my head when it comes to anything like that. I am really clueless as to what is happening, and what I need to do to fix it. I'd be grateful for any help - even a guess or just a point in the right direction.

Note that I'm using a shared web host (HostGator). 2 days ago I submitted a ticket and made a post on their forums. I also sent an e-mail to the company and have yet to see a single reply from anything.

--EDIT--

Here's my code to run the multiple requests using ParallelCurl. The include has been left untouched and is the same as shown here

set_time_limit(0);

require('path/to/parallelcurl.php');

$plyrs = array();//normally an array of all the players i need to update

function on_request_done($content, $url, $ch, $player) {
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);    
    if ($httpcode !== 200) {
        echo 'Could Not Find '.$player.'<br />';
        return;
    } else {//player was found, store in db
        echo 'Updated '.$player.'<br />';
    }
}

$max_requests = 5;

$curl_options = array(
    CURLOPT_SSL_VERIFYPEER => FALSE,
    CURLOPT_SSL_VERIFYHOST => FALSE,
    CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9',
);

$parallel_curl = new ParallelCurl($max_requests, $curl_options);

foreach ($plyrs as $p) {
    $search_url = "http://website.com/".urlencode($p);
    $parallel_curl->startRequest($search_url, 'on_request_done', $p);
    usleep(300);//now that i think about it, does this actually do anything worthwhile positioned here?
}

$parallel_curl->finishAllRequests();

Here's the code I use to simply see if I can connect or not:

$ch = curl_init();

$options = array(
    CURLOPT_URL            => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HEADER         => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_ENCODING       => "",
    CURLOPT_AUTOREFERER    => true,
    CURLOPT_CONNECTTIMEOUT => 120,
    CURLOPT_TIMEOUT        => 120,
    CURLOPT_MAXREDIRS      => 10,
    CURLOPT_SSL_VERIFYPEER => false,
    CURLOPT_SSL_VERIFYHOST => false,
);
curl_setopt_array( $ch, $options );
$response = curl_exec($ch); 
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

print_r(curl_getinfo($ch));

if ( $httpCode != 200 ){
    echo "Return code is {$httpCode} \n"
        .curl_error($ch);
} else {
    echo "<pre>".htmlspecialchars($response)."</pre>";
}

curl_close($ch);

Running that when I'm unable to connect results in this:

Array
(
    [url] => http://urlicantgetto.com/
    [content_type] =>
    [http_code] => 0
    [header_size] => 0
    [request_size] => 121
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 30.073574
    [namelookup_time] => 0.003384
    [connect_time] => 0.025365
    [pretransfer_time] => 0.025466
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => -1
    [upload_content_length] => 0
    [starttransfer_time] => 30.073523
    [redirect_time] => 0
)
Return code is 0
Empty reply from server

This sounds like it's a network or firewall issue, rather than a PHP/code issue.

Either HostGator is blocking your outbound connections because you have a spike in outbound traffic that could be misinterpreted as a small DoS attack, or the game website is blocking you for the same reason. This seems especially likely since the problem only started once the number of requests increased. The HTTP status code of 0 also suggests firewall behaviour.
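An HTTP code of 0 only says that no HTTP response was parsed, so it may help to log cURL's own error number next to it. In the debug output above, the connect time is about 0.03 s and the error text is "Empty reply from server", which suggests the TCP connection opened and the request was sent, but nothing came back for 30 seconds. The following is a minimal sketch, not a definitive diagnostic: `describe_curl_failure()` is a hypothetical helper name, and the numeric values are libcurl's documented error codes.

```php
<?php
// Hypothetical helper (not from the original post): turn a libcurl error
// number plus HTTP code into a rough hint about where the request failed.
function describe_curl_failure(int $errno, int $httpCode): string
{
    if ($errno === 0) {
        // cURL itself succeeded; any problem is at the HTTP level.
        return $httpCode === 200 ? 'ok' : "HTTP error {$httpCode}";
    }

    switch ($errno) {
        case 6:  // CURLE_COULDNT_RESOLVE_HOST
            return 'DNS lookup failed';
        case 7:  // CURLE_COULDNT_CONNECT
            return 'TCP connection refused or blocked';
        case 28: // CURLE_OPERATION_TIMEDOUT
            return 'timed out';
        case 52: // CURLE_GOT_NOTHING ("Empty reply from server")
            return 'connected, but the server sent nothing back';
        default:
            return "other cURL error {$errno}";
    }
}

// Usage with a finished handle $ch (requires the cURL extension):
//   echo describe_curl_failure(curl_errno($ch), (int) curl_getinfo($ch, CURLINFO_HTTP_CODE));
```

Logging this for every failed request, with a timestamp, would show whether the blocked periods always fail the same way.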

Alternatively, perhaps the connections aren't closing properly after the curl requests, and later on, when you try to load that website or download a file, you can't because there are already too many open connections from your server.
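To rule out lingering connections, one option is to drive `curl_multi` directly and release every handle as soon as its transfer finishes. This is only a sketch under assumptions: `fetch_all()` is a hypothetical function, not part of ParallelCurl, and the timeout values are arbitrary. `CURLOPT_FORBID_REUSE` asks cURL to close the socket after each transfer instead of keeping it for reuse.

```php
<?php
// Sketch only: fetch URLs with at most $maxConcurrent transfers in flight,
// explicitly detaching and releasing each handle when it completes.
function fetch_all(array $urls, int $maxConcurrent = 5): array
{
    $results = [];
    $queue   = array_values($urls);
    if ($queue === []) {
        return $results; // nothing to do
    }

    $maxConcurrent = max(1, $maxConcurrent);
    $mh       = curl_multi_init();
    $inFlight = 0; // handles currently attached to $mh

    do {
        // Top up the window of active transfers.
        while ($inFlight < $maxConcurrent && $queue) {
            $url = array_shift($queue);
            $ch  = curl_init($url);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_CONNECTTIMEOUT => 10,   // assumed values
                CURLOPT_TIMEOUT        => 30,
                CURLOPT_FORBID_REUSE   => true, // close the socket after use
                CURLOPT_PRIVATE        => $url, // remember which URL this is
            ]);
            curl_multi_add_handle($mh, $ch);
            $inFlight++;
        }

        curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(10000); // avoid a busy loop if select() is unavailable
        }

        // Collect finished transfers and free their handles.
        while ($info = curl_multi_info_read($mh)) {
            $ch  = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $results[$url] = [
                'errno'     => $info['result'],
                'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
                'body'      => curl_multi_getcontent($ch),
            ];
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch); // has no effect on PHP 8+, where handles are freed automatically
            $inFlight--;
        }
    } while ($inFlight > 0 || $queue);

    curl_multi_close($mh);
    return $results;
}
```

If the outages still appear hours after a run that used this loop, lingering connections from the cron become a less likely explanation.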

If you have SSH access to your server I might be able to help debug whether it's the open network connections problem; otherwise you'll need to speak to HostGator and the game website owners to see if either party is blocking you at all.

Another solution might be to scrape the game website more slowly (introduce a wait time between requests) to avoid being flagged as high network traffic.
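A wait time between requests could look roughly like the sketch below. `throttled_each()` and its parameters are hypothetical names chosen for illustration; the sleep function is injectable only so the pacing can be checked without actually waiting.

```php
<?php
// Sketch only: call $fetch once per item, pausing between calls so that at
// most $perSecond requests are started per second.
function throttled_each(array $items, callable $fetch, float $perSecond, ?callable $sleep = null): int
{
    $sleep   = $sleep ?? 'usleep';                 // usleep() takes MICROseconds
    $pauseUs = (int) round(1000000 / $perSecond);  // gap between request starts
    $count   = 0;

    foreach ($items as $item) {
        if ($count > 0) {
            $sleep($pauseUs); // no pause before the very first request
        }
        $fetch($item);
        $count++;
    }

    return $count;
}

// Usage: two requests per second over the player list.
//   throttled_each($plyrs, function ($p) { /* one cURL request for $p */ }, 2.0);
```

Note that `usleep()` counts microseconds, so the `usleep(300)` in the question's loop pauses for only 0.3 ms and has almost no throttling effect; half a second would be `usleep(500000)`. At two requests per second, 3,000 players would take about 25 minutes.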
