I am using a good multi-cURL interface called Rolling Curl:
http://code.google.com/p/rolling-curl/issues/detail?id=20
It works fine; for example, it gets data from 20 sites in around 3 seconds. The problem is that I need it to work on 200-300 sites that are all on the SAME server. With that list it takes about as long as making single cURL requests in a loop, which is around 10 minutes 47 seconds. So I am a bit stumped as to what to do. All I need is the HTTP status code of each site. I have tried file_get_contents and PHP's FTP functions; they are even slower.
Another thing: when I run through a list of 12+ domains that are on the same server, the requests seem to get blocked and I get no data back at all for any of the sites. This does not happen with lists of fewer than 12. I am only fetching the header data of the sites, so it shouldn't be this slow.
If anyone can explain why this is happening, with pointers on how I can overcome the problem, I will be incredibly thankful.
That sounds like the library is throttling concurrent requests per server. Check whether you can configure that. For example, this is in the source code, along with the explanation of why:
class RollingCurl {
    /**
     * @var int
     *
     * Window size is the max number of simultaneous connections allowed.
     *
     * REMEMBER TO RESPECT THE SERVERS:
     * Sending too many requests at one time can easily be perceived
     * as a DOS attack. Increase this window_size if you are making requests
     * to multiple servers or have permission from the receiving server admins.
     */
    private $window_size = 5;
    // ...
}
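If the version of Rolling Curl you are using exposes the setting (the field is private, but the library as linked in the question allows assigning it from outside), raising the window size might look like this. This is a hedged sketch: `$urls` and the callback name are placeholders, not part of the original post.

```php
<?php
// Sketch, assuming the Rolling Curl library linked in the question.
require 'RollingCurl.php';

// Callback invoked as each request completes; we only care about
// the HTTP status code reported in the transfer info.
function on_request_done($response, $info) {
    echo $info['url'] . ' => ' . $info['http_code'] . "\n";
}

$rc = new RollingCurl('on_request_done');
$rc->window_size = 20;        // allow 20 simultaneous connections instead of 5

foreach ($urls as $url) {     // $urls: your list of 200-300 site URLs
    $rc->get($url);
}
$rc->execute();
```

Keep the library's own warning in mind: since all your sites live on one server, a large window concentrates all that load there, which is likely why requests get blocked past a certain point.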
Also, this might be interesting for you: to get just the status code, a HEAD request is normally all you need, and cURL supports that.