

Running file_put_contents in parallel?

I was searching Stack Overflow for a solution, but couldn't find anything even close to what I am trying to achieve. Perhaps I am just blissfully unaware of some magic PHP sauce everyone uses to tackle this problem... ;)

Basically I have an array with, give or take, a few hundred URLs, pointing to different XML files on a remote server. I'm doing some magic file-checking to see whether the content of the XML files has changed, and if it has, I download the newer XML to my server.

PHP code:

$urls = array(
    'http://stackoverflow.com/a-really-nice-file.xml',
    'http://stackoverflow.com/another-cool-file2.xml'
);
foreach($urls as $url){
    set_time_limit(0);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, false);
    $contents = curl_exec($ch);
    curl_close($ch);
    file_put_contents($filename, $contents);
}

Now, $filename is set somewhere else and gives each XML its own ID based on my logic. So far this script runs OK and does what it should, but it does it terribly slowly. I know my server can handle a lot more, and I suspect my foreach is slowing down the process.

Is there any way I can speed up the foreach? Currently I am thinking of bumping the file_put_contents in each foreach iteration up to 10 or 20, basically cutting my execution time 10- or 20-fold, but I can't think of how to approach this in the best and most performant way. Any help or pointers on how to proceed?

Your bottleneck is (most likely) your curl requests; you can only write to a file after each request is done, and there is no way (in a single script) to speed up that process.

I don't know how it all works internally, but you can execute curl requests in parallel: http://php.net/manual/en/function.curl-multi-exec.php

Maybe you can fetch all the data first (if enough memory is available to store it) and then, as the requests complete, write the data out.
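
Something along these lines might work (an untested sketch; getFilenameFor() is a hypothetical placeholder for your own filename logic, which isn't shown in the question):

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run all handles until every transfer has finished.
$running = null;
do {
    curl_multi_exec($mh, $running);
    if (curl_multi_select($mh) === -1) {
        usleep(100000); // brief pause if select() is unavailable
    }
} while ($running > 0);

// Collect the results and write each file once its transfer is done.
foreach ($handles as $url => $ch) {
    if (curl_errno($ch) === 0) {
        $contents = curl_multi_getcontent($ch);
        file_put_contents(getFilenameFor($url), $contents); // hypothetical helper
    }
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);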

Just run more scripts. Each script will download some of the URLs.

You can get more information about this pattern here: http://en.wikipedia.org/wiki/Thread_pool_pattern

The more scripts you run, the more parallelism you get.
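
For example, each invocation could take its worker index and the total worker count as arguments and only handle its share of the URLs (the script name and argument layout below are just one way to illustrate this, not something from the question):

// Run e.g. `php fetch.php 0 4`, `php fetch.php 1 4`, ... `php fetch.php 3 4`
// in parallel; each process handles every 4th URL.
$worker  = isset($argv[1]) ? (int) $argv[1] : 0; // this worker's index
$workers = isset($argv[2]) ? (int) $argv[2] : 1; // total number of workers

foreach ($urls as $i => $url) {
    if ($i % $workers !== $worker) {
        continue; // leave this URL to another worker
    }
    // ... existing curl + file_put_contents code for $url ...
}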

I use a Guzzle pool for parallel requests ;) (you can send x parallel requests)

http://docs.guzzlephp.org/en/stable/quickstart.html
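
A minimal sketch of that approach, assuming Guzzle is installed via Composer (getFilenameFor() is again a hypothetical placeholder for your own filename logic):

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();

// Generator that yields one GET request per URL.
$requests = function () use ($urls) {
    foreach ($urls as $url) {
        yield new Request('GET', $url);
    }
};

$pool = new Pool($client, $requests(), [
    'concurrency' => 10, // number of requests in flight at once
    'fulfilled' => function ($response, $index) use ($urls) {
        // Write the body of each completed response to its file.
        file_put_contents(getFilenameFor($urls[$index]), (string) $response->getBody());
    },
    'rejected' => function ($reason, $index) {
        // A request failed; log or retry as needed.
    },
]);

// Start the transfers and wait for the whole pool to finish.
$pool->promise()->wait();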
