
PHP cURL Timing Issue

I have a PHP script that queries an API, downloads some JSON, and inserts that information into a MySQL database; we'll call it scriptA.php. I need to run this script multiple times a minute, preferably as many times per minute as I can, without allowing two instances to run at the same time or with any overlap. My solution has been to create scriptB.php and put it on a one-minute cron job. Here is the source code of scriptB.php...

function next_run()
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, "http://somewebsite.com/scriptA.php");
    curl_exec($curl);
    curl_close($curl);
    unset($curl);
}
$i = 0;
$times_to_run = 7;
$function = array();
while ($i++ < $times_to_run) {
    $function = next_run();
    sleep(3);
}

My question at this point is how cURL behaves when used in a loop. Does this code trigger scriptA.php and then, once it has finished loading, start the next cURL request? Does the 3-second sleep even make a difference, or will this literally run as fast as each cURL request takes to complete? My objective is to time this script and run it as many times as possible in a one-minute window without two iterations running at the same time. I don't want to include the sleep statement if it is not needed. I believe cURL will run each request upon finishing the last; if I am wrong, is there some way I can instruct it to do this?

"I need to run this script multiple times a minute, preferably as many times in a minute that I can without allowing two instances to run"

You're in luck, as I wrote a class to handle just such a thing. You can find it on my GitHub here:

https://github.com/ArtisticPhoenix/MISC/blob/master/ProcLock.php

I'll also copy the full code at the end of this post.

The basic idea is to create a file; I will call it afile.lock for this example. This file records the PID, the process ID of the current process that was started by cron. When cron attempts to run the process again, it checks this lock file and sees whether a PHP process using this PID is running.

  • if there is, it updates the modified time of the file (and throws an exception)
  • if there is not, you are free to create a new instance of the "worker".

As a bonus, the modified time of the lock file can be used by the script (whose PID we are tracking) as a way of shutting down in the event the file is not updated. For example, if cron is stopped, or if the lock file is manually deleted, you can set it up in such a way that the running script will detect this and self-destruct.

So not only can you keep multiple instances from running, you can also tell the current instance to die if cron is turned off.

The basic usage is as follows. In the cron file that starts up the "worker":

//define a lock file (this is actually optional)
ProcLock::setLockFile(__DIR__.'/afile.lock');

try{
 //if you didn't set a lock file you can pass it in with this method call
  ProcLock::lock();
  //execute your process

}catch(\Exception $e){
    if($e->getCode() == ProcLock::ALREADY_LOCKED){
      //just exit or what have you
    }else{
      //some other exception happened.
    }
}

It's basically that easy.

Then, in the running process, you can check every so often (for example, if you have a loop that runs something):

 $expires = 90; //1 1/2 minute (you may need a bit of fudge time)
 foreach($something as $a=>$b){
     $lastAccess = ProcLock::getLastAccess();
     if(false == $lastAccess  || $lastAccess + $expires < time()){
         //if last access is false (no lock file)
         //or last access + expiration, is less than the current time

         //log something like killed by lock timeout

         exit(); 
     }

 }

Basically, what this says is that either the lock file was deleted while the process was running, or cron failed to update it before the expiration time. Here we are giving it 90 seconds, and cron should be updating the lock file every 60 seconds. As I said, the lock file is updated automatically when cron calls lock(): it calls canLock(), and if canLock() finds a valid running process recorded in the lock file, it runs touch($lockFile), which updates the mtime (modified time), before returning false.

Obviously, the process can only kill itself in this way if it is actively checking the access and expiration times.
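The expiry test from the loop above can be pulled out into a small pure function, which makes the rule easier to read and to test. This is only a sketch: lockHasExpired() is a hypothetical helper name, not part of the ProcLock class, and it assumes $lastAccess is whatever ProcLock::getLastAccess() returned (an integer mtime, or false when the lock file is missing).

```php
<?php
// Hypothetical helper (not part of ProcLock): decides whether a worker
// should shut itself down based on the lock file's last-modified time.
// $lastAccess: int mtime from ProcLock::getLastAccess(), or false if no lock file.
// $expires:    how many seconds the lock may go without being touched.
// $now:        current Unix timestamp, passed in so the rule is testable.
function lockHasExpired($lastAccess, int $expires, int $now): bool
{
    if ($lastAccess === false) {
        return true; // the lock file is gone
    }
    // cron has not touched the lock file within the allowed window
    return ($lastAccess + $expires) < $now;
}
```

Inside the worker loop this would be called as lockHasExpired(ProcLock::getLastAccess(), 90, time()), exiting when it returns true.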

This script is designed to work on both Windows and Linux. On Windows, under certain circumstances, the lock file won't be deleted properly (sometimes when hitting ctrl + c in the CMD window). However, I have taken great pains to make sure this does not happen, so the class file registers a custom register_shutdown_function that runs when the PHP script ends.

When running something that uses ProcLock in the browser, please note that the process ID will always be the same no matter which tab it runs in. So if you open one tab that is process-locked and then open another tab, the process locker will see it as the same process and allow it to lock again. To properly run it in a browser and test the locking, you must use two separate browsers, such as Chrome and Firefox. It's not really intended to be run in the browser, but this is one quirk I noticed.

One last note: this class is completely static, since you can have only one process ID per running process, which should be obvious.

The tricky parts are:

  • making sure the lock file is disposed of even in the event of critical PHP failures
  • making sure another process didn't pick up the PID number after PHP freed it. This can be done with relative accuracy: we can tell whether a PHP process is using the PID, and if so we assume it's the process we need. There is little chance that a re-used PID would show up for another process very quickly, and even less that it would be another PHP process
  • making all this work on both Linux and Windows

Lucky for you, I have already invested sufficient time in this to do all these things. This is a more generic version of an original lock script I made for my job, which we have used in this way successfully for 3 years to maintain control over various synchronous cron jobs: everything from sFTP upload scanning and expired file clean-up to RabbitMq message workers that run for an indefinite period of time.

In any case, here is the full code. Enjoy.

<?php
/*
 (c) 2017 ArtisticPhoenix
 
 For license information please view the LICENSE file included with this source code GPL3.0. 
 Process Locker
 ==================================================================
 This is a pseudo implementation of mutex since php does not have
 any thread synchronization objects
 This class uses files to provide locking functionality.
 Lock will be released in following cases
 1 - user calls unlock
 2 - when this lock object gets deleted
 3 - when request or script ends
 4 - when pid of lock does not match self::$_pid
 ==================================================================
 Only one Lock per Process!
 -note- when running in a browser typically all tabs will have the same PID
 so the locking will not be able to tell if it's the same process, to get 
 around this run in CLI, or use 2 different browsers, so the PID numbers are different.
 
 This class is static for the simple fact that locking is done per-process, so there is no need 
 to ever have duplicate ProcLocks within the same process
 ---------------------------------------------------------------
 */
final class ProcLock {
    
    /**
     * exception code numbers
     * @var int
     */
    const DIRECTORY_NOT_FOUND   = 2000; 
    const LOCK_FIRST            = 2001;
    const FAILED_TO_UNLOCK      = 2002;
    const FAILED_TO_LOCK        = 2003;
    const ALREADY_LOCKED        = 2004;
    const UNKNOWN_PID           = 2005;
    const PROC_UNKNOWN_PID      = 2006;
    
    
    /**
     * process _key
     * @var string
     */
    protected static $_lockFile; 
    
    /**
     *
     * @var int
     */
    protected static $_pid;   
    
    /**
     * No construction allowed
     */
    private function __construct(){}
    
    /**
     * No clones allowed
     */
    private function __clone(){}
    
    /**
     * globally sets the lock file
     * @param string $lockFile
     */
    public static function setLockFile( $lockFile ){
        $dir = dirname( $lockFile );
        if( !is_dir( $dir )){ 
            throw new Exception("Directory {$dir} not found", self::DIRECTORY_NOT_FOUND);  //pid directory invalid
        }
        
        self::$_lockFile = $lockFile;
    }
    
    /**
     * return global lockfile
     */
    public static function getLockFile() {
        return ( self::$_lockFile ) ? self::$_lockFile : false;
    }
    
    /**
     * safe check for local or global lock file
     */
    protected static function _chk_lock_file( $lockFile = null ){
        if( !$lockFile && !self::$_lockFile ){
            throw new Exception("Lock first", self::LOCK_FIRST); //
        }elseif( $lockFile ){
            return $lockFile;
        }else{
            return self::$_lockFile;
        }
    }
    
    /**
     * 
     * @param string $lockFile
     */
    public static function unlock( $lockFile = null ){
        if( !self::$_pid ){
            //no pid stored - not locked for this process
            return;
        }
        
        $lockFile = self::_chk_lock_file($lockFile);
        if(!file_exists($lockFile) || unlink($lockFile)){
            return true;
        }else{
            throw new Exception("Failed to unlock {$lockFile}", self::FAILED_TO_UNLOCK ); //no lock file exists to unlock or no permissions to delete file
        }
    }
    
    /**
     *
     * @param string $lockFile
     */
    public static function lock( $lockFile = null ){    
        $lockFile = self::_chk_lock_file($lockFile);
        if( self::canLock( $lockFile )){
            self::$_pid = getmypid();
            if(!file_put_contents($lockFile, self::$_pid ) ){
                throw new Exception("Failed to lock {$lockFile}", self::FAILED_TO_LOCK ); //no permission to create pid file
            }
        }else{
            throw new Exception('Process is already running[ '.$lockFile.' ]', self::ALREADY_LOCKED );//there is a process running with this pid 
        }
    }
    /**
     *
     * @param string $lockFile
     */
    public static function getPidFromLockFile( $lockFile = null ){
        $lockFile = self::_chk_lock_file($lockFile);
        
        if(!file_exists($lockFile) || !is_file($lockFile)){
            return false;
        }
    
        $pid = file_get_contents($lockFile);
    
        return intval(trim($pid));
    }
    
    /**
     * 
     * @return number
     */
    public static function getMyPid(){
        return ( self::$_pid ) ? self::$_pid : false;
    }
    
    /**
     * 
     * @param string $lockFile
     * @param string $myPid
     * @throws Exception
     */
    public static function validatePid($lockFile = null, $myPid = false ){
        $lockFile = self::_chk_lock_file($lockFile);
        if( !self::$_pid && !$myPid ){
            throw new Exception('no pid supplied', self::UNKNOWN_PID ); //no stored or injected pid number
        }elseif( !$myPid ){
            $myPid = self::$_pid;
        }
        return ( $myPid == self::getPidFromLockFile( $lockFile ));  
    }
    /**
     * check whether the lock can be taken; if a valid process already holds it, update the mtime of the lock file
     * @param string $lockFile
     */
    public static function canLock( $lockFile = null){
        if( self::$_pid ){
            throw new Exception("Process was already locked", self::ALREADY_LOCKED ); //process was already locked - call this only before locking
        }
        
        $lockFile = self::_chk_lock_file($lockFile);
        
        $pid = self::getPidFromLockFile( $lockFile );
        
        if( !$pid ){
            //if there is not a pid then there is no lock file and it's ok to lock it
            return true;
        }
        
        //validate the pid in the existing file
        $valid = self::_validateProcess($pid);  
        
        if( !$valid ){
            //if it's not valid - delete the lock file
            if(unlink($lockFile)){
                return true;
            }else{
                throw new Exception("Failed to unlock {$lockFile}", self::FAILED_TO_UNLOCK ); //no lock file exists to unlock or no permissions to delete file
            }   
        }
        
        //if there was a valid process running return false, we cannot lock it.
        //update the lock file's mTime - this is useful for a heartbeat, a periodic keepalive script.
        touch($lockFile);
        return false;   
    }
    
    /**
     *
     * @param string $lockFile
     */
    public static function getLastAccess( $lockFile = null ){
        $lockFile = self::_chk_lock_file($lockFile);
        clearstatcache( true, $lockFile ); //first argument is the realpath-cache flag, the filename is the second
        if( file_exists( $lockFile )){
            return filemtime( $lockFile );
        }
        return false;
    }
    
    /**
     *
     * @param int $pid
     */
    protected static function _validateProcess( $pid ){
        $task = false;
        $pid = intval($pid);
        if(stripos(php_uname('s'), 'win') === 0){ //name must start with "win" - "Darwin" also contains it
            $task = shell_exec("tasklist /fi \"PID eq {$pid}\"");
            /*
             'INFO: No tasks are running which match the specified criteria.
                '
                */
            /*
             '
                Image Name                     PID Session Name        Session#    Mem Usage
                ========================= ======== ================ =========== ============
                php.exe                    5064 Console                    1     64,516 K
                '
            */
        }else{
            $cmd = "ps ".intval($pid);
            $task = shell_exec($cmd);
            /*
             '  PID TTY      STAT   TIME COMMAND
                '
            */
        }
            
        //print_rr( $task );
        if($task){
            return ( preg_match('/php|httpd/', $task) ) ? true : false;
        }
    
        throw new Exception("pid detection failed {$pid}", self::PROC_UNKNOWN_PID);  //failed to parse the pid look up results 
        //this has been tested on CentOs 5,6,7 and windows 7 and 10
    }
    
    /**
     * destroy a lock ( safe unlock )
     */
    public static function destroy($lockFile = null){
        try{
            $lockFile = self::_chk_lock_file($lockFile);
            self::unlock( $lockFile );
        }catch( Exception $e ){
            //ignore errors here - this is called from destruction so we don't care if it fails or succeeds
            //generally a new process will be able to tell if the pid is still in use so
            //this is just a cleanup process
        }
    }
}
/*
 * register our shutdown handler - if the script dies unlock the lock
 * this is superior to __destruct(), because the shutdown handler runs even in situation where PHP exhausts all memory
 */
register_shutdown_function(array('ProcLock', 'destroy')); //the class above is declared without a namespace

"preferably as many times in a minute that I can without allowing two instances to run at the same exact time or with any overlap." - then you shouldn't use a cronjob at all; you should use a daemon. But if for some reason you have to use a cronjob (e.g., you're on a shared web hosting platform that doesn't allow daemons), you could use the sleep hack to run the same code several times a minute:

* * * * * /usr/bin/php /path/to/scriptA.php
* * * * * sleep 10; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 20; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 30; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 40; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 50; /usr/bin/php /path/to/scriptA.php

This should make it execute every 10 seconds.

As for making sure it doesn't run in parallel if the previous execution hasn't finished yet, add this to the start of scriptA:

call_user_func ( function () {
    static $lock;
    $lock = fopen ( __FILE__, "rb" );
    if (! flock ( $lock, LOCK_EX | LOCK_NB )) {
        // failed to get a lock, probably means another instance is already running
        die ();
    }
    register_shutdown_function ( function () use (&$lock) {
        flock ( $lock, LOCK_UN );
    } );
} );

It will just die() if another instance of scriptA is already running. However, if you want it to wait for the previous execution to finish instead of just exiting, remove LOCK_NB... but that could be dangerous: if every execution, or even just a majority of them, takes more than 10 seconds, you'll have more and more processes waiting for the previous execution to finish, until you run out of RAM.
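The non-blocking variant can also be written as a pair of small functions. This is a sketch under a few assumptions: tryExclusiveLock() and releaseLock() are hypothetical names, and the lock is taken on a dedicated lock file rather than on __FILE__ as in the snippet above, which is a variation and not what the snippet does.

```php
<?php
// Hypothetical helper: try to take an exclusive, non-blocking lock on $path.
// Returns the open handle on success (keep it open for as long as the lock
// is needed), or null if the file cannot be opened or the lock is already held.
function tryExclusiveLock(string $path)
{
    $handle = fopen($path, 'c'); // create the file if missing, never truncate it
    if ($handle === false) {
        return null;
    }
    // LOCK_NB makes flock() return immediately instead of waiting for the holder
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null;
    }
    return $handle;
}

// Hypothetical helper: release a lock obtained from tryExclusiveLock().
function releaseLock($handle): void
{
    flock($handle, LOCK_UN);
    fclose($handle);
}
```

At the top of scriptA this might be used as: take the lock, exit() if the result is null, and otherwise hold on to the handle until the script ends. Dropping LOCK_NB from the flock() call gives the waiting behaviour described above, with the same risk of processes piling up.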

As for your curl questions:

"My question at this point is to how cURL performs when used in a loop, does this code trigger scriptA.php and THEN once it has finished loading it at that point start the next cURL request" - that is correct; curl waits until the page has completely loaded, which usually means the entire scriptA has completed. (You can tell scriptA to finish the page load prematurely with the fastcgi_finish_request() function if you really want, but that's unusual.)

"Does the 3 second sleep even make a difference or will this literally run as fast as the time it takes each cURL request to complete" - yes, the sleep will make the loop 3 seconds slower per iteration.

"My objective is to time this script and run it as many times as possible in a one minute window without two iterations of it being run at the same time" - then make it a daemon that never exits, rather than a cronjob.

"I don't want to include the sleep statement if it is not needed." - it's not needed.

"I believe what happens is cURL will run each request upon finishing the last" - this is correct.
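Since each curl_exec() call blocks until the response has arrived, a loop without sleep() already runs the requests strictly one after another. A minimal sketch of such a loop with a time budget follows; runSequentially() and fetchScriptA() are hypothetical names, and the URL is the placeholder from the question.

```php
<?php
// Hypothetical helper: run $job back to back until the time budget is used up.
// Each call blocks until it returns, so two runs started by this loop never overlap.
// Returns the number of completed runs.
function runSequentially(callable $job, float $budgetSeconds): int
{
    $deadline = microtime(true) + $budgetSeconds;
    $runs = 0;
    while (microtime(true) < $deadline) {
        $job();
        $runs++;
    }
    return $runs;
}

// Hypothetical job: one blocking request to scriptA.php.
function fetchScriptA(): void
{
    $curl = curl_init('http://somewebsite.com/scriptA.php');
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of echoing it
    curl_exec($curl); // returns only after the whole response has been received
    // the handle is released when $curl goes out of scope
}
```

Called as runSequentially('fetchScriptA', 55.0) from a one-minute cron job, this leaves a few seconds of slack before the next cron start. The last request can still run past the budget, so a lock is still needed if overlap between consecutive cron runs must be ruled out.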
