简体   繁体   中英

PHP: How to prevent multiple execution of a code (if it is already in process)

Explanation

An API call (to another service) which usually takes 10-20 seconds to respond is stored in database,

After it is stored, System will try to use the API instantly to show the result to user, but it might fail (and display that it failed but we will try again automatically), therefore there is also a Cron Job set to run every 30 seconds and try the (failed) queries again.

If the API return success (whether in instant usage or using Cron Job) the flag is changed to success in database and it will not run again.

Issue

My problem is while the Instant Call to API is in process, the Cron Job might also try another call as it is not yet flagged as successful,

Also in rare cases, while the previous Cron Job is in process, the next Cron Job might run the code again.

What I have already tried to prevent the issue

I tried storing In Process API calls in a database table with Status=1 and delete them when the API call was successful or set status to 0 if it failed,

 if ($status === 0)
 {
     
     // Set Status to 1 in Database First (or die() if database update failed)
     
     // Then Call The API

     // If Failed Set Status to 0 so Cron Job can try again
     
     // If Successful Change Flag to success and remove from queue

 }
     

But what if the Instant Call and the Cron Job Call happen at the exact same time? they both check if status is 0 which it is, then both set status to 1 and execute the API Call...

Questions

  1. Is what I have tried the correct way to handle this?

  2. Should I worry about them happening at the exact time (the issue i explained in the Yellow Quote above) if there are a lot of calls (sometimes +500/sec)

Update Before Bounty

Isn't there really an easy way to handle such cases on the PHP side? if not, which way is better in experts' opinion? below are some methods but none of them are detailed enough and none of them have any Downvotes/Upvotes.

PS There are many updates/inserts to database, I don't think locking is an efficient idea and I'm not sure about the rest of ideas.

This is exactly why Semaphore was created for.

In php, it can be used in the following way : Using semaphores in PHP is actually very straight forward. There are only 4 semaphore functions:

sem_acquire() – Attempt to acquire control of a semaphore.
sem_get() – Creates (or gets if already present) a semaphore.
sem_release() – Releases the a semaphore if it is already acquired.
sem_remove() – Removes (deletes) a semaphore.

So how do they all work together?

  1. First, you call sem_get() to fetch the identifier for the semaphore.
  2. After that, one of your processes will call sem_acquire() to try and acquire the semaphore. If it's currently unavailable, sem_acquire() will block until the semaphore is released by another process.
  3. Once the semaphore is acquired, you may access the resource that you are controlling with it.
  4. After you are done with the resource, call sem_release() so that another process can acquire the semaphore.
  5. When all is said and done, and you've made sure that none of your processes require the semaphore anymore, you can call sem_remove() to remove the semaphore completely.

You can find more information and example about this in this article .

what I do in scripts is (pseudocode)

SCRIPT START
LOCK FILE 'MYPROCESSFILE.LOCK'
DO SOMETHING I WANT
UNLOCK FILE 'MYPROCESSFILE.LOCK'
SCRIPT END

So if the file is locked the second (duplicated) process wont run (will lock/halt/wait) UNTIL the file is UNLOCKED by the original process.

EDIT updated with WORKING PHP code

<?php

    class Locker {

        public $filename;
        private $_lock;

        public function __construct($filename) {
            $this->filename = $filename;
        }

        /**
         * locks relevant file
         */
        public function lock() {
                touch($this->filename);
                $this->_lock = fopen($this->filename, 'r');
                flock($this->_lock, LOCK_EX);
        }

        /**
         * unlock above file
         */
        public function unlock() {
                flock($this->_lock, LOCK_UN);
        }

    }

    $locker = new Locker('locker.lock');
    echo "Waiting\n";
    $locker->lock();
    echo "Sleeping\n";
    sleep(30);
    echo "Done\n";
    $locker->unlock();

?>

You need a proper queuing solution here. You can implement it yourself using a queue table and table locks to avoid different processes picking up the same job.

So you can pick up tasks from the queue table like this:

LOCK TABLES table WRITE;
SELECT * FORM table WHERE status = 0 LIMIT 1;
set status = 1 for the selected row
UNLOCK TABLES;

Locking the table will ensure that other processes don't do SELECTs and don't pick up the same row from the table.

Inserting the job to the queue as simple as this:

INSERT INTO table (job_id, status) VALUES(NULL, status);

Removing the job after processing is completed:

DELETE FROM table WHERE job_id = 12345;

在每个 cron 作业开始时,检查锁定文件是否存在,如果在 api 进程完成后取消链接此文件,则退出如果不在某个临时目录中的 crate 锁定文件退出。

Since you should know the times that the cron will run (say every 5 min) then for your user requested function could you check to see if the system time is exactly when a cron should be running? That would at lest prevent them from running at the exact same time.

I use this on Linux to see if a script is running when multiple execution needs to be avoided:

$output = array();
exec('pgrep -fl the_script.php', $output);

Then scan through $output and determine if it is already in progress.

For instance, here is copy/paste of existing code:

$exec_output = array();
exec('pgrep -fl archiver.php', $exec_output);
$pid_count = 0;
foreach ($exec_output as $line) {
    $parts = explode(' ', $line);
    if (basename($parts[2]) == 'archiver.php') $pid_count++;
}

Then do things based on $pid_count . The basename() check is to make sure I don't catch some other thing like special_archiver.php or whatever might exist. You can check for the full path as well.

Semaphores can be installed in php, and for signal control at a kernel level, it will control process locking atomically. Unix was designed to use this mechanism alongside other methods like signals for interprocess communication. Not sure you need to get that sophisticated.

It may work looking at the output of ps -ef but it may be subject to system load and process priority. You may find it works using a database flag, but why add overhead? Databases can get busy.

I would think a simple file is probably less prone to latency issues when you want to make 500 checks per second.

eg if the cron script starts with

if ( ! -f otherprocessisrunning)
then
   // create/open the file
   > cronprocessisrunning

   // when cron process finishes
   // it removes the cronprocessisrunning file
   rm -f cronprocessisrunning
else 
   sleep for 2 minutes
   call this function
fi

and the other script has the same behaviour in php that does this

if (! file_exist(cronprocessisrunning))
    > otherprocessisrunning
    start the other process
    when it is finished, remove otherprocessisrunning
endif

It should be fast enough as creating file handles (with no content) translates to a simple system call. If it is not, try it in bash shell.

I don't know if that can be a good way:

temp_queue Table
-----------------------
id --> Int, Index, Autoincrement
query_id --> Int (your query ID or something to identificate a specific query)
in_use_by --> varchar (cron or api)

Cron job:

Script starts

SELECT in_use_by FROM temp_queue ORDER_BY id ASC LIMIT 1;

if results != 0 return;

INSERT INTO temp_queue SET query_id=SOME_ID, in_use_by = 'cron';
SELECT in_use_by FROM temp_queue ORDER_BY id ASC LIMIT 1;

Then check the last SELECT results

if in_use_by == 'cron' continue
else return

When the execution ends:

DELETE FROM temp_queue WHERE query_id=SOME_ID

API job:

Script starts

SELECT in_use_by FROM temp_queue ORDER_BY id ASC LIMIT 1;

if results != 0 return;

INSERT INTO temp_queue SET query_id=SOME_ID, in_use_by = 'api';
SELECT in_use_by FROM temp_queue ORDER_BY id ASC LIMIT 1;

Then check the last SELECT results

if in_use_by == 'api' continue
else return

When the execution ends:

DELETE FROM temp_queue WHERE query_id=SOME_ID

What happen if the Cron Job & API try to call the query at exactly same time? They both will check for the 1st writed line with query_id=SOME_ID so only 1 of they with continue.

Yes, a lot of selects, inserts and deletes. But it works.

What do you guys think about that?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM