
How can PHP do centralized curl_multi requests?

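I currently have a website written in PHP that uses curl_multi to poll external APIs. The server forks child processes independently of the web requests, and this works well, but the logic is limited to a per-process basis.

Occasionally it hits bandwidth bottlenecks and needs better centralized queuing logic.

I am currently trying PHP IPC with a standalone background process that handles all outgoing requests, but I am stuck on things usually said to be beyond casual programmers: garbage collection, inter-process exception handling, request-response matching, and so on. Am I going the wrong way?

Is there a common practice (implementation theory), or even a library I can use?

EDIT

Using localhost TCP/IP communication would double the load of local traffic, which is definitely not a good approach.

I am currently working on an IPC message queue with a home-brewed protocol... and it does not look right at all. Any help would be much appreciated.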

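There are several different things to distinguish here:

  • Jobs: you have N jobs to process. The executed tasks may crash or hang; in any case, all jobs should be processed without any data loss.

  • Resources: you are processing your jobs on a single machine and/or a single connection, so you need to take care of your CPU and bandwidth.

  • Synchronization: if your processes interact with each other, you need to share information while handling concurrent data access.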


Controlling resources

Everybody wants to get on the bus...

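Because PHP has no built-in threads, we need to emulate mutexes. The principle is quite simple:

1. All jobs are put in a queue

2. There are N resources available in a pool, and no more

3. We iterate over the queue (a while loop over each job)

4. Before executing, a job asks the pool for a resource

5. If a resource is available, the job is executed

6. If there are no more resources, the pool hangs until a job finishes or is considered dead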
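The steps above can be sketched inside a single process, purely as an illustration. The SimplePool class and the job names are hypothetical and are not part of this answer's code; real jobs would be separate processes, and "a job finishes" is simulated here by releasing the oldest running job.

```php
<?php
// Hypothetical in-process illustration of the queue + resource pool principle.
// Real jobs would be separate processes; here a "job" is just a label.
final class SimplePool
{
    private $max;              // N: number of resources in the pool
    private $busy = array();   // jobs currently holding a resource

    public function __construct($max)
    {
        $this->max = $max;
    }

    // A job asks for a resource; returns false when the pool is exhausted
    public function acquire($job)
    {
        if (count($this->busy) >= $this->max) {
            return false;
        }
        $this->busy[$job] = true;
        return true;
    }

    public function release($job)
    {
        unset($this->busy[$job]);
    }

    public function busyJobs()
    {
        return array_keys($this->busy);
    }
}

$queue = array('job-a', 'job-b', 'job-c', 'job-d', 'job-e');
$pool = new SimplePool(2);
$finished = array();

while ($queue) {
    $job = array_shift($queue);
    // No free resource: the launcher has to wait until a running job finishes
    while (!$pool->acquire($job)) {
        $running = $pool->busyJobs();
        $done = $running[0];       // simulation: the oldest running job finishes
        $pool->release($done);
        $finished[] = $done;
    }
}

// Drain the jobs that are still running
foreach ($pool->busyJobs() as $job) {
    $pool->release($job);
    $finished[] = $job;
}

echo implode(', ', $finished), "\n";
```

At no point do more than two jobs hold a resource, which is the only guarantee the pool is meant to give.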

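How does a mutex work?

How to do this in PHP?

To proceed, we have several possibilities, but the principle is the same:

We have 2 programs:

  • The process launcher, which launches no more than N tasks simultaneously.
  • The process child, which represents a thread's context.

What does the process launcher look like?

The process launcher knows how many tasks should be run, and runs them without caring about their results. It only controls their execution (a process starts, finishes, or hangs while N are already running).

PHP Here I give you the idea; I will give you usable examples later: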

<?php
// launcher.php

require_once("ProcessesPool.php");

// The label identifies your process pool, it should be unique for your process launcher and your process children
$multi = new ProcessesPool($label = 'test');

// Initialize a new pool (creates the right directory or file, cleans a database or whatever you want)
// 10 is the maximum number of simultaneously run processes
if ($multi->create($max = '10') == false)
{
    echo "Pool creation failed ...\n";
    exit();
}

// We need to launch N processes, stored in $count
$count = 100; // maybe count($jobs)

// We execute all processes, one by one
for ($i = 0; ($i < $count); $i++)
{
    // The waitForResource method checks how many processes are already running,
    // and hangs until a resource is free or the maximum execution time is reached.
    $ret = $multi->waitForResource($timeout = 10, $interval = 500000);
    if ($ret)
    {
        // A resource is free, so we can run a new process
        echo "Execute new process: $i\n";
        exec("/usr/bin/php ./child.php $i > /dev/null &");
    }
    else
    {
        // Timeout is reached: we consider all children dead and kill them.
        echo "WaitForResources Timeout! Killing zombies...\n";
        $multi->killAllResources();
        break;
    }
}

// All processes have been launched, but this does not mean they have finished their work.
// It is important to wait for the last launched processes to avoid zombies.
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000);
if ($ret == false)
{
    echo "WaitForTheEnd Timeout! Killing zombies...\n";
    $multi->killAllResources();
}

// We destroy the process pool because all processes have run.
$multi->destroy();
echo "Finish.\n";

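So what about the process child, which simulates a thread's context?

The child (the executed job) has only 3 things to do:

  • tell the process launcher it has started
  • do its job
  • tell the process launcher it has finished

PHP It could contain something like this: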

<?php
// child.php

require_once("ProcessesPool.php");

// we create the *same* instance of the process pool
$multi = new ProcessesPool($label = 'test');

// child tells the launcher it started (there will be one more resource busy in pool)
$multi->start();

// here I simulate job's execution
sleep(rand() % 5 + 1);

// child tells the launcher it finished his job (there will be one more resource free in pool)
$multi->finish();

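Your usage example is nice, but where is the ProcessesPool class?

There are many ways to synchronize tasks, but it really depends on your requirements and constraints.

You can synchronize your tasks using:

  • a unique file
  • a database
  • a directory and several files
  • probably other methods (such as system IPC)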
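The "unique file" option is not implemented in this answer. As a rough sketch of what it might look like, a pool could keep its PID list as JSON in a single file protected by flock. The function name pool_file_update and the file layout are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: atomically update a PID list stored as JSON in one file.
// $change receives the current list and returns the new one.
function pool_file_update($path, callable $change)
{
    $fh = fopen($path, 'c+');      // read/write, create if missing, no truncation
    if ($fh === false) {
        return false;
    }
    if (!flock($fh, LOCK_EX)) {    // exclusive lock: one writer at a time
        fclose($fh);
        return false;
    }
    $raw = stream_get_contents($fh);
    $pids = ($raw === '' || $raw === false) ? array() : json_decode($raw, true);
    if (!is_array($pids)) {
        $pids = array();
    }
    $pids = array_values($change($pids));
    ftruncate($fh, 0);
    rewind($fh);
    fwrite($fh, json_encode($pids));
    fflush($fh);                   // flush before the lock is released
    flock($fh, LOCK_UN);
    fclose($fh);
    return $pids;
}

$path = tempnam(sys_get_temp_dir(), 'pool_');

// start(): two children register their PIDs; finish(): one of them leaves
pool_file_update($path, function ($pids) { $pids[] = 101; return $pids; });
pool_file_update($path, function ($pids) { $pids[] = 102; return $pids; });
$left = pool_file_update($path, function ($pids) {
    return array_diff($pids, array(101));
});

echo count($left), " process(es) still registered\n";
unlink($path);
```

Because every update is a read-modify-write under one exclusive lock, two children registering at the same moment cannot overwrite each other's entry. The cost is that all children contend for a single lock.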

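As we have already seen, we need at least 7 methods:

1. create() creates an empty pool

2. start() takes a resource from the pool

3. finish() releases a resource

4. waitForResource() hangs if no more resources are available

5. killAllResources() gets all the jobs launched in the pool and kills them

6. waitForTheEnd() hangs until no resource is busy

7. destroy() destroys the pool

So let's start by creating an abstract class; we will then be able to implement it using any of the methods above.

PHP AbstractProcessesPool.php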

<?php

// AbstractProcessesPool.php

abstract class AbstractProcessesPool
{

    abstract protected function _createPool();

    abstract protected function _cleanPool();

    abstract protected function _destroyPool();

    abstract protected function _getPoolAge();

    abstract protected function _countPid();

    abstract protected function _addPid($pid);

    abstract protected function _removePid($pid);

    abstract protected function _getPidList();

    protected $_label;
    protected $_max;
    protected $_pid;

    public function __construct($label)
    {
        $this->_max = 0;
        $this->_label = $label;
        $this->_pid = getmypid();
    }

    public function getLabel()
    {
        return ($this->_label);
    }

    public function create($max = 20)
    {
        $this->_max = $max;
        $ret = $this->_createPool();
        return $ret;
    }

    public function destroy()
    {
        $ret = $this->_destroyPool();
        return $ret;
    }

    public function waitForResource($timeout = 120, $interval = 500000, $callback = null)
    {
        // let enough time for children to take a resource
        usleep(200000);
        while (true)
        {
            if (($callback != null) && (is_callable($callback)))
            {
                call_user_func($callback, $this);
            }
            $age = $this->_getPoolAge();
            if ($age == -1)
            {
                return false;
            }
            if ($age > $timeout)
            {
                return false;
            }
            $count = $this->_countPid();
            if ($count == -1)
            {
                return false;
            }
            if ($count < $this->_max)
            {
                break;
            }
            usleep($interval);
        }
        return true;
    }

    public function waitForTheEnd($timeout = 3600, $interval = 500000, $callback = null)
    {
        // let enough time to the last child to take a resource
        usleep(200000);
        while (true)
        {
            if (($callback != null) && (is_callable($callback)))
            {
                call_user_func($callback, $this);
            }
            $age = $this->_getPoolAge();
            if ($age == -1)
            {
                return false;
            }
            if ($age > $timeout)
            {
                return false;
            }
            $count = $this->_countPid();
            if ($count == -1)
            {
                return false;
            }
            if ($count == 0)
            {
                break;
            }
            usleep($interval);
        }
        return true;
    }

    public function start()
    {
        $ret = $this->_addPid($this->_pid);
        return $ret;
    }

    public function finish()
    {
        $ret = $this->_removePid($this->_pid);
        return $ret;
    }

    public function killAllResources($code = 9)
    {
        $pids = $this->_getPidList();
        if ($pids == false)
        {
            $this->_cleanPool();
            return false;
        }
        foreach ($pids as $pid)
        {
            $pid = intval($pid);
            posix_kill($pid, $code);
            if ($this->_removePid($pid) == false)
            {
                return false;
            }
        }
        return true;
    }

}

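Synchronization using a directory and several files

If you want to use the directory method (on a /dev/ram1 partition, for example), the implementation will be:

1. create() creates an empty directory named after the given $label

2. start() creates a file in the directory, named after the child's pid

3. finish() deletes the child's file

4. waitForResource() counts the files in that directory

5. killAllResources() reads the directory content and kills every pid

6. waitForTheEnd() reads the directory until no files are left

7. destroy() removes the directory

This method may look cheap, but it is really efficient if you want to run a hundred tasks simultaneously without opening as many database connections as there are jobs to execute.

Implementation

PHP ProcessesPoolFiles.php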

<?php

// ProcessesPoolFiles.php

class ProcessesPoolFiles extends AbstractProcessesPool
{

    protected $_dir;

    public function __construct($label, $dir)
    {
        parent::__construct($label);
        if ((!is_dir($dir)) || (!is_writable($dir)))
        {
            throw new Exception("Directory '{$dir}' does not exist or is not writable.");
        }
        $sha1 = sha1($label);
        $this->_dir = "{$dir}/pool_{$sha1}";
    }

    protected function _createPool()
    {
        if ((!is_dir($this->_dir)) && (!mkdir($this->_dir, 0777)))
        {
            throw new Exception("Could not create '{$this->_dir}'");
        }
        if ($this->_cleanPool() == false)
        {
            return false;
        }
        return true;
    }

    protected function _cleanPool()
    {
        $dh = opendir($this->_dir);
        if ($dh == false)
        {
            return false;
        }
        while (($file = readdir($dh)) !== false)
        {
            if (($file != '.') && ($file != '..'))
            {
                if (unlink($this->_dir . '/' . $file) == false)
                {
                    return false;
                }
            }
        }
        closedir($dh);
        return true;
    }

    protected function _destroyPool()
    {
        if ($this->_cleanPool() == false)
        {
            return false;
        }
        if (!rmdir($this->_dir))
        {
            return false;
        }
        return true;
    }

    protected function _getPoolAge()
    {
        $age = -1;
        $count = 0;
        $dh = opendir($this->_dir);
        if ($dh == false)
        {
            return -1; // callers test for -1, not false
        }
        while (($file = readdir($dh)) !== false)
        {
            if (($file != '.') && ($file != '..'))
            {
                $stat = @stat($this->_dir . '/' . $file);
                // The file may vanish between readdir() and stat() when a child finishes
                if ($stat !== false)
                {
                    if ($stat['mtime'] > $age)
                    {
                        $age = $stat['mtime'];
                    }
                    $count++;
                }
            }
        }
        closedir($dh);
        clearstatcache();
        return (($count > 0) ? (@time() - $age) : (0));
    }

    protected function _countPid()
    {
        $count = 0;
        $dh = opendir($this->_dir);
        if ($dh == false)
        {
            return -1;
        }
        while (($file = readdir($dh)) !== false)
        {
            if (($file != '.') && ($file != '..'))
            {
                $count++;
            }
        }
        closedir($dh);
        return $count;
    }

    protected function _addPid($pid)
    {
        $file = $this->_dir . "/" . $pid;
        if (is_file($file))
        {
            return true;
        }
        $file = fopen($file, 'w');
        if ($file == false)
        {
            return false;
        }
        fclose($file);
        return true;
    }

    protected function _removePid($pid)
    {
        $file = $this->_dir . "/" . $pid;
        if (!is_file($file))
        {
            return true;
        }
        if (unlink($file) == false)
        {
            return false;
        }
        return true;
    }

    protected function _getPidList()
    {
        $array = array ();
        $dh = opendir($this->_dir);
        if ($dh == false)
        {
            return false;
        }
        while (($file = readdir($dh)) !== false)
        {
            if (($file != '.') && ($file != '..'))
            {
                $array[] = $file;
            }
        }
        closedir($dh);
        return $array;
    }

}

PHP Demo, the process launcher:

<?php

// pool_files_launcher.php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");

$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");

if ($multi->create($max = '10') == false)
{
    echo "Pool creation failed ...\n";
    exit();
}

$count = 20;

for ($i = 0; ($i < $count); $i++)
{
    $ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
    if ($ret)
    {
        echo "Execute new process: $i\n";
        exec("/usr/bin/php ./pool_files_calc.php $i > /dev/null &");
    }
    else
    {
        echo "WaitForResources Timeout! Killing zombies...\n";
        $multi->killAllResources();
        break;
    }
}

$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
    echo "WaitForTheEnd Timeout! Killing zombies...\n";
    $multi->killAllResources();
}

$multi->destroy();
echo "Finish.\n";

function test_waitForResource($multi)
{
    echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}

function test_waitForTheEnd($multi)
{
    echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}

PHP Demo, the process child:

<?php

// pool_files_calc.php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");

$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");

$multi->start();

// here I simulate job's execution
sleep(rand() % 7 + 1);

$multi->finish();

Synchronization using a database

MySQL If you prefer the database method, you will need a table like this:

CREATE TABLE `processes_pool` (
  `label` varchar(40) PRIMARY KEY,
  `nb_launched` mediumint(6) unsigned NOT NULL,
  `pid_list` varchar(2048) default NULL,
  `updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

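Then the implementation will be something like this:

1. create() inserts a new row in the table above

2. start() inserts a pid into the pid list

3. finish() removes a pid from the pid list

4. waitForResource() reads the nb_launched field

5. killAllResources() gets and kills every pid

6. waitForTheEnd() hangs and checks regularly until nb_launched equals 0

7. destroy() removes the row

Implementation

PHP ProcessesPoolMySQL.php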

<?php

// ProcessesPoolMySQL.php

class ProcessesPoolMySQL extends AbstractProcessesPool
{

    protected $_sql;

    public function __construct($label, PDO $sql)
    {
        parent::__construct($label);
        $this->_sql = $sql;
        $this->_label = sha1($label);
    }

    protected function _createPool()
    {
        $request = "
            INSERT IGNORE INTO processes_pool
            VALUES ( ?, ?, NULL, CURRENT_TIMESTAMP )
        ";
        $this->_query($request, $this->_label, 0);
        return $this->_cleanPool();
    }

    protected function _cleanPool()
    {
        $request = "
            UPDATE processes_pool
            SET
                nb_launched = ?,
                pid_list = NULL,
                updated = CURRENT_TIMESTAMP
            WHERE label = ?
        ";
        $this->_query($request, 0, $this->_label);
        return true;
    }

    protected function _destroyPool()
    {
        $request = "
            DELETE FROM processes_pool
            WHERE label = ?
        ";
        $this->_query($request, $this->_label);
        return true;
    }

    protected function _getPoolAge()
    {
        $request = "
            SELECT TIMESTAMPDIFF(SECOND, updated, CURRENT_TIMESTAMP) AS age
            FROM processes_pool
            WHERE label = ?
         ";
        $ret = $this->_query($request, $this->_label);
        if (!is_array($ret)) // null on query failure, false when the row is missing
        {
            return -1;
        }
        return $ret['age'];
    }

    protected function _countPid()
    {
        $req = "
            SELECT nb_launched AS nb
            FROM processes_pool
            WHERE label = ?
        ";
        $ret = $this->_query($req, $this->_label);
        if (!is_array($ret)) // null on query failure, false when the row is missing
        {
            return -1;
        }
        return $ret['nb'];
    }

    protected function _addPid($pid)
    {
        $request = "
            UPDATE processes_pool
            SET
                nb_launched = (nb_launched + 1),
                pid_list = CONCAT_WS(',', (SELECT IF(LENGTH(pid_list) = 0, NULL, pid_list )), ?),
                updated = CURRENT_TIMESTAMP
            WHERE label = ?
        ";
        $this->_query($request, $pid, $this->_label);
        return true;
    }

    protected function _removePid($pid)
    {
        $req = "
            UPDATE processes_pool
            SET
                nb_launched = (nb_launched - 1),
                pid_list =
                    CONCAT_WS(',', (SELECT IF (LENGTH(
                        SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1))) = 0, null,
                            SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1)))), (SELECT IF (LENGTH(
                                SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))))) = 0, null,
                                    SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))
                                )
                            )
                        )
                    )
                 ),
                updated = CURRENT_TIMESTAMP
            WHERE label = ?";
        $this->_query($req, $pid, $pid, $pid, $pid, $this->_label);
        return true;
    }

    protected function _getPidList()
    {
        $req = "
            SELECT pid_list
            FROM processes_pool
            WHERE label = ?
        ";
        $ret = $this->_query($req, $this->_label);
        if (!is_array($ret)) // null on query failure, false when the row is missing
        {
            return false;
        }
        if ($ret['pid_list'] == null)
        {
            return array();
        }
        $pid_list = explode(',', $ret['pid_list']);
        return $pid_list;
    }

    protected function _query($request)
    {
        $return = null;

        $stmt = $this->_sql->prepare($request);
        if ($stmt === false)
        {
            return $return;
        }

        $params = func_get_args();
        array_shift($params);

        if ($stmt->execute($params) === false)
        {
            return $return;
        }

        if (strncasecmp(trim($request), 'SELECT', 6) === 0)
        {
            $return = $stmt->fetch(PDO::FETCH_ASSOC);
        }

        return $return;
    }

}
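The nested SUBSTRING_INDEX expression in _removePid() is hard to audit. As an alternative sketch only (this is not the approach used above), the same list edit can be expressed in PHP. The function name remove_pid_from_list is hypothetical, and a real implementation would still have to read and write the row inside a transaction (for example with SELECT ... FOR UPDATE) to stay safe under concurrency.

```php
<?php
// Hypothetical helper: remove one PID from a comma-separated pid_list value.
// Returns null for an empty list, matching the column's NULL default.
function remove_pid_from_list($pidList, $pid)
{
    if ($pidList === null || $pidList === '') {
        return null;
    }
    $kept = array();
    $removed = false;
    foreach (explode(',', $pidList) as $p) {
        // Drop only the first occurrence, the one FIND_IN_SET would locate
        if (!$removed && (string) $p === (string) $pid) {
            $removed = true;
            continue;
        }
        $kept[] = $p;
    }
    return $kept ? implode(',', $kept) : null;
}

var_dump(remove_pid_from_list('101,102,103', 102));
var_dump(remove_pid_from_list('101', 101));
```

Doing the edit in PHP costs an extra round trip to the database, but it makes the edge cases (first element, last element, absent pid, empty list) easy to test in isolation.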

PHP Demo, the process launcher:

<?php

// pool_mysql_launcher.php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");

$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$multi = new ProcessesPoolMySQL($label = 'test', $dbh);

if ($multi->create($max = '10') == false)
{
    echo "Pool creation failed ...\n";
    exit();
}

$count = 20;

for ($i = 0; ($i < $count); $i++)
{
    $ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
    if ($ret)
    {
        echo "Execute new process: $i\n";
        exec("/usr/bin/php ./pool_mysql_calc.php $i > /dev/null &");
    }
    else
    {
        echo "WaitForResources Timeout! Killing zombies...\n";
        $multi->killAllResources();
        break;
    }
}

$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
    echo "WaitForTheEnd Timeout! Killing zombies...\n";
    $multi->killAllResources();
}

$multi->destroy();
echo "Finish.\n";

function test_waitForResource($multi)
{
    echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}

function test_waitForTheEnd($multi)
{
    echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}

PHP Demo, the process child:

<?php

// pool_mysql_calc.php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");

$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$multi = new ProcessesPoolMySQL($label = 'test', $dbh);

$multi->start();

// here I simulate job's execution
sleep(rand() % 7 + 1);

$multi->finish();

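What is the output of the code above?

Demo output: fortunately, these demos give about the same output. If the timeout is not reached (the dream case), the output is: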

KolyMac:TaskManager ninsuo$ php pool_files_launcher.php 
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 10
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 11
Waiting for available resource ( test )...
Execute new process: 12
Waiting for available resource ( test )...
Execute new process: 13
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 14
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 15
Waiting for available resource ( test )...
Execute new process: 16
Waiting for available resource ( test )...
Execute new process: 17
Waiting for available resource ( test )...
Execute new process: 18
Waiting for available resource ( test )...
Execute new process: 19
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Finish.

Demo output: in a worse case (I changed sleep(rand() % 7 + 1); to sleep(rand() % 7 + 100);), this gives:

KolyMac:TaskManager ninsuo$ php pool_files_launcher.php 
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
(...)
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
WaitForResources Timeout! Killing zombies...
Waiting for all resources to finish ( test )...
Finish.

Go to page 2 to continue reading this answer.

Page 2: the body of an SO answer is limited to 30k characters, so I needed to create a new one.


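Keeping control over results

No error allowed!

YEAH! You can launch lots of processes without worrying about resources. But what if a child process fails? There will be an undone or incomplete job!

Actually, this is simpler (much simpler) than controlling process execution. We have a queue of jobs executed using the pool, and we only need to know, after execution, whether each one failed or succeeded. If there are failures once the whole pool has been executed, the failed jobs are put in a new pool and executed again.

Results control

How to proceed in PHP?

This principle is based on clusters: the queue contains several jobs, but represents only one entity. Every calculation of the cluster must succeed for that entity to be complete.

Roadmap: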

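1. We create a todo list (not to be confused with the queue, which is used for process management) containing all the calculations of the cluster. Each job has a status: waiting (not executed), running (executed but not finished), success, or failed (according to its result). At this step, of course, every status is WAITING.

2. We run all jobs using the process manager (to keep control over resources). Each job starts by telling the task manager it is running and, depending on its own context, finishes by reporting its status (failed or success).

3. When the whole queue has been executed, the task manager creates a new queue containing the failed jobs, and loops again.

4. When all jobs have succeeded, you are done, and you are sure nothing went wrong. Your cluster is complete and your entity is usable at the upper level.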
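The roadmap can be sketched in memory, leaving processes and the database aside. Everything here is hypothetical: run_job stands in for a real child process, and it deliberately fails the first attempt of some jobs so that the retry loop is exercised in a deterministic way.

```php
<?php
// Hypothetical stand-in for a child process: jobs whose number is a multiple
// of 3 fail on their first attempt and succeed afterwards.
function run_job($job, array &$attempts)
{
    $attempts[$job] = isset($attempts[$job]) ? $attempts[$job] + 1 : 1;
    return !($job % 3 === 0 && $attempts[$job] === 1);
}

// 1. The todo list: every job starts as "waiting"
$status = array();
for ($job = 0; $job < 10; $job++) {
    $status[$job] = 'waiting';
}

$attempts = array();
$rounds = 0;
$todo = array_keys($status);

// 3. Loop until no job is left in the "failed" state
while ($todo) {
    $rounds++;
    foreach ($todo as $job) {
        // 2. Each job reports "running", then "success" or "failed"
        $status[$job] = 'running';
        $status[$job] = run_job($job, $attempts) ? 'success' : 'failed';
    }
    // Build the next queue from the failed jobs only
    $todo = array_keys($status, 'failed', true);
}

// 4. Every job succeeded: the cluster is complete
echo "Done in {$rounds} rounds\n";
```

A production version would probably cap the number of rounds, since a job that always fails would otherwise keep this loop running forever.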

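Proof of concept

There is not much more to say on this topic, so let's write some code, continuing the previous sample code.

As for process management, you can use several ways to synchronize parent and children, but there is no logic involved, so no abstraction is needed. I therefore developed a MySQL example (faster to write); you are free to adapt the concept to your own requirements and constraints.

MySQL Create the following table: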

CREATE TABLE `tasks_manager` (
  `cluster_label` varchar(40),
  `calcul_label` varchar(40),
  `status` enum('waiting', 'running', 'failed', 'success') default 'waiting',
  `updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  PRIMARY KEY  (`cluster_label`, `calcul_label`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

PHP Here is the TasksManager.php file:

<?php

class TasksManager
{

    protected $_cluster_label;
    protected $_calcul_label;
    protected $_sql;

    const WAITING = "waiting";
    const RUNNING = "running";
    const SUCCESS = "success";
    const FAILED = "failed";

    public function __construct($label, PDO $sql)
    {
        $this->_sql = $sql;
        $this->_cluster_label = substr($label, 0, 40);
    }

    public function getClusterLabel()
    {
        return $this->_cluster_label;
    }

    public function getCalculLabel()
    {
        return $this->_calcul_label;
    }

    public function destroy()
    {
        $request = "
            DELETE FROM tasks_manager
            WHERE cluster_label = ?
        ";
        $this->_query($request, $this->_cluster_label);
        return $this;
    }

    public function start($calcul_label)
    {
        $this->_calcul_label = $calcul_label;
        $this->add($calcul_label, TasksManager::RUNNING);
        return $this;
    }

    public function finish($status = TasksManager::SUCCESS)
    {
        if (!$this->_isStatus($status))
        {
            throw new Exception("{$status} is not a valid status.");
        }
        if (is_null($this->_calcul_label))
        {
            throw new Exception("finish() called, but task never started.");
        }
        $request = "
            UPDATE tasks_manager
            SET status = ?
            WHERE cluster_label = ?
            AND calcul_label = ?
         ";
        $this->_query($request, $status, $this->_cluster_label, substr($this->_calcul_label, 0, 40));
        return $this;
    }

    public function add($calcul_label, $status = TasksManager::WAITING)
    {
        if (!$this->_isStatus($status))
        {
            throw new Exception("{$status} is not a valid status.");
        }
        $request = "
            INSERT INTO tasks_manager (
                cluster_label, calcul_label, status
            ) VALUES (
                ?, ?, ?
            )
            ON DUPLICATE KEY UPDATE
                status = ?
        ";
        $calcul_label = substr($calcul_label, 0, 40);
        $this->_query($request, $this->_cluster_label, $calcul_label, $status, $status);
        return $this;
    }

    public function delete($calcul_label)
    {
        $request = "
            DELETE FROM tasks_manager
            WHERE cluster_label = ?
            AND calcul_label = ?
        ";
        $this->_query($request, $this->_cluster_label, substr($calcul_label, 0, 40));
        return $this;
    }

    public function countStatus($status = TasksManager::SUCCESS)
    {
        if (!$this->_isStatus($status))
        {
            throw new Exception("{$status} is not a valid status.");
        }
        $request = "
            SELECT COUNT(*) AS cnt
            FROM tasks_manager
            WHERE cluster_label = ?
            AND status = ?
        ";
        $ret = $this->_query($request, $this->_cluster_label, $status);
        return $ret[0]['cnt'];
    }

    public function count()
    {
        $request = "
            SELECT COUNT(*) AS cnt
            FROM tasks_manager
            WHERE cluster_label = ?
        ";
        $ret = $this->_query($request, $this->_cluster_label);
        return $ret[0]['cnt'];
    }

    public function getCalculsByStatus($status = TasksManager::SUCCESS)
    {
        if (!$this->_isStatus($status))
        {
            throw new Exception("{$status} is not a valid status.");
        }
        $request = "
            SELECT calcul_label
            FROM tasks_manager
            WHERE cluster_label = ?
            AND status = ?
        ";
        $ret = $this->_query($request, $this->_cluster_label, $status);
        $array = array();
        if (!is_null($ret))
        {
            $array = array_map(function($row) {
                return $row['calcul_label'];
            }, $ret);
        }
        return $array;
    }

    public function switchStatus($statusA = TasksManager::RUNNING, $statusB = null)
    {
        if (!$this->_isStatus($statusA))
        {
            throw new Exception("{$statusA} is not a valid status.");
        }
        if ((!is_null($statusB)) && (!$this->_isStatus($statusB)))
        {
            throw new Exception("{$statusB} is not a valid status.");
        }
        if ($statusB != null)
        {
            $request = "
                UPDATE tasks_manager
                SET status = ?
                WHERE cluster_label = ?
                AND status = ?
            ";
            $this->_query($request, $statusB, $this->_cluster_label, $statusA);
        }
        else
        {
            $request = "
                UPDATE tasks_manager
                SET status = ?
                WHERE cluster_label = ?
            ";
            $this->_query($request, $statusA, $this->_cluster_label);
        }
        return $this;
    }

    private function _isStatus($status)
    {
        if (!is_string($status))
        {
            return false;
        }
        return in_array($status, array(
                self::FAILED,
                self::RUNNING,
                self::SUCCESS,
                self::WAITING,
        ));
    }

    protected function _query($request)
    {
        $return = null;

        $stmt = $this->_sql->prepare($request);
        if ($stmt === false)
        {
            return $return;
        }

        $params = func_get_args();
        array_shift($params);

        if ($stmt->execute($params) === false)
        {
            return $return;
        }

        if (strncasecmp(trim($request), 'SELECT', 6) === 0)
        {
            $return = $stmt->fetchAll(PDO::FETCH_ASSOC);
        }

        return $return;
    }

}

PHP task_launcher.php is a usage example:

<?php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");

// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
$pool->create($max = "10");

// Initializing task manager
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->destroy();

// Simulating jobs
$count = 20;
$todo_list = array ();
for ($i = 0; ($i < $count); $i++)
{
    $todo_list[$i] = "Job {$i}";
    $multi->add($todo_list[$i], TasksManager::WAITING);
}

// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
    $continue = false;

    echo "Starting to run jobs in queue ...\n";

    // put all failed jobs to WAITING status
    $multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);

    foreach ($todo_list as $job)
    {

        $ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");

        if ($ret)
        {
            echo "Executing job: $job\n";
            exec(sprintf("/usr/bin/php ./tasks_program.php %s > /dev/null &", escapeshellarg($job)));
        }
        else
        {
            echo "waitForResource timeout!\n";
            $pool->killAllResources();

            // All jobs currently running are considered dead, so, failed
            $multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);

            break;
        }
    }

    $ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
    if ($ret == false)
    {
        echo "waitForTheEnd timeout!\n";
        $pool->killAllResources();

        // All jobs currently running are considered dead, so, failed
        $multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
    }


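    echo "All jobs in queue executed, looking for errors...\n";

    // Jobs still WAITING were never launched (or died before reporting), so retry them too
    $multi->switchStatus(TasksManager::WAITING, TasksManager::FAILED);

    // Count the failures
    $nb_failed = $multi->countStatus(TasksManager::FAILED);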
    if ($nb_failed > 0)
    {
        $todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
        echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
        $continue = true;
    }
}

function waitResource($multi)
{
    echo "Waiting for a resource ....\n";
}

function waitEnd($multi)
{
    echo "Waiting for the end .....\n";
}

// All jobs finished, destroying task manager
$multi->destroy();

// Destroying process pool
$pool->destroy();

echo "Finish.\n";

Here is the child program, tasks_program.php (the calculation):

<?php

if (!isset($argv[1]))
{
    die("This program must be called with an identifier (calcul_label)\n");
}
$calcul_label = $argv[1];

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");

// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);

// Takes one resource in pool
$pool->start();

// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->start($calcul_label);

// Simulating execution time
$secs = (rand() % 2) + 3;
sleep($secs);

// Simulating job status
$status = rand() % 3 == 0 ? TasksManager::FAILED : TasksManager::SUCCESS;

// The job finishes and reports its status
$multi->finish($status);

// Releasing pool's resource
$pool->finish();
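
One weakness of the child program above: if the job throws an exception, neither `$multi->finish()` nor `$pool->finish()` is reached, and the resource stays taken until the launcher's timeout kills it. A minimal sketch of a guard follows. The `runGuarded()` helper and its three callables are hypothetical stand-ins for the job body, `$multi->finish($status)` and `$pool->finish()`; they are not part of the classes above.

```php
<?php

/**
 * Runs $job and always reports a status and releases the resource,
 * even if $job throws. All three callables are hypothetical stand-ins
 * for the job body, $multi->finish($status) and $pool->finish().
 */
function runGuarded(callable $job, callable $report, callable $release)
{
    $status = 'failed';
    try {
        $job();
        $status = 'success';
    } catch (Exception $e) {
        // The job crashed: keep the 'failed' status so the launcher can re-queue it
        error_log("Job crashed: " . $e->getMessage());
    } finally {
        // Always executed: report the status, then free the pool resource
        $report($status);
        $release();
    }
    return $status;
}

// Usage with dummy callables that only record what happened
$log = array();
runGuarded(
    function () { throw new RuntimeException("API unreachable"); },
    function ($status) use (&$log) { $log[] = "status=$status"; },
    function () use (&$log) { $log[] = "released"; }
);
print_r($log);
```

In the real child program the second and third callables would wrap the task manager and pool calls, so a crashing job frees its slot immediately instead of waiting for the timeout.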

Demo output: the demo prints something like this (too large to include on SO).


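Synchronization and communication between processes

Example of a communication error

There are solutions for communicating easily and without errors.

Grab a coffee, we are near the end!

We are now able to launch a large number of processes, and they all give the expected result, which is not bad. But for now, all our processes run independently and cannot actually communicate with each other. That is your core problem, and there are many solutions.

It is hard to tell exactly what kind of communication you need. You describe what you have tried (IPC, communicating through files, or a home-made protocol), but not the kind of information shared between your processes. In any case, I invite you to consider an OOP solution.

PHP is powerful.

PHP has magic methods:

  • __get($property) lets us intercept read access to an undeclared or inaccessible $property of an object
  • __set($property, $value) lets us intercept the assignment of a value to an undeclared or inaccessible $property of an object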
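
To make the two magic methods concrete, here is a small sketch. The `Recorder` class is hypothetical and only records when PHP calls `__get` and `__set`.

```php
<?php

// Hypothetical class, only to show when __get and __set are called
class Recorder
{
    private $_data = array();
    public $calls = array();

    public function __get($property)
    {
        $this->calls[] = "get {$property}";
        return isset($this->_data[$property]) ? $this->_data[$property] : null;
    }

    public function __set($property, $value)
    {
        $this->calls[] = "set {$property}";
        $this->_data[$property] = $value;
    }
}

$r = new Recorder();
$r->answer = 42;         // undeclared property: PHP calls __set
$value = $r->answer;     // PHP calls __get
$missing = $r->unknown;  // PHP calls __get, which returns null here

print_r($r->calls);
```

Declared public properties such as `$calls` are accessed directly; the magic methods only run for properties that are undeclared or not accessible from the calling scope.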

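PHP can handle files with concurrent access management:

  • fopen($file, 'c+') opens a file for reading and writing without truncating it (creating it if needed), which lets you take an advisory lock with flock before modifying it
  • flock($descriptor, LOCK_SH) takes a shared lock (for reading)
  • flock($descriptor, LOCK_EX) takes an exclusive lock (for writing)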
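
As an illustration of the locking pattern, here is a minimal sketch. The `lockedWrite()` and `lockedRead()` helpers are hypothetical names introduced for this example; the calls they use (`fopen`, `flock`, `ftruncate`, `fflush`, `stream_get_contents`) are standard PHP functions.

```php
<?php

// Hypothetical helpers showing the fopen('c+') + flock() pattern

function lockedWrite($file, $data)
{
    $fd = fopen($file, 'c+');    // create if missing, do not truncate yet
    if ($fd === false) {
        return false;
    }
    if (!flock($fd, LOCK_EX)) {  // blocks until no other process holds a lock
        fclose($fd);
        return false;
    }
    ftruncate($fd, 0);           // safe now: we hold the exclusive lock
    rewind($fd);
    fwrite($fd, $data);
    fflush($fd);                 // flush before releasing the lock
    flock($fd, LOCK_UN);
    fclose($fd);
    return true;
}

function lockedRead($file)
{
    $fd = fopen($file, 'c+');
    if ($fd === false) {
        return false;
    }
    if (!flock($fd, LOCK_SH)) {  // several readers may hold a shared lock at once
        fclose($fd);
        return false;
    }
    $data = stream_get_contents($fd);
    flock($fd, LOCK_UN);
    fclose($fd);
    return $data;
}

$file = tempnam(sys_get_temp_dir(), 'lock');
lockedWrite($file, "a longer first value");
lockedWrite($file, "short");
echo lockedRead($file), "\n";
unlink($file);
```

The locks are advisory: they only protect the file against other processes that also call `flock` before touching it.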

Finally, PHP has:

  • json_encode($object) to get the JSON representation of an object
  • json_decode($string) to get an object back from a JSON string
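
A short round trip shows how an object survives being stored as text. The property names below are only illustrative.

```php
<?php

// Build a plain object, as the shared data object will be
$shared = new stdClass();
$shared->result1 = 1;
$shared->result2 = 3;

// Object -> JSON string
$json = json_encode($shared);
echo $json, "\n"; // {"result1":1,"result2":3}

// JSON string -> stdClass object
$restored = json_decode($json);
var_dump($restored instanceof stdClass);
echo $restored->result2, "\n";

// Empty or invalid input gives null, so callers should check the result
var_dump(json_decode(''));
```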

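See where I am going? We will create a Synchro class that works the same way as stdClass, but is always safely synchronized to a file. Our processes will then be able to access the same instance of that object at the same time.

A few Linux system tricks

Of course, if you have 150 processes working on the same file at the same time, your hard drive will slow them down. To work around this, why not create a filesystem partition in RAM? Writing to that file will then be about as fast as writing to memory!

In a shell, as root, type the following commands: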

mkfs -q /dev/ram1 65536
mkdir -p /ram
mount /dev/ram1 /ram

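A few notes:

  • 65536 is in kilobytes, so this gives you a 64 MB partition.

  • If you want the partition mounted at startup, put these commands in a shell script and call it from the /etc/rc.local file.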
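
Since the RAM partition may not be mounted on every machine, the PHP side can pick its file location defensively. A minimal sketch, assuming the `/ram` mount point used above; the `synchroPath()` helper is a hypothetical name.

```php
<?php

// Hypothetical helper: prefer the RAM partition when it is mounted and
// writable, otherwise fall back to the system temporary directory.
function synchroPath($name, $ramDir = '/ram')
{
    $dir = (is_dir($ramDir) && is_writable($ramDir)) ? $ramDir : sys_get_temp_dir();
    return rtrim($dir, '/') . '/' . $name;
}

echo synchroPath('synchro.txt'), "\n";
```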

Implementation

Here is the Synchro.php class.

<?php

class Synchro
{

   private $_file;

   public function __construct($file)
   {
       $this->_file = $file;
   }

   public function __get($property)
   {
       // File does not exist
       if (!is_file($this->_file))
       {
           return null;
       }

       // Check if file is readable
       if ((is_file($this->_file)) && (!is_readable($this->_file)))
       {
           throw new Exception(sprintf("File '%s' is not readable.", $this->_file));
       }

       // Open for reading and writing without truncating (creates the file if missing)
       if (($fd = fopen($this->_file, 'c+')) === false)
       {
           throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
       }

       // Request a lock for reading (hangs until lock is granted successfully)
       if (flock($fd, LOCK_SH) === false)
       {
           throw new Exception(sprintf("Can't lock '%s' file for reading.", $this->_file));
       }

       // Read the whole file (stop on EOF or on a read error)
       $contents = '';
       while (($read = fread($fd, 32 * 1024)) !== false && $read !== '')
       {
           $contents .= $read;
       }

       // Release shared lock and close file
       flock($fd, LOCK_UN);
       fclose($fd);

       // Restore shared data object and return requested property
       // (json_decode returns null when the file is empty or not valid JSON)
       $object = json_decode($contents);
       if (is_object($object) && property_exists($object, $property))
       {
           return $object->{$property};
       }

       return null;
   }

   public function __set($property, $value)
   {
       // Check if directory is writable if file does not exist
       if ((!is_file($this->_file)) && (!is_writable(dirname($this->_file))))
       {
           throw new Exception(sprintf("Directory '%s' does not exist or is not writable.", dirname($this->_file)));
       }

       // Check if file is writable if it exists
       if ((is_file($this->_file)) && (!is_writable($this->_file)))
       {
           throw new Exception(sprintf("File '%s' is not writable.", $this->_file));
       }

       // Open for reading and writing without truncating (creates the file if missing)
       if (($fd = fopen($this->_file, 'c+')) === false)
       {
           throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
       }

       // Request a lock for writing (hangs until lock is granted successfully)
       if (flock($fd, LOCK_EX) === false)
       {
           throw new Exception(sprintf("Can't lock '%s' file for writing.", $this->_file));
       }

       // Read the whole file (stop on EOF or on a read error)
       $contents = '';
       while (($read = fread($fd, 32 * 1024)) !== false && $read !== '')
       {
           $contents .= $read;
       }

       // Restore shared data object (start from an empty one if the file
       // is empty or does not contain a JSON object), then set the property
       $object = json_decode($contents);
       if (!is_object($object))
       {
           $object = new stdClass();
       }
       $object->{$property} = $value;

       // Go back to the beginning of the file
       rewind($fd);

       // Empty the file so no bytes of a longer previous version remain
       ftruncate($fd, 0);

       // Save shared data object to the file
       fwrite($fd, json_encode($object));

       // Flush the write, then release exclusive lock and close file
       fflush($fd);
       flock($fd, LOCK_UN);
       fclose($fd);

       return $value;
   }

}
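
One caveat about the class above: `__get` and `__set` each take and release their own lock, so a statement such as `$synchro->counter = $synchro->counter + 1` is not atomic across processes, and concurrent increments can be lost. A process that only writes its own property is not affected. For shared counters, a sketch of a read-modify-write done under a single exclusive lock follows. The `synchroUpdate()` function is a hypothetical helper, not part of the Synchro class.

```php
<?php

/**
 * Applies $callback to the current value of $property while holding an
 * exclusive lock for the whole read-modify-write cycle.
 * Hypothetical helper, not part of the Synchro class above.
 */
function synchroUpdate($file, $property, callable $callback)
{
    if (($fd = fopen($file, 'c+')) === false) {
        throw new Exception(sprintf("Can't open '%s' file.", $file));
    }
    if (flock($fd, LOCK_EX) === false) {
        fclose($fd);
        throw new Exception(sprintf("Can't lock '%s' file for writing.", $file));
    }

    // Read the current shared object (an empty file gives an empty object)
    $object = json_decode(stream_get_contents($fd));
    if (!is_object($object)) {
        $object = new stdClass();
    }

    // Compute the new value from the old one (null if not set yet)
    $old = isset($object->{$property}) ? $object->{$property} : null;
    $object->{$property} = $callback($old);

    // Rewrite the whole file before releasing the lock
    rewind($fd);
    ftruncate($fd, 0);
    fwrite($fd, json_encode($object));
    fflush($fd);
    flock($fd, LOCK_UN);
    fclose($fd);

    return $object->{$property};
}

// Usage: increment a shared counter three times
$file = tempnam(sys_get_temp_dir(), 'sync');
for ($i = 0; $i < 3; $i++) {
    synchroUpdate($file, 'counter', function ($n) { return (int) $n + 1; });
}
echo file_get_contents($file), "\n"; // {"counter":3}
unlink($file);
```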

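Demonstration

We will continue (and finish) our processes/tasks example by making our processes communicate with each other.

Rules:

  • Our goal is to get the sum of all numbers between 1 and 20.
  • We have 20 processes, with IDs from 1 to 20.
  • The processes are queued for execution in random order.
  • Each process (except process 1) can do only one calculation: its ID + the previous process's result.
  • Process 1 simply stores its own ID.
  • Each process succeeds if it can do its calculation (that is, if the previous process's result is available); otherwise it fails and becomes a candidate for a new queue.
  • The pool's timeout expires after 10 seconds.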
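
Before adding processes, the rules can be checked with a single-process simulation. This is only a sketch: a plain array stands in for the shared object, and there is no pool, task manager or timeout, so it shows the retry logic rather than real concurrency.

```php
<?php

// Single-process simulation of the rules, without pool or task manager
$results = array();   // stands in for the shared Synchro object
$todo = range(1, 20);
$rounds = 0;

while (count($todo) > 0) {
    $rounds++;
    shuffle($todo);   // jobs are queued in random order
    $failed = array();
    foreach ($todo as $id) {
        if ($id == 1) {
            $results[1] = 1;                            // process 1 stores its own ID
        } elseif (isset($results[$id - 1])) {
            $results[$id] = $results[$id - 1] + $id;    // previous result + own ID
        } else {
            $failed[] = $id;                            // previous result missing: retry next round
        }
    }
    $todo = $failed;  // failed jobs form the new queue
}

echo "Result: {$results[20]} after {$rounds} round(s)\n";
```

The result is always 210 (20 × 21 / 2); only the number of rounds changes with the shuffle. In each round the smallest remaining ID can always be computed, so the loop terminates. With real processes a job can also fail because its predecessor is still running, which is why the launcher keeps re-queuing failures.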

Well, it looks complicated, but it is actually a good representation of what you will find in real life.

The synchro_launcher.php file:

<?php

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");

// Removing old synchronized object
if (is_file("/tmp/synchro.txt"))
{
    unlink("/tmp/synchro.txt");
}

// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
$pool->create($max = "10");

// Initializing task manager
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->destroy();

// Simulating jobs
$todo_list = array ();
for ($i = 1; ($i <= 20); $i++)
{
    $todo_list[$i] = $i;
    $multi->add($todo_list[$i], TasksManager::WAITING);
}

// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
    $continue = false;

    echo "Starting to run jobs in queue ...\n";

    // Shuffle all jobs (else this will be too easy :-))
    shuffle($todo_list);

    // put all failed jobs to WAITING status
    $multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);

    foreach ($todo_list as $job)
    {

        $ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");

        if ($ret)
        {
            echo "Executing job: $job\n";
            exec(sprintf("/usr/bin/php ./synchro_program.php %s > /dev/null &", escapeshellarg($job)));
        }
        else
        {
            echo "waitForResource timeout!\n";
            $pool->killAllResources();

            // All jobs currently running are considered dead, so, failed
            $multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);

            break;
        }
    }

    $ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
    if ($ret == false)
    {
        echo "waitForTheEnd timeout!\n";
        $pool->killAllResources();

        // All jobs currently running are considered dead, so, failed
        $multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
    }


    echo "All jobs in queue executed, looking for errors...\n";

    // Jobs still WAITING were never started: treat them as failed, then count the failures
    $multi->switchStatus(TasksManager::WAITING, TasksManager::FAILED);
    $nb_failed = $multi->countStatus(TasksManager::FAILED);
    if ($nb_failed > 0)
    {
        $todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
        echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
        $continue = true;
    }
}

function waitResource($multi)
{
    echo "Waiting for a resource ....\n";
}

function waitEnd($multi)
{
    echo "Waiting for the end .....\n";
}

// All jobs finished, destroying task manager
$multi->destroy();

// Destroying process pool
$pool->destroy();

// Recovering final result
$synchro = new Synchro("/tmp/synchro.txt");
echo sprintf("Result of the sum of all numbers between 1 and 20 included is: %d\n", $synchro->result20);

echo "Finish.\n";

And its associated child program, synchro_program.php (the file the launcher executes):

<?php

if (!isset($argv[1]))
{
   die("This program must be called with an identifier (calcul_label)\n");
}
$current_id = $argv[1];

require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");

// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);

// Takes one resource in pool
$pool->start();

// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->start($current_id);

// ------------------------------------------------------
// Job begins here

$synchro = new Synchro("/tmp/synchro.txt");

if ($current_id == 1)
{
   $synchro->result1 = 1;
   $status = TasksManager::SUCCESS;
}
else
{
   $previous_id = $current_id - 1;
   if (is_null($synchro->{"result{$previous_id}"}))
   {
       $status = TasksManager::FAILED;
   }
   else
   {
       $synchro->{"result{$current_id}"} = $synchro->{"result{$previous_id}"} + $current_id;
       $status = TasksManager::SUCCESS;
   }
}

// ------------------------------------------------------

// The job finishes and reports its status
$multi->finish($status);

// Releasing pool's resource
$pool->finish();

Output

This demo gives you output like this (too large for SO).


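Conclusion

Task management in PHP is not easy because of the lack of threads. Like many developers, I hope this feature will be built in someday. Still, it is possible to control resources and results, and to share data between processes, so with some work we can do task management efficiently.

Synchronization and communication can be done in several ways, but you need to weigh the pros and cons of each according to your constraints and requirements. For example: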

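  • If you need to launch 500 tasks at once and want to use the MySQL synchronization method, you will need 1 + 500 simultaneous database connections, which the server may not handle well.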

  • If you need to share a large amount of data, using only one file may be inefficient.

  • If you use files for synchronization, do not forget to look at the tools built into the system, such as /dev/sdram.

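  • Try to solve your problems with object-oriented code as much as possible. Home-made protocols and the like will make your application hard to maintain.

That was my 2 cents on this interesting topic; I hope it gives you some ideas to solve your problem.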

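I suggest you take a look at the library called PHP-Queue: https://github.com/CoderKungfu/php-queue

A short description from its GitHub page:

A unified front-end for different queuing backends. Includes a REST server, CLI interface and daemon runners.

Check out its GitHub page for more details.

With a bit of tinkering, I think this library will help you solve your problem.

Hope this helps.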


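Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you need to reprint them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.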
