Update global variable when worker fails (Python multiprocessing.pool ThreadPool)
I have a Python function that requests data via an API and involves a rotating, expiring key. The volume of requests necessitates some parallelization of the function. I am doing this with ThreadPool from the multiprocessing.pool module.
Example code:
import json
import requests
from multiprocessing.pool import ThreadPool
from tqdm import tqdm

# Function to get new authorization key.
def get_new_authorization():
    ...
    return auth_key

# Function calls API with passed headers for given URL and returns dict.
def api_call(headers_plus_result):
    headers, result = headers_plus_result
    r = requests.get(result["url"], headers=headers)
    return json.loads(r.text)

# Threading function with default num_threads.
def thread(worker, jobs, num_threads=5):
    pool = ThreadPool(num_threads)
    results = list()
    for result in tqdm(pool.imap_unordered(worker, jobs), total=len(jobs)):
        if result:
            results.append(result)
    pool.close()
    pool.join()
    if results:
        return results

# Input is a list-of-dicts results of a previous process.
results = [...]
# Process starts by retrieving an authorization key.
headers = {"authorization": get_new_authorization()}
# api_call() is called on each existing result with the retrieved key.
results = thread(api_call, [(headers, result) for result in results])
I am trying to modify my mapping process so that, when the first worker fails (i.e. the authorization key expires), all other workers are paused until a new authorization key is retrieved. Then, the workers proceed with the new key.
Should this be inserted into the actual thread() function? If I put an exception handler in the api_call function itself, I don't see how I can stop the pool manager or update the headers being passed to other workers.
Additionally: is using ThreadPool even the best method if I want this kind of flexibility?
A simpler possibility might be to use a multiprocessing.Event and a shared variable. The Event would indicate whether the authentication is currently valid, and the shared variable would hold the authentication key.
import multiprocessing as mp

event = mp.Event()
# 'c' (char) arrays expose .value as a bytes string; 100 = max length
sharedAuthentication = mp.Array('c', 100)
So a worker would run:
event.wait()
authentication = sharedAuthentication.value
Your main thread would initially set the authentication with
sharedAuthentication.value = ....
event.set()
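As a small check of how these two primitives behave together, the sketch below stores a key in a 'c'-typed Array (whose .value is a bytes string) and has one worker thread block on the Event until the key is published. The key b"key-1" and the names worker, seen and shared_auth are made up for illustration and are not part of the original code.

```python
import multiprocessing as mp
import threading

event = mp.Event()
shared_auth = mp.Array('c', 100)  # 'c' arrays expose .value as bytes

seen = []

def worker():
    # Blocks here until the main thread calls event.set().
    event.wait()
    seen.append(shared_auth.value.decode())

t = threading.Thread(target=worker)
t.start()

# Publish the (hypothetical) key, then release the waiting worker.
shared_auth.value = b"key-1"
event.set()
t.join()
print(seen)
```

Because the key is written before event.set() is called, the worker can only ever read a fully published value.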
and later modify the authentication with
event.clear()
# ... calculate new authentication
sharedAuthentication.value = .....
event.set()
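Putting the pieces together, here is a minimal end-to-end sketch of the same idea, with two changes that are my own assumptions rather than part of the answer above. It uses threading.Event and a plain dict instead of the multiprocessing versions, since ThreadPool workers are threads in one process, and it lets the failing worker fetch the new key itself (guarded by a lock) instead of the main thread. fake_api, refresh and state are hypothetical stand-ins; in real code fake_api would be the requests.get() call and the failure check would inspect the response for an expired key.

```python
import threading
from multiprocessing.pool import ThreadPool

auth_ready = threading.Event()   # set while the current key is believed valid
refresh_lock = threading.Lock()  # ensures only one thread fetches a new key
state = {"key": None, "issued": 0}

def get_new_authorization():
    # Stand-in for the real key request; returns "key-1", "key-2", ...
    state["issued"] += 1
    return f"key-{state['issued']}"

def fake_api(key, job):
    # Stand-in for requests.get(): pretend "key-1" has already expired,
    # so any call made with it fails and has to be retried.
    if key == "key-1":
        return None
    return {"job": job, "key": key}

def refresh(bad_key):
    with refresh_lock:
        # Skip if another thread already replaced the key that failed.
        if state["key"] == bad_key:
            auth_ready.clear()   # pause workers that have not started a call
            state["key"] = get_new_authorization()
            auth_ready.set()     # let them resume with the new key

def api_call(job):
    while True:
        auth_ready.wait()        # blocks while a refresh is in progress
        key = state["key"]
        result = fake_api(key, job)
        if result is not None:
            return result
        refresh(key)             # key rejected: refresh once, then retry

state["key"] = get_new_authorization()
auth_ready.set()

pool = ThreadPool(5)
results = list(pool.imap_unordered(api_call, range(20)))
pool.close()
pool.join()
print(state["issued"], len(results))
```

The comparison against bad_key inside the lock is what keeps several workers that failed at the same moment from each requesting a fresh key. Requests already in flight with the old key are not interrupted; they simply fail, see that the key has changed, and retry.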