
Python: running multiple web requests in parallel

I'm new to Python and I have a basic question, but I'm struggling to find an answer online because a lot of the examples seem to refer to deprecated APIs, so sorry if this has been asked before.

I'm looking for a way to execute multiple (similar) web requests in parallel and retrieve the results in a list.

The synchronous version I have right now is something like:

from requests import get  # pip install requests

urls = ['http://example1.org', 'http://example2.org', '...']

def getResult(urls):
    result = []
    for url in urls:
        result.append(get(url).json())
    return result

I'm looking for the asynchronous equivalent, where all the requests are made in parallel, but I then wait for all of them to finish before returning the combined result.

From what I've seen I would have to use async/await and aiohttp, but the examples seemed way too complicated for the simple task I have in mind.

Thanks

I'll try to explain the simplest possible way to achieve what you want. I'm sure there are cleaner/better ways to do this, but here it goes.

You can do what you want using Python's threading library: create a separate thread for each request, run all the threads concurrently, and collect the answers.

Since you are new to Python, to simplify things further I'm using a global list called RESULTS to store the results of get(url), rather than returning them from the function.

import threading
from requests import get  # pip install requests

RESULTS = []  # List to store the results

# Request a single URL's result and store it in the global RESULTS
def getSingleResult(url):
    RESULTS.append((url, get(url).json()))

# Your original function
def getResult(urls):
    ths = []
    for url in urls:
        th = threading.Thread(target=getSingleResult, args=(url,))  # Create a thread
        th.start()  # Start it
        ths.append(th)  # Add it to a thread list

    for th in ths:
        th.join()  # Wait for all threads to finish

The global RESULTS list is just there to make things easier, rather than collecting results from the threads directly. If you do wish to collect them directly, you can check out this answer: How to get the return value from a thread in python?
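As a middle ground, the standard library's concurrent.futures module lets you collect return values from a thread pool without a global list. This is a minimal sketch of that approach; the fetch function here is a placeholder standing in for get(url).json(), so replace its body with a real request:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder for get(url).json() -- swap in a real request here.
    return {"url": url}

def get_results(urls):
    # map() runs fetch on the pool's worker threads and
    # returns the results in the same order as the input urls.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(fetch, urls))

results = get_results(['http://example1.org', 'http://example2.org'])
```

Unlike appending from each thread, pool.map preserves the input order, so results[0] always corresponds to the first URL.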

Of course, one thing to note is that multi-threading in Python doesn't provide true parallelism but rather concurrency, especially on the standard Python implementation, due to what is known as the Global Interpreter Lock.

However, for your use case (I/O-bound web requests, where threads spend most of their time waiting on the network) it would still give you the speed-up you need.
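Since you mentioned async/await: the asyncio pattern for this is also quite short once stripped down. Here is a minimal sketch where fetch is a stand-in coroutine; with aiohttp installed you would replace its body with something like `async with session.get(url) as resp: return await resp.json()`:

```python
import asyncio

async def fetch(url):
    # Stand-in for an aiohttp request; replace the body with a real
    # session.get(url) call when using aiohttp.
    await asyncio.sleep(0)
    return {"url": url}

async def get_results(urls):
    # gather() schedules all the coroutines concurrently and
    # returns their results in the original order.
    return await asyncio.gather(*(fetch(u) for u in urls))

results = asyncio.run(get_results(['http://example1.org', 'http://example2.org']))
```

asyncio.gather is the direct equivalent of "start everything, then wait for all of it": it only returns once every coroutine has finished.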
