How to continuously pull data from a URL in Python?
I have a link, e.g. www.someurl.com/api/getdata?password=... , and when I open it in a web browser it sends a constantly updating document of text. I'd like to make an identical connection in Python and dump this data to a file live as it's received. I've tried using requests.Session(), but since the stream of data never ends (and dropping it would lose data), the get request also never ends.
import requests

s = requests.Session()
x = s.get("https://www.someurl.com/api/getdata?password=...")  # never terminates
What's the proper way to do this?
I found the answer I was looking for here: Python Requests Stream Data from API

Full implementation:
import requests

url = "https://www.someurl.com/api/getdata?password=..."
s = requests.Session()

with open('file.txt', 'a') as fp:
    with s.get(url, stream=True) as resp:
        for line in resp.iter_lines(chunk_size=1):
            fp.write(line.decode() + '\n')
Note that chunk_size=1 is necessary for the data to immediately respond to new complete messages, rather than waiting for an internal buffer to fill before iterating over all the lines. I believe chunk_size=None is meant to do this, but it doesn't work for me.
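One detail worth noting: resp.iter_lines() yields bytes objects with the trailing newline stripped, so each line should be decoded and have its newline restored before being written to the file. A minimal sketch of that step (format_line is a hypothetical helper name, not part of requests):

```python
def format_line(raw: bytes) -> str:
    # iter_lines() strips the trailing newline and yields bytes;
    # decode and re-append the newline so the file mirrors the stream
    return raw.decode("utf-8", errors="replace") + "\n"

# inside the streaming loop this would be: fp.write(format_line(line))
```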
You can keep making get requests to the url:
import requests
import time

url = "https://www.someurl.com/api/getdata?password=..."
sess = requests.session()

while True:
    req = sess.get(url)
    time.sleep(10)
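The poll-and-sleep loop above can be factored so that the fetch step is injectable, which also makes it easy to cap the number of polls; poll, fetch, and handle are names I've made up for this sketch, not part of requests:

```python
import time

def poll(fetch, handle, interval=10.0, max_polls=None):
    # Call fetch() repeatedly, passing each result to handle(),
    # sleeping `interval` seconds between polls.
    # max_polls=None keeps polling forever.
    count = 0
    while max_polls is None or count < max_polls:
        handle(fetch())
        count += 1
        if max_polls is None or count < max_polls:
            time.sleep(interval)

# with requests this might be used as:
# import requests
# sess = requests.Session()
# poll(lambda: sess.get(url).text, print, interval=10)
```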
This will terminate the request after 1 second:
import multiprocessing
import time
import requests

def get_from_url(q):
    s = requests.Session()
    resp = s.get("https://www.someurl.com/api/getdata?password=...")
    # a plain global would not be visible to the parent process,
    # so hand the data back through a queue
    q.put(resp.text)

if __name__ == '__main__':
    while True:
        q = multiprocessing.Queue()
        p = multiprocessing.Process(target=get_from_url, name="get_from_url", args=(q,))
        p.start()
        # Wait 1 second for get request
        time.sleep(1)
        p.terminate()
        p.join()
        # do something with the data
        if not q.empty():
            print(q.get())  # or smth else
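Worth noting: each multiprocessing.Process runs with its own copy of module-level variables, so a plain global cannot carry the response back to the parent, while a multiprocessing.Queue can. A minimal standalone sketch of that hand-off (worker is a hypothetical name):

```python
import multiprocessing

def worker(q):
    # each Process gets its own copy of module globals, so
    # results must travel back through an explicit channel
    q.put("payload from child")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    p.join()
    print(q.get())  # receives the child's result
```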