
How do I multithread an API query in Python and store the results in a csv file?

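I am new to Python and am writing a simple query to get information from the USPS API and store the results in a .csv file for later use. I can query the API successfully, but I want to scale this up to roughly 2.2 million queries. Running them in a for loop would take weeks, so I looked into multithreading as a way to run the requests in parallel. I have 3 questions: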

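  1. When I run more than 15 threads, I get a connection error. The error is similar to the one in this question, but since my smaller queries work, I think the server must be throttling me.

  2. How can I keep the dictionary's key values instead of having them changed to "0, 1, 2, ..."?

  3. If I can only run the queries in small batches at a time, is it possible to keep a master file that is continuously appended to while the for loop runs? I know Python has this kind of structure for adding to dictionaries.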

Below is a minimal example of my code (the dataset has to be large, because the errors only occur when I process a lot of data):

from xml.etree import ElementTree as ET
from threading import Thread
import numpy as np
import pandas as pd
import requests
import csv

# API Information
usps_uname = '536UNIVE4362'
usps_pw    = '462YK79VT194'
url = 'http://production.shippingapis.com/ShippingAPITest.dll?'
req = "StandardB"

# Single API query
def delivery(origin, destination):
  query = url + 'API=' + str(req) + '&XML=%3C' + str(req) + \
  'Request%20USERID=%22' + str(usps_uname) + '%22%3E' + \
  '%3COriginZip%3E' + "%05d" % origin + '%3C/OriginZip%3E' + \
  '%3CDestinationZip%3E' + "%05d" % destination + '%3C/DestinationZip%3E' + \
  '%3C/' + str(req) + 'Request%3E'
  data = requests.get(query, auth = (usps_uname, usps_pw))
  root = ET.fromstring(data.content)
  if root[2].text == "No Data":
    DeliveryTime = 99
  else:
    DeliveryTime = int(root[2].text)
  return DeliveryTime

# Returns delivery times from all origins to specified destination
def delivery_range(origin_range, destination, thread_index, store=None):
  store[thread_index] = [0] * len(origin_range)
  for i, x in enumerate(origin_range):
    store[thread_index][i] = delivery(x, destination)
  return store

# Threading attempt
def threaded_process(nthreads, origin_range):
  store = {}
  threads = []
  for i in range(nthreads):
    ids = origin_range.values()[i]
    destination = origin_range.keys()[i]
    t = Thread(target=delivery_range, args=(ids, destination, i, store))
    threads.append(t)
  [ t.start() for t in threads ]
  [ t.join() for t in threads ]
  return store

origin_range = {
2072: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
3063: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
6095: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
7001: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
8085: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
8691: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
15205: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17013: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17015: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17339: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
18031: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
18202: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
19709: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
19720: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
21224: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
23803: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
23836: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
28027: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
29172: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
29303: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
30344: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
}

ans = threaded_process(len(origin_range), origin_range)

writer = csv.writer(open('DeliveryTimes.csv', 'wb'))
for key, value in ans.items():
   writer.writerow([key, value])

If you do not get an error the first time, run the code again and it should fail.

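You could set up your Python script to take the origin and destination as input through an argument list (for example, using the argparse module), and then use GNU Parallel ( http://www.gnu.org/software/parallel/ ) to call the script with every possible combination passed in as arguments.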
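As a rough sketch of what such a script could look like in Python 3: it reads one origin and one destination ZIP code with argparse and prints a single CSV line. The names `parse_args`, `format_row`, and `main` are hypothetical, and the `lookup` argument stands in for the `delivery(origin, destination)` function from the question, so no network call is made here.

```python
import argparse


def parse_args(argv=None):
    """Parse one origin/destination ZIP code pair from the command line."""
    parser = argparse.ArgumentParser(
        description="Look up the delivery time for one ZIP code pair.")
    parser.add_argument("origin", type=int, help="origin ZIP code")
    parser.add_argument("destination", type=int, help="destination ZIP code")
    return parser.parse_args(argv)


def format_row(origin, destination, days):
    """Build one CSV line; ZIP codes are zero-padded to five digits."""
    return "%05d,%05d,%d" % (origin, destination, days)


def main(argv=None, lookup=None):
    """Run one lookup and print its CSV line.

    `lookup` is an assumption of this sketch: any callable taking
    (origin, destination) and returning a number of days, such as the
    question's delivery() function.
    """
    args = parse_args(argv)
    if lookup is None:
        raise SystemExit("no lookup function supplied")
    row = format_row(args.origin, args.destination,
                     lookup(args.origin, args.destination))
    print(row)
    return row

# In a real script, main(lookup=delivery) would be called under
# an `if __name__ == "__main__":` guard.
```

Printing one line per call keeps each invocation independent, so the output of many invocations can simply be collected into one file.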

GNU Parallel will parallelize the code at the OS level, so you do not have to worry about doing the parallelization in Python.
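If the work stays inside a single Python process instead, questions 2 and 3 can be handled roughly as follows. This is a Python 3 sketch under stated assumptions, not a tested fix for the USPS connection error: `run_batch` and `append_rows` are hypothetical names, `lookup` stands in for the question's `delivery` function, and the default of 10 workers is only a guess that stays below the 15 threads where the errors appeared, not a documented server limit.

```python
import csv
from concurrent.futures import ThreadPoolExecutor


def run_batch(origin_range, lookup, max_workers=10):
    """Return {destination: [delivery time for each origin]}.

    Results are keyed by the destination itself rather than by a
    thread index, so the original dictionary keys are preserved.
    """
    def times_for(destination):
        return [lookup(origin, destination)
                for origin in origin_range[destination]]

    destinations = list(origin_range)
    # The pool caps how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as its input.
        results = pool.map(times_for, destinations)
        return dict(zip(destinations, results))


def append_rows(path, store):
    """Append one row per destination to the master CSV file."""
    # Mode "a" adds to the end of the file instead of overwriting it.
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for destination, times in store.items():
            writer.writerow([destination] + list(times))
```

Calling `append_rows` after each small batch keeps one master file growing across the outer for loop, so a failed batch does not lose the results already written.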
