Implement multithreading (or multiprocessing?) with this script?
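First off, I have no real experience with multithreading; this is my first attempt at it. The script I wrote reads about 4,400 addresses from a text file, then cleans each address and geocodes it. My brother mentioned using multithreading to speed it up. I read online that multithreading doesn't make much difference if you are only working with one text file. Would it work if I split the single text file into 2 text files? Either way, I would really appreciate it if someone could show me how to implement multithreading or multiprocessing in this script to improve its speed. If that isn't possible, could you tell me why? Thanks!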
from geopy.geocoders import Bing
from geopy.exc import GeocoderTimedOut
geolocator = Bing('vadrPcGdNLSX5bPNL7tw~ySbwhthllg7rNA4VSJ-O4g~Ag28cbu9Slxp5Sh_AsBDuQ9WypPuEhl9pHVPCAkiPf4A9FgCBf3l0KyQTKKsLCHw')
import tkinter as tk
from tkinter import filedialog
root = tk.Tk()
root.withdraw()

def cleanAddress(dirty):
    try:
        clean = geolocator.geocode(dirty)
        x = clean.address
        address, city, zipcode, country = x.split(",")
        address = address.lower()
        if 'first' in address:
            address = address.replace('first', '1st')
        elif 'second' in address:
            address = address.replace('second', '2nd')
        elif 'third' in address:
            address = address.replace('third', '3rd')
        elif 'fourth' in address:
            address = address.replace('fourth', '4th')
        elif 'fifth' in address:
            address = address.replace('fifth', '5th')
        elif 'sixth' in address:
            # Strip 'avenue' before 'ave'; the other order leaves a stray 'nue'.
            address = address.replace('avenue', '')
            address = address.replace('ave', '')
            address = address.replace('sixth', 'avenue of the americas')
        elif '6th' in address:
            address = address.replace('avenue', '')
            address = address.replace('ave', '')
            address = address.replace('6th', 'avenue of the americas')
        elif 'seventh' in address:
            address = address.replace('seventh', '7th')
        elif 'fashion' in address:
            address = address.replace('fashion', '7th')
        elif 'eighth' in address:
            address = address.replace('eighth', '8th')
        elif 'ninth' in address:
            address = address.replace('ninth', '9th')
        elif 'tenth' in address:
            address = address.replace('tenth', '10th')
        elif 'eleventh' in address:
            address = address.replace('eleventh', '11th')
        zipcode = zipcode[3:]
        print(address + ",", zipcode.lstrip() + ",", str(clean.latitude) + ",", str(clean.longitude))
    except AttributeError:
        print('Can not be cleaned')
    except ValueError:
        print('Can not be cleaned')
    except GeocoderTimedOut as e:
        print('Can not be cleaned')

def main():
    root.update()
    fpath = filedialog.askopenfilename()
    f = open(fpath)
    for line in f:
        dirty = line + " nyc"
        cleanAddress(dirty)
    f.close()

if __name__ == '__main__':
    main()
The short answer is: no, you can't.
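The Python multiprocessing library lets you spread computation across several processes, which reduces the time needed to get all of it done. It can speed up the overall run of a script, but only when the script has a lot of CPU-bound work to do.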
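For contrast, here is a rough sketch of the kind of job where multiprocessing does tend to help: pure computation with no waiting on a network. The prime-counting task and the names `count_primes` and `count_primes_parallel` are made up for illustration and have nothing to do with the geocoding script.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=2):
    """Hand each limit to a worker process and collect the results in order."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # Each limit is an independent chunk of computation, so the
    # worker processes can run on separate CPU cores.
    print(count_primes_parallel([20_000, 30_000, 40_000, 50_000]))
```

Whether this actually runs faster than a plain loop depends on how many cores are available and how large each chunk is; for very small inputs the cost of starting the processes can outweigh the gain.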
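In your example, most of the time is spent connecting to the web service that does the geolocation work for you, so the total execution time depends on the speed of your Internet connection (or the service's), not on the overall speed of your computer.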
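One way to check this on your own machine is to compare wall-clock time with CPU time. The sketch below does not call Bing at all: `fake_geocode` is a hypothetical stand-in that just sleeps for an assumed 50 ms to imitate a network round trip, and `clean` is a simplified version of the string cleaning in the question.

```python
import time


def fake_geocode(address, delay=0.05):
    """Stand-in for geolocator.geocode(): sleeps to imitate a network round trip."""
    time.sleep(delay)  # assumed latency; a real request may be slower or faster
    return address.strip().lower() + ", new york, ny 10001, united states"


def clean(raw_result):
    """A small piece of CPU work, loosely modelled on the cleaning in the question."""
    street = raw_result.split(",")[0]
    return street.replace("first", "1st").replace("second", "2nd")


def measure(addresses):
    """Return (wall_seconds, cpu_seconds, results) for processing each address in turn."""
    wall_start = time.perf_counter()   # elapsed real time, including waiting
    cpu_start = time.process_time()    # CPU time only; time spent sleeping is excluded
    results = [clean(fake_geocode(a)) for a in addresses]
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu, results


if __name__ == "__main__":
    wall, cpu, _ = measure(["350 Fifth Ave", "1 First Ave"] * 5)
    print(f"wall-clock: {wall:.3f}s, CPU: {cpu:.3f}s")
```

If the CPU figure is a small fraction of the wall-clock figure, the script is spending nearly all of its time waiting rather than computing. To try this on the real script, the same two timers could be wrapped around the loop in `main()`.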