
Implement multithreading (or multiprocessing?) with this script?

Let me start by saying I don't have any real experience with multithreading. This script that I wrote reads ~4,400 addresses from a text file, then cleans and geocodes each address. My brother mentioned using multithreading to improve its speed. I read online that multithreading doesn't make much of a difference if you're just using a single text file. Would it work if I split the single text file into 2 text files? Anyway, I'd really appreciate it if someone could show me how to implement multithreading or multiprocessing in this script to increase the speed. If it's not possible, could you tell me why? Thanks!

from geopy.geocoders import Bing
from geopy.exc import GeocoderTimedOut
geolocator = Bing('vadrPcGdNLSX5bPNL7tw~ySbwhthllg7rNA4VSJ-O4g~Ag28cbu9Slxp5Sh_AsBDuQ9WypPuEhl9pHVPCAkiPf4A9FgCBf3l0KyQTKKsLCHw')
import tkinter as tk
from tkinter import filedialog

root = tk.Tk()
root.withdraw()


def cleanAddress(dirty):
    try:
        clean = geolocator.geocode(dirty)
        x = clean.address
        address, city, zipcode, country = x.split(",")
        address = address.lower()
        if 'first' in address:
            address = address.replace('first', '1st')
        elif 'second' in address:
            address = address.replace('second', '2nd')
        elif 'third' in address:
            address = address.replace('third', '3rd')
        elif 'fourth' in address:
            address = address.replace('fourth', '4th')
        elif 'fifth' in address:
            address = address.replace('fifth', '5th')
        elif 'sixth' in address:
            # Remove 'avenue' before 'ave'; the other order leaves a stray 'nue'.
            address = address.replace('avenue', '')
            address = address.replace('ave', '')
            address = address.replace('sixth', 'avenue of the americas')
        elif '6th' in address:
            address = address.replace('avenue', '')
            address = address.replace('ave', '')
            address = address.replace('6th', 'avenue of the americas')
        elif 'seventh' in address:
            address = address.replace('seventh', '7th')
        elif 'fashion' in address:
            address = address.replace('fashion', '7th')
        elif 'eighth' in address:
            address = address.replace('eighth', '8th')
        elif 'ninth' in address:
            address = address.replace('ninth', '9th')
        elif 'tenth' in address:
            address = address.replace('tenth', '10th')
        elif 'eleventh' in address:
            address = address.replace('eleventh', '11th')
        zipcode = zipcode[3:]
        print(address + ",", zipcode.lstrip() + ",", str(clean.latitude) + ",", str(clean.longitude))
    except (AttributeError, ValueError, GeocoderTimedOut):
        # No match, unexpected address format, or the service timed out.
        print('Can not be cleaned')


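def main():
    root.update()
    fpath = filedialog.askopenfilename()
    # Close the file automatically, even if a lookup raises.
    with open(fpath) as f:
        for line in f:
            # Strip the trailing newline before appending the city hint.
            dirty = line.strip() + " nyc"
            cleanAddress(dirty)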

if __name__ == '__main__':
    main()

Short answer: no, you cannot.

Python's multiprocessing library lets you reduce the time needed for a set of calculations by distributing them over several processes. It can speed up the whole run of your script, but only when there is a lot of work for the CPU to do.

In your example, most of the time is spent waiting on the web service that does the geocoding for you, so total execution time depends on the speed of your (or the service's) internet connection rather than on your computer.
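If you want to check where the time goes in your own run, one option is to time the lookup and the string cleanup separately. The following is only a minimal sketch: `fake_geocode` is a hypothetical stand-in that sleeps instead of calling `geolocator.geocode`, and `clean_text` and `time_phases` are illustrative names that are not part of your script. To measure the real thing, you would swap `fake_geocode` for your actual geocoder call.

```python
import time


def fake_geocode(query, delay=0.05):
    """Hypothetical stand-in for geolocator.geocode: sleeps to mimic network latency."""
    time.sleep(delay)
    return "350 Fifth Ave, New York, NY 10118, United States"


def clean_text(address):
    """Simplified stand-in for the string cleanup, which is CPU-only work."""
    return address.lower().replace("fifth", "5th")


def time_phases(queries):
    """Return (seconds spent waiting on the lookup, seconds spent cleaning)."""
    wait = 0.0
    cpu = 0.0
    for query in queries:
        t0 = time.perf_counter()
        result = fake_geocode(query)
        t1 = time.perf_counter()
        clean_text(result)
        t2 = time.perf_counter()
        wait += t1 - t0
        cpu += t2 - t1
    return wait, cpu


if __name__ == "__main__":
    wait, cpu = time_phases(["350 fifth ave nyc"] * 5)
    print(f"waiting on lookup: {wait:.3f}s, cleaning: {cpu:.6f}s")
```

If the first number dominates for the real geocoder as well, the script is spending its time waiting on the network, not on the CPU.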

