Speeding up Python code execution time

Question

import csv
import math
import time

start = time.time()
f = open('Speed_Test.csv','r+')
coordReader = csv.reader(f, delimiter = ',')
count = -1
successful_trip = 0
trips = 0

for line in coordReader:
    successful_single = 0
    count += 1
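    # Note: interval is assigned further down (by main()), so it must already
    # exist before the loop, and R here reflects the previous trip's interval.
    R = interval*0.30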
    if count == 0:
        continue
    if (26 < float(line[0]) < 48.7537144 and 26 < float(line[2]) < 48.7537144
            and -124.6521017 < float(line[1]) < -68 and -124.6521017 < float(line[3]) < -68):
        y2,x2,y1,x1 = convertCoordinates(float(line[0]),float(line[1]),float(line[2]),float(line[3]))
        coords_line,interval = main(y1,x1,y2,x2)

        for item in coords_line:
            loop_count = 0
            r = 0
            min_dist = 10000

            for i in range(len(df)):
                dist = math.sqrt((item[1]-df.iloc[i,0])**2 + (item[0]-df.iloc[i,1])**2)
                if dist < R:
                    loop_count += 1
                    if dist < min_dist:
                        min_dist = dist
                        r = i
            if loop_count != 0:
                successful_single += 1
                df.iloc[r,2] += 1

        trips += 1
        if successful_single == (len(coords_line)):
            successful_trip += 1

end = time.time()
print('Percent Successful:',successful_trip/trips)
print((end - start))

I have this code, and explaining it in full would be extremely time-consuming, but it doesn't run fast enough for the amount of data I'd like to process. Is there anything anyone sees right off the bat that I could do to speed it up? Any suggestions would be greatly appreciated.

In essence, it reads in two latitude/longitude coordinate pairs, converts them to Cartesian coordinates, and then steps through every coordinate along the path from the origin to the destination, at interval lengths that depend on the distance. While doing this, it checks each point along the trip against a data frame (df) of 300+ coordinate locations to see whether any location is within radius R, and then records the closest one.

Answer 1

Take advantage of any opportunity to break out of a for loop once the result is known. For example, at the end of the for line loop you check whether successful_single == len(coords_line). But that can never be true once the statement if loop_count != 0 has been False for any item, because successful_single is not incremented for that item, so its value will never reach len(coords_line). You could therefore break out of the for item loop right there: you already know it's not a "successful_trip". There may be other situations like this.
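A minimal sketch of that early exit, not a drop-in replacement for the code in the question. It assumes the df locations are available as a plain list of (x, y) tuples and that each item is an (x, y) point; check_trip and stations are hypothetical names used only for illustration. Note that stopping early also skips the nearest-location updates for the remaining points, so the counts kept in df would change if those matter.

```python
import math


def check_trip(coords_line, stations, R):
    """Return (successful, nearest_indices) for one trip.

    stations is a list of (x, y) tuples standing in for the rows of df.
    nearest_indices holds, for each point checked, the index of the
    closest station within radius R.
    """
    nearest_indices = []
    for item in coords_line:
        min_dist = float("inf")
        r = None
        for i, (sx, sy) in enumerate(stations):
            dist = math.hypot(item[0] - sx, item[1] - sy)
            if dist < R and dist < min_dist:
                min_dist = dist
                r = i
        if r is None:
            # No station within R of this point, so the trip can no longer
            # be successful: stop checking the remaining points.
            return False, nearest_indices
        nearest_indices.append(r)
    return True, nearest_indices
```

With stations = [(0, 0), (10, 0)] and R = 2, a trip through (1, 0), (5, 0), (9, 0) stops at the second point and never evaluates the third.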

Answer 2

Have you considered using a process pool to run these calculations in parallel?

https://docs.python.org/2/library/multiprocessing.html

Your code also suggests that the variables R and interval might create a dependency between iterations (R is computed from interval before main() assigns a new value for the current trip), which would require a sequential solution.
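A rough sketch of the pooled approach, under the assumption that each trip's radius can be derived from that trip's own interval, so that trips are independent of one another. STATIONS, check_trip and success_rate are hypothetical names, and the hard-coded coordinates are made-up sample data. Worker processes do not share df, so the per-location counters from the question would have to be returned by each worker and summed afterwards; that part is left out here.

```python
import math
from multiprocessing import Pool

# Hypothetical stand-in for the rows of df: (x, y) location coordinates.
STATIONS = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]


def check_trip(trip):
    """Return True if every point on the trip has a location within its radius.

    trip is a (points, radius) tuple, so each task carries its own radius
    and does not depend on state left over from a previous trip.
    """
    points, radius = trip
    for px, py in points:
        if not any(math.hypot(px - sx, py - sy) < radius for sx, sy in STATIONS):
            return False
    return True


def success_rate(trips, processes=2):
    """Check the trips in a pool of worker processes; return the fraction that succeed."""
    with Pool(processes=processes) as pool:
        results = pool.map(check_trip, trips)
    return sum(results) / len(results)


if __name__ == "__main__":
    trips = [
        ([(0.5, 0.0), (4.5, 0.0)], 1.0),  # both points are near a location
        ([(0.5, 0.0), (2.5, 0.0)], 1.0),  # second point is too far from all
    ]
    print("Percent Successful:", success_rate(trips))
```

With only a few hundred locations per check, the cost of starting processes and passing data between them may outweigh the gain for small inputs, so it is worth timing both versions.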

Answer 1
0 2017-03-29 01:44:24

Answer 2
0 2017-03-29 02:15:44
