
How can I make the following code faster?

I am trying to optimize code for a challenge on Codewars, but I am having trouble making it fast enough. The kata in question is this one. The description is explained in detail at the link I have given, and I do not reproduce it here so as not to bloat the question. My attempted code is the following (in Python):

# first I define the manhattan distance between any two given points
def manhattan(p1, p2):
    return abs(p1[0]-p2[0]) + abs(p1[1]-p2[1])

# I write a function that takes the minimum of the list of the distances from a given point to all agents

def distance(p, agents):
    return min(manhattan(p, i) for i in agents)

# and the main function

def advice(agents, n):
    # there could be agents outside the range of the grid, so I redefine agents,
    # cropping the ones that may be outside
    agents = [i for i in agents if i[0] < n and i[1] < n]
    # in case there is no grid or there are agents everywhere
    if n == 0 or len(agents) == n*n:
        return []
    # if there is no agent, I output the whole matrix
    if agents == []:
        return [(i, j) for i in range(n) for j in range(n)]
    # THIS FOLLOWING PART IS THE PART THAT I THINK I NEED TO OPTIMIZE
    # I build a dictionary keyed by point coordinates whose value is the output of
    # distance(), then return the entries with the highest value. This ends the script.
    dct = dict(((i, j), distance([i, j], agents)) for i in range(n) for j in range(n))
    # highest distance to an agent
    mxm = max(dct.values())
    return [nn for nn, mm in dct.items() if mm == mxm]

The part I think I need to improve is marked in the code above. Any ideas on how to make this faster?

Your algorithm is doing a brute-force check of all positions against all agents. If you want to accelerate the process, you have to use a strategy that allows you to skip large parts of these n×n×a combinations.

For example, you could group the points by distance in a dictionary, starting with all the points at the largest possible distance. Then go through the agents and redistribute only the farthest points to closer distances. Repeat this process until the farthest distance retains at least one point after going through all agents.

This would not eliminate all the distance checks, but it skips a lot of distance computations for points that are already known to be closer to an agent than the farthest ones.

Here's an example:

def myAdvice2(agents,n):
    maxDist    = 2*n
    distPoints = { maxDist:[ (x,y) for x in range(n) for y in range(n)] }
    index      = 0
    retained   = 0
    lastMax    = None
    # keep refining until farthest distance is kept for all agents
    while retained < len(agents):
        ax,ay = agents[index]
        index = (index + 1) % len(agents)

        # redistribute points at farthest distance for this agent         
        points  = distPoints.pop(maxDist)
        for x,y in points:
            distance = min(maxDist,abs(x-ax)+abs(y-ay))
            if distance not in distPoints:
               distPoints[distance] = []                   
            distPoints[distance].append((x,y))
        maxDist = max(distPoints) 

        # count agents applied at the current maxDist
        retained += 1
        # reset the count when maxDist is reduced
        if lastMax and maxDist < lastMax : retained = 0
        lastMax   = maxDist

    # once all agents have been applied to largest distance, we have a solution
    return distPoints[maxDist]

In my performance tests this is about 3x faster and performs better on larger grids.

[EDIT] This algorithm can be further accelerated by sorting the agents by their distance to the center. This ensures that points at the farthest distances are eliminated faster, because agents at central positions cover more ground.

You can add this sort at the beginning of the function:

agents = sorted(agents, key=lambda p: abs(p[0]-n//2)+abs(p[1]-n//2))

This yields a 10% to 15% speed improvement depending on the number of agents, their positions, and the size of the area. Further optimization could determine, based on the data, when this sort is worthwhile.

[EDIT2] If you're going to use a brute-force approach, leveraging parallel processing (using the GPU) could give surprisingly fast results.

This example, using numpy, is 3 to 10 times faster than my original approach. I'm guessing this will only hold up to a point (having more agents slows it down considerably), but it was much faster in all the small-scale tests I did.

import numpy as np
def myAdvice4(agents,n):
    # one (a,1,1)-shaped array per coordinate so distances broadcast over the grid
    ax,ay     = [np.array(c)[:,None,None] for c in zip(*agents)]
    # px[i,j] = j and py[i,j] = i : coordinates of every cell of the n x n grid
    px        = np.repeat(np.arange(n)[None,:],n,axis=0)
    py        = np.repeat(np.arange(n)[:,None],n,axis=1)
    # Manhattan distance from every agent to every cell, shape (a,n,n)
    agentDist = abs(ax - px) + abs(ay - py)
    # distance from each cell to its nearest agent
    minDist   = np.min(agentDist,axis=0)
    # keep the cells whose nearest agent is as far away as possible
    maxDist   = np.max(minDist)
    maxCoord  = zip(*np.where(minDist == maxDist))
    return list(maxCoord)

You can use sklearn.metrics.pairwise_distances with the manhattan metric.
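A minimal sketch of that suggestion, assuming scikit-learn and numpy are installed; the function name `advice_sklearn` and its edge-case handling (mirroring the question's `advice`) are my own choices, not part of the kata:

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def advice_sklearn(agents, n):
    # drop agents that fall outside the grid
    agents = [a for a in agents if 0 <= a[0] < n and 0 <= a[1] < n]
    # no grid, or agents everywhere
    if n == 0 or len(agents) == n * n:
        return []
    # all grid coordinates as an (n*n, 2) array
    points = np.array([(i, j) for i in range(n) for j in range(n)])
    if not agents:
        return [tuple(p) for p in points]
    # (n*n, a) matrix of Manhattan distances from every point to every agent
    dists = pairwise_distances(points, np.array(agents), metric="manhattan")
    # distance from each point to its nearest agent
    nearest = dists.min(axis=1)
    # keep the points whose nearest agent is as far away as possible
    return [tuple(p) for p in points[nearest == nearest.max()]]
```

This is still the same brute force as the original code, but the whole n²×a distance matrix is computed in vectorized C code rather than in a Python loop.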
