
Find Closest Value to Input in Pandas Dataframe

The first rows of my dataframe are shown below. The columns are longitude, latitude, and value. The dataframe has 30 million rows.

-179.979166666666657 89.9791666666666714 -3.39999995214436425e+38
-179.9375 89.9791666666666714 -3.39999995214436425e+38
-179.895833333333343 89.9791666666666714 -3.39999995214436425e+38
-179.854166666666657 89.9791666666666714 -3.39999995214436425e+38
-179.8125 89.9791666666666714 -3.39999995214436425e+38
-179.770833333333343 89.9791666666666714 -3.39999995214436425e+38
-179.729166666666657 89.9791666666666714 -3.39999995214436425e+38
-179.6875 89.9791666666666714 -3.39999995214436425e+38
-179.645833333333343 89.9791666666666714 -3.39999995214436425e+38

I am trying to find the longitude/latitude point closest to a given input, and then print the value associated with that point. I tried converting the dataframe into an array and then searching for the minimum with this algorithm:

def match(lon, lat):
    # Linear scan over mintemparr, whose rows are [longitude, latitude, value]
    lon, lat = float(lon), float(lat)
    best = 10000
    minindex = -1
    for x in range(len(mintemparr)):
        # Manhattan distance between the query point and row x
        dist = abs(lon - float(mintemparr[x][0])) + abs(lat - float(mintemparr[x][1]))
        if dist < best:
            best = dist
            minindex = x
    return mintemparr[minindex][2]

However, this is very slow. Is there a more direct way to search for the closest value within pandas, rather than converting the dataframe into an array?

Thanks in advance.

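# Assumes the columns are named 'lon', 'lat' and 'value', in the order the data lists them
def find_closest(df, lon, lat):
    # Manhattan distance from every row to the query point, computed column-wise
    dist = (df['lon'] - lon).abs() + (df['lat'] - lat).abs()
    return df.loc[dist.idxmin()]

>>> find_closest(df, -179, 90)
lon     -1.796458e+02
lat      8.997917e+01
value   -3.400000e+38
Name: 8, dtype: float64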
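To check the idea without the full 30-million-row file, here is a minimal sketch that rebuilds only the nine rows shown in the question and runs the same column-wise search on them. The column names lon, lat and value and the variable name sample are assumptions for illustration, not something the question specifies.

```python
import pandas as pd

# The nine longitudes from the question; latitude and value are constant in that excerpt.
lons = [
    -179.979166666666657, -179.9375, -179.895833333333343,
    -179.854166666666657, -179.8125, -179.770833333333343,
    -179.729166666666657, -179.6875, -179.645833333333343,
]
# Assumed column names: 'lon', 'lat', 'value'.
sample = pd.DataFrame({
    "lon": lons,
    "lat": 89.9791666666666714,
    "value": -3.39999995214436425e+38,
})

def find_closest(df, lon, lat):
    # Manhattan distance in degrees from every row to the query point.
    dist = (df["lon"] - lon).abs() + (df["lat"] - lat).abs()
    # idxmin returns the index label of the first smallest distance.
    return df.loc[dist.idxmin()]

closest = find_closest(sample, -179, 90)
print(closest)
```

Because the subtraction and abs() run over whole columns at once, there is no Python-level loop over rows, which is where the speed-up over the original match function comes from.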

You can do this with pandas. Compute the differences between each row and the given longitude and latitude, take the square root of the sum of the squared differences, and then select the row with the minimum. Assuming your data frame is called df, with columns longitude, latitude and value:

import numpy as np

lon = -179.979164
lat = 89.979162

# Euclidean distance (in degrees) from each row to the query point
df['sumofdiff'] = np.sqrt((df['longitude'] - lon) ** 2 + (df['latitude'] - lat) ** 2)

# Row(s) with the smallest distance
df[df.sumofdiff == df.sumofdiff.min()]


    longitude   latitude         value      diff  sumofdiff
0 -179.979167  89.979167 -3.400000e+38  0.000005   0.000005
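If you only need the single nearest row, a variant of the same idea avoids adding a column to the 30-million-row frame. This is a sketch rather than the answer's exact method: it skips the square root (which does not change which row is smallest) and uses np.argmin, which returns the first minimum instead of every tied row. The function name closest_row and the small demo frame are made up for illustration.

```python
import numpy as np
import pandas as pd

def closest_row(df, lon, lat):
    # Squared Euclidean distance in degrees; sqrt is monotonic, so the
    # row with the smallest squared distance is also the nearest row.
    d2 = (df["longitude"].to_numpy() - lon) ** 2 + (df["latitude"].to_numpy() - lat) ** 2
    # argmin gives a positional index, hence iloc rather than loc.
    return df.iloc[int(np.argmin(d2))]

# Hypothetical demo frame built from the first three rows of the question.
demo = pd.DataFrame({
    "longitude": [-179.979166666666657, -179.9375, -179.895833333333343],
    "latitude": [89.9791666666666714] * 3,
    "value": [-3.39999995214436425e+38] * 3,
})

row = closest_row(demo, -179.979164, 89.979162)
print(row)
```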
