Find points in cells through pandas dataframes of coordinates

I have to find which points fall inside a grid of square cells, given the coordinates of the points and the coordinates of the cell bounds, stored in two pandas dataframes. I'm calling dfc the dataframe containing the code and the boundary coordinates of the cells (I have simplified the problem; in the real analysis I have a big grid with geographical points and tons of points to check):

Code,minx,miny,maxx,maxy
01,0.0,0.0,2.0,2.0
02,2.0,2.0,3.0,3.0

and dfp the dataframe containing an Id and the coordinates of the points:

Id,x,y
0,1.5,1.5
1,1.1,1.1
2,2.2,2.2
3,1.3,1.3
4,3.4,1.4
5,2.0,1.5
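
For anyone reproducing the example, here is a minimal sketch of loading the two tables above. The `io.StringIO` wrappers and the names `cells_csv` and `points_csv` are assumptions for illustration, standing in for real files. Passing `dtype={'Code': str}` is also an assumption on my part: it keeps codes such as `01` from being parsed as the integer 1.

```python
import io

import pandas as pd

# Hypothetical in-memory copies of the two tables shown above.
cells_csv = """Code,minx,miny,maxx,maxy
01,0.0,0.0,2.0,2.0
02,2.0,2.0,3.0,3.0
"""

points_csv = """Id,x,y
0,1.5,1.5
1,1.1,1.1
2,2.2,2.2
3,1.3,1.3
4,3.4,1.4
5,2.0,1.5
"""

# Read Code as a string so the leading zero in '01' is preserved.
dfc = pd.read_csv(io.StringIO(cells_csv), dtype={'Code': str})
dfp = pd.read_csv(io.StringIO(points_csv))
```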

Now I would like to perform a search that adds to the dfp dataframe a new column (called 'GridCode') holding the code of the grid cell that contains each point. The cells should be perfect squares, so I would like to perform the analysis through:

a = np.where(
            (dfp['x'] > dfc['minx']) &
            (dfp['x'] < dfc['maxx']) &
            (dfp['y'] > dfc['miny']) &
            (dfp['y'] < dfc['maxy']),
            dfc['Code'],
            'na')

avoiding several loops on the dataframes. The lengths of the dataframes are not the same. The resulting dataframe should be as follows:

   Id    x    y GridCode
0   0  1.5  1.5   01
1   1  1.1  1.1   01
2   2  2.2  2.2   02
3   3  1.3  1.3   01
4   4  3.4  1.4   na
5   5  2.0  1.5   na

Thanks in advance for your help!

There is probably a better way, but since this has been sitting out there for a while:

Use pandas boolean indexing to filter the dfc dataframe instead of np.where():

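def findGrid(row):
    # row is a single point from dfp; each scalar comparison is
    # evaluated against every cell in dfc at once.
    c = dfc[(row['x'] > dfc['minx']) &
            (row['x'] < dfc['maxx']) &
            (row['y'] > dfc['miny']) &
            (row['y'] < dfc['maxy'])].Code

    # No cell strictly contains the point.
    if len(c) == 0:
        return None
    # Otherwise return the code of the first matching cell.
    return c.iat[0]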

Then use the pandas apply() function to call it on every row of dfp:

dfp['GridCode'] = dfp.apply(findGrid,axis=1)

This will yield:

   Id    x    y GridCode
0   0  1.5  1.5        1
1   1  1.1  1.1        1
2   2  2.2  2.2        2
3   3  1.3  1.3        1
4   4  3.4  1.4      NaN
5   5  2.0  1.5      NaN
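
Since the question asks to avoid looping, and apply() with axis=1 still calls the function once per point, here is a sketch of an alternative that uses NumPy broadcasting. This is not the method above but a swapped-in technique, under these assumptions: the sample data is rebuilt inline, Code is stored as a string (the output above shows 1 and 2, presumably because Code was parsed as an integer), and unmatched points get the string 'na' as in the question. The names `inside`, `first_hit` and `has_hit` are mine.

```python
import numpy as np
import pandas as pd

# Sample data from the question; Code is kept as a string so '01' stays '01'.
dfc = pd.DataFrame({
    'Code': ['01', '02'],
    'minx': [0.0, 2.0],
    'miny': [0.0, 2.0],
    'maxx': [2.0, 3.0],
    'maxy': [2.0, 3.0],
})
dfp = pd.DataFrame({
    'Id': [0, 1, 2, 3, 4, 5],
    'x': [1.5, 1.1, 2.2, 1.3, 3.4, 2.0],
    'y': [1.5, 1.1, 2.2, 1.3, 1.4, 1.5],
})

# Point coordinates as column vectors, shape (n_points, 1).
x = dfp['x'].to_numpy()[:, None]
y = dfp['y'].to_numpy()[:, None]

# Broadcasting against the cell bounds, shape (n_cells,), gives an
# (n_points, n_cells) boolean matrix: inside[i, j] is True when
# point i lies strictly inside cell j.
inside = (
    (x > dfc['minx'].to_numpy()) & (x < dfc['maxx'].to_numpy()) &
    (y > dfc['miny'].to_numpy()) & (y < dfc['maxy'].to_numpy())
)

# argmax gives the first matching cell per point, but it also returns 0
# when nothing matches, so a separate any() mask is needed.
first_hit = inside.argmax(axis=1)
has_hit = inside.any(axis=1)

dfp['GridCode'] = np.where(has_hit, dfc['Code'].to_numpy()[first_hit], 'na')
print(dfp)
```

The boolean matrix holds one entry per point and cell pair, so for a big grid with tons of points it may need to be built in chunks of points to keep memory in check.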
