
Need to merge two pandas DataFrames using two columns, latitude and longitude

This is my dataframe #1: city names with their latitude and longitude.

import pandas as pd

df1 = pd.DataFrame({"city":['delhi','new york','london','paris','chennai'],"lat":[12.23,22.444,23.233,45.32,34.22],"long":[11.22,22.332,34.23,55.23,24.22]})

This is dataframe #2: country names with latitude and longitude.

df2 = pd.DataFrame({"country":['India','US','UK','France','India'],"lat":[12.13,22.54,22.33,45.32,34.22],"long":[11.12,22.132,34.23,54.23,24.22]})

I need to match on the lat and long columns to merge these two tables. The problem is that the lat and long values do not match exactly; they differ by plus or minus 0.1 or 0.2. (If they matched exactly, I could use pd.merge.) The lat and long values here are not real, just an example.

Expected result:

result = pd.DataFrame({"city":['delhi','new york','london','paris','chennai'],"country":['India','US','UK','France','India'],"lat":[12.13,22.54,22.33,45.32,34.22],"long":[11.12,22.132,34.23,54.23,24.22]})

What is the best approach to merge these tables?

For example, with a cross merge:

(df1.assign(dummy=1)
    .merge(df2.assign(dummy=1),on='dummy')
    .query('abs(lat_x-lat_y)<=0.1 and abs(long_x-long_y)<=0.2')
    .drop('dummy', axis=1)
)

Output:

        city   lat_x  long_x country  lat_y  long_y
0      delhi  12.230  11.220   India  12.13  11.120
6   new york  22.444  22.332      US  22.54  22.132
24   chennai  34.220  24.220   India  34.22  24.220
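
The same idea can be sketched with `how="cross"`, which pandas supports from version 1.2 on, so no dummy column is needed. The tolerance `TOL`, the suffixes, and the "keep the nearest candidate per city" step are assumptions added for illustration, not part of the answer above:

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["delhi", "new york", "london", "paris", "chennai"],
    "lat": [12.23, 22.444, 23.233, 45.32, 34.22],
    "long": [11.22, 22.332, 34.23, 55.23, 24.22],
})
df2 = pd.DataFrame({
    "country": ["India", "US", "UK", "France", "India"],
    "lat": [12.13, 22.54, 22.33, 45.32, 34.22],
    "long": [11.12, 22.132, 34.23, 54.23, 24.22],
})

TOL = 0.25  # hypothetical tolerance in degrees, chosen for this sample data

# Every city paired with every country (5 x 5 rows).
pairs = df1.merge(df2, how="cross", suffixes=("_city", "_country"))

# Absolute coordinate differences for each pair.
pairs["dlat"] = (pairs["lat_city"] - pairs["lat_country"]).abs()
pairs["dlong"] = (pairs["long_city"] - pairs["long_country"]).abs()

# Keep only pairs within the tolerance on both axes.
close = pairs[(pairs["dlat"] <= TOL) & (pairs["dlong"] <= TOL)]

# If a city has several candidates, keep the one with the smallest total offset.
close = close.assign(dist=close["dlat"] + close["dlong"])
best = close.sort_values("dist").drop_duplicates("city")

# Left merge so cities without a match are kept, with a missing country.
matched = df1.merge(best[["city", "country"]], on="city", how="left")
print(matched)
```

With this sample data, london and paris stay unmatched because their coordinates differ from every country row by roughly 0.9 and 1.0. Note that a cross merge builds len(df1) * len(df2) rows, so it only suits small tables.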

GeoPandas may be used here.

Provided that you have the boundaries of countries as polygons, you can use spatial joins.

In your question, you are reducing countries to single points, which may not be the best representation.

Example from the documentation:

In a Spatial Join, two geometry objects are merged based on their spatial relationship to one another.

# One GeoDataFrame of countries, one of Cities.
# Want to merge so we can get each city's country.
In [11]: countries.head()
Out[11]: 


                                           geometry                   country
0  MULTIPOLYGON (((180.000000000 -16.067132664, 1...                      Fiji
1  POLYGON ((33.903711197 -0.950000000, 34.072620...                  Tanzania
2  POLYGON ((-8.665589565 27.656425890, -8.665124...                 W. Sahara
3  MULTIPOLYGON (((-122.840000000 49.000000000, -...                    Canada
4  MULTIPOLYGON (((-122.840000000 49.000000000, -...  United States of America

In [12]: cities.head()
Out[12]: 
           name                           geometry
0  Vatican City  POINT (12.453386545 41.903282180)
1    San Marino  POINT (12.441770158 43.936095835)
2         Vaduz   POINT (9.516669473 47.133723774)
3    Luxembourg   POINT (6.130002806 49.611660379)
4       Palikir  POINT (158.149974324 6.916643696)

# Execute spatial join
# (older GeoPandas releases used op= instead of predicate=)
In [13]: cities_with_country = geopandas.sjoin(cities, countries, how="inner", predicate="intersects")

In [14]: cities_with_country.head()
Out[14]: 
             name                           geometry  index_right  country
0    Vatican City  POINT (12.453386545 41.903282180)          141    Italy
1      San Marino  POINT (12.441770158 43.936095835)          141    Italy
192          Rome  POINT (12.481312563 41.897901485)          141    Italy
2           Vaduz   POINT (9.516669473 47.133723774)          114  Austria
184        Vienna  POINT (16.364693097 48.201961137)          114  Austria

If you don't have the polygons representing the countries, you need to extend the point representing each country to an area. You can do this using the buffer method in Shapely, which extends a point to an area given a distance:

from shapely.geometry import Point

Point(0, 0).buffer(10.0)

assuming a point at coordinates (0, 0) and a distance of 10.0.
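
Applied to the data from the question, the buffer idea might look like the sketch below. It uses only Shapely (no GeoPandas) to keep it short; the radius `RADIUS`, the helper `country_for`, and the (long, lat) ordering of the tuples are assumptions made for illustration:

```python
from shapely.geometry import Point

# Coordinates from the question, stored as (long, lat) so that x = longitude.
cities = {
    "delhi": (11.22, 12.23),
    "new york": (22.332, 22.444),
    "london": (34.23, 23.233),
    "paris": (55.23, 45.32),
    "chennai": (24.22, 34.22),
}
countries = [
    ("India", (11.12, 12.13)),
    ("US", (22.132, 22.54)),
    ("UK", (34.23, 22.33)),
    ("France", (54.23, 45.32)),
    ("India", (24.22, 34.22)),
]

RADIUS = 0.3  # hypothetical buffer distance in degrees

# Turn each country point into a roughly circular polygon.
zones = [(name, Point(x, y).buffer(RADIUS)) for name, (x, y) in countries]

def country_for(x, y):
    """Return the first country whose buffered zone contains the point, else None."""
    p = Point(x, y)
    for name, zone in zones:
        if zone.contains(p):
            return name
    return None

result = {city: country_for(x, y) for city, (x, y) in cities.items()}
print(result)
```

The buffer is a polygon approximating a circle, so points very close to the radius may fall just outside it. For larger tables, the buffered polygons could instead go into a GeoDataFrame and be joined with geopandas.sjoin as in the documentation example.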
