
How to merge data from a dictionary into a pandas dataframe under certain conditions

I'm struggling with this one. I have this dictionary:

{'Rosario': [-60.63932, -32.946819], 'Concordia': [-74.448212, 40.31094], 'Avellaneda': [-58.367439, -34.660179], 'Corrientes': [-58.834099, -27.4806], 'Caballito': [-58.44104, -34.622639], 'Buenos Aires': [-78.497498, -9.12417], 'Paraná': [-60.5238, -31.73197], 'Santa Fé': [-78.14917, 8.65194], 'San Carlos de Bariloche': [-71.30822, -41.145569], 'Mendoza': [-68.827171, -32.890839]}

The dictionary contains cities and their coordinates. I would like to add the city names as a column to a dataframe that also contains coordinates. Is there a way to do this by matching on latitude and longitude?

This is a sample of the dataframe:

[image: screenshot of the dataframe]

As you can see, its lat and lon values are similar to those in the dictionary. I should also mention that the dataframe only contains coordinates that appear in the dictionary. I would really appreciate help with this one.

Here's a text sample of the dataframe. I was on my phone earlier, so I took a screenshot instead. The dataframe has a lot of columns, which is why I only show a few:

        sunset    temp feels_like pressure      lat      lon
0   1659463668   255.3      248.3     1012 -60.6393 -32.9468
1   1659377129  263.67     256.67      984 -60.6393 -32.9468
2   1659290591  258.31     253.58      983 -60.6393 -32.9468
3   1659204054  266.81     262.63      970 -60.6393 -32.9468
4   1659117518  255.42     255.42      979 -60.6393 -32.9468
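
To make the sample reproducible, it can be rebuilt as a dataframe roughly like this. The values are copied from the table above; only the columns shown are included, and the name `df` is just a convenient choice.

```python
import pandas as pd

# Values copied from the sample table above; the real dataframe has more columns.
df = pd.DataFrame({
    'sunset': [1659463668, 1659377129, 1659290591, 1659204054, 1659117518],
    'temp': [255.3, 263.67, 258.31, 266.81, 255.42],
    'feels_like': [248.3, 256.67, 253.58, 262.63, 255.42],
    'pressure': [1012, 984, 983, 970, 979],
    'lat': [-60.6393] * 5,   # column labels kept exactly as in the sample
    'lon': [-32.9468] * 5,
})
print(df)
```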

The first step is to turn your dictionary into a dataframe:

import pandas as pd

cities_dict = {'Rosario': [-60.63932, -32.946819], 'Concordia': [-74.448212, 40.31094], 'Avellaneda': [-58.367439, -34.660179], 'Corrientes': [-58.834099, -27.4806], 'Caballito': [-58.44104, -34.622639], 'Buenos Aires': [-78.497498, -9.12417], 'Paraná': [-60.5238, -31.73197], 'Santa Fé': [-78.14917, 8.65194], 'San Carlos de Bariloche': [-71.30822, -41.145569], 'Mendoza': [-68.827171, -32.890839]}

# One row per city; the column names match the question's dataframe.
cities = pd.DataFrame.from_dict(cities_dict, orient='index', columns=['lat', 'lon'])
print(cities)

# Output:
                               lat        lon
Rosario                 -60.639320 -32.946819
Concordia               -74.448212  40.310940
Avellaneda              -58.367439 -34.660179
Corrientes              -58.834099 -27.480600
Caballito               -58.441040 -34.622639
Buenos Aires            -78.497498  -9.124170
Paraná                  -60.523800 -31.731970
Santa Fé                -78.149170   8.651940
San Carlos de Bariloche -71.308220 -41.145569
Mendoza                 -68.827171 -32.890839

From there, I think it will be easiest to work with these geometrically:

pip install geopandas pygeos

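import geopandas as gp

# Turn the city coordinates into points; the city names stay in the index.
cities = gp.GeoSeries.from_xy(cities.lat, cities.lon)
# Move the city names out of the index into a 'city' column.
cities = cities.reset_index().rename(columns={'index':'city'})

# df is the dataframe from the question; build its points the same way.
df['geometry'] = gp.GeoSeries.from_xy(df.lat, df.lon)
df = gp.GeoDataFrame(df)

# Attach the nearest city to each row.
out = gp.sjoin_nearest(df, cities)
print(out)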

# Output:

       sunset    temp  feels_like  pressure      lat      lon                     geometry  index_right     city
0  1659463668  255.30      248.30      1012 -60.6393 -32.9468  POINT (-60.63930 -32.94680)            0  Rosario
1  1659377129  263.67      256.67       984 -60.6393 -32.9468  POINT (-60.63930 -32.94680)            0  Rosario
2  1659290591  258.31      253.58       983 -60.6393 -32.9468  POINT (-60.63930 -32.94680)            0  Rosario
3  1659204054  266.81      262.63       970 -60.6393 -32.9468  POINT (-60.63930 -32.94680)            0  Rosario
4  1659117518  255.42      255.42       979 -60.6393 -32.9468  POINT (-60.63930 -32.94680)            0  Rosario

See: geopandas.sjoin_nearest
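
If installing geopandas is not an option, the same nearest-point idea can be sketched with pandas and NumPy alone. This is a minimal sketch, not part of the approach above: it assumes plain Euclidean distance in degrees is good enough for matching, uses only three cities from the dictionary to keep it short, and the small `df` is a stand-in for the question's dataframe (the last row is made up to show a second city being matched).

```python
import numpy as np
import pandas as pd

# A subset of the dictionary from the question.
cities_dict = {
    'Rosario': [-60.63932, -32.946819],
    'Concordia': [-74.448212, 40.31094],
    'Mendoza': [-68.827171, -32.890839],
}
cities = pd.DataFrame.from_dict(cities_dict, orient='index', columns=['lat', 'lon'])

# Stand-in for the question's dataframe; the third row is hypothetical.
df = pd.DataFrame({
    'temp': [255.3, 263.67, 258.31],
    'lat': [-60.6393, -60.6393, -68.8272],
    'lon': [-32.9468, -32.9468, -32.8908],
})

# Pairwise coordinate differences via broadcasting:
# shape (rows in df, number of cities, 2).
rows = df[['lat', 'lon']].to_numpy()[:, None, :]
refs = cities[['lat', 'lon']].to_numpy()[None, :, :]
dist2 = ((rows - refs) ** 2).sum(axis=2)  # squared Euclidean distance

# For each row, pick the city with the smallest distance.
nearest = dist2.argmin(axis=1)
df['city'] = cities.index.to_numpy()[nearest]
print(df)
```

This builds a full rows-by-cities distance matrix, so it suits a small lookup table like this one; for many reference points a spatial index, which is what sjoin_nearest uses, scales better.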
