Create list of dictionary items from lists
I am working on a project that involves going through two pairs of latitude and longitude columns. If the lat/lon in the destination pair of columns are blank, I need to figure out which of the known destination lat/lon pairs is (geographically) closest to the origin lat/lon in that row. The dataframe looks like this:
origin_lat | origin_lon  | destination_lat | destination_lon
------------------------------------------------------------
20.291326  | -155.838488 | 25.145242       | -98.491404
25.611236  | -80.551706  | 25.646763       | -81.466360
26.897654  | -75.867564  | nan             | nan
I am trying to build two lists of dictionaries, one with the origin lat and lon, and the other with the destination lat and lon, in this format:
tmplist = [{'origin_lat': 39.7612992, 'origin_lon': -86.1519681},
{'origin_lat': 39.762241, 'origin_lon': -86.158436 },
{'origin_lat': 39.7622292, 'origin_lon': -86.1578917}]
What I want to do is, for every row where the destination lat/lon are blank, compare the origin lat/lon in the same row to all the non-nan destination lat/lon values, then write the geographically closest destination lat/lon into that row in place of the nan values. I've been playing around with creating lists of dictionary objects but can't seem to build them in the correct format. Any help would be appreciated!
If df is your pandas.DataFrame, you can generate the requested dictionaries by iterating through the rows of df:
origin_dicts = [{'origin_lat': row['origin_lat'], 'origin_lon': row['origin_lon']} for _, row in df.iterrows()]
and analogously for destination_dicts.
Remark: if the only reason for creating the dictionaries is to calculate the values that replace the nan entries, it might be easier to do this directly on the data frame, e.g.
df['destination_lon'] = df.apply(find_closest_lon, axis=1)
df['destination_lat'] = df.apply(find_closest_lat, axis=1)
where find_closest_lon and find_closest_lat are functions that receive a data frame row as an argument and have access to the values of the destination columns of the data frame.
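One way such a helper might look is sketched below. It is not from the answer itself: the names haversine_km and closest_destination are hypothetical, the haversine formula with a mean Earth radius of 6371 km is an assumed choice of distance, and the candidate list is assumed to already exclude nan entries. The function accepts anything indexable by column name, so a data frame row or a plain dict both work.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_destination(row, candidates):
    """Return the destination record to use for this row (hypothetical helper).

    `row` is indexable by column name (a data frame row or a dict).
    `candidates` is a list of records with 'destination_lat' and
    'destination_lon' keys, assumed to contain no nan values.
    Rows that already have a destination keep it unchanged.
    """
    if not math.isnan(row['destination_lat']):
        return {'destination_lat': row['destination_lat'],
                'destination_lon': row['destination_lon']}
    # Pick the candidate with the smallest distance to this row's origin.
    return min(
        candidates,
        key=lambda c: haversine_km(row['origin_lat'], row['origin_lon'],
                                   c['destination_lat'], c['destination_lon']),
    )


# Sample values taken from the table in the question.
candidates = [
    {'destination_lat': 25.145242, 'destination_lon': -98.491404},
    {'destination_lat': 25.646763, 'destination_lon': -81.466360},
]
row = {'origin_lat': 26.897654, 'origin_lon': -75.867564,
       'destination_lat': float('nan'), 'destination_lon': float('nan')}
nearest = closest_destination(row, candidates)
```

For the third row of the sample table this picks the second candidate, which lies a few degrees of longitude away rather than more than twenty. Scanning every candidate per row is linear in the number of known destinations, which should be fine for small tables; a spatial index would be worth considering for large ones.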
The format that you want is the built-in 'records' format:
df[['origin_lat','origin_lon']].to_dict(orient = 'records')
produces
[{'origin_lat': 20.291326, 'origin_lon': -155.83848799999998},
{'origin_lat': 25.611235999999998, 'origin_lon': -80.55170600000001},
{'origin_lat': 26.897654, 'origin_lon': -75.867564}]
and of course you can equally have
df[['destination_lat','destination_lon']].to_dict(orient = 'records')
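Note that the destination records as written would include the row whose values are nan. If only the known destinations are wanted, as the question describes, one option is to drop the incomplete rows first. The snippet below is a minimal sketch that rebuilds the sample table from the question; the variable name known_destinations is made up for illustration.

```python
import pandas as pd

# Sample data frame rebuilt from the table in the question.
df = pd.DataFrame({
    'origin_lat': [20.291326, 25.611236, 26.897654],
    'origin_lon': [-155.838488, -80.551706, -75.867564],
    'destination_lat': [25.145242, 25.646763, float('nan')],
    'destination_lon': [-98.491404, -81.466360, float('nan')],
})

# Keep only rows where both destination columns are present,
# then convert them to a list of dictionaries.
known_destinations = (
    df[['destination_lat', 'destination_lon']]
    .dropna()
    .to_dict(orient='records')
)
```

Here known_destinations holds two records, one for each row with a complete destination.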
But I agree with @ctenar that you do not need to generate dictionaries for your ultimate task; pandas provides enough functionality for that.