Compare list of dicts with pandas series and populate dict based on match

I have a list of dicts that I want to populate: each dict should be updated if several of its values match an entry in a pandas series. For example:

import pandas as pd

lis_of_dicts = [{'A':'a', 'B':'b','C':'c', 'D':'d'},
                {'A':'1', 'B':'2','C':'3','D':'4'},
                {'A':'M','B':'N','C':'O','D':'P'}]

dd = {'col1':['b', 'M'], 'col2':['d','P'], 'col3':['7.5','29']}
df = pd.DataFrame(dd, columns = ['col1', 'col2', 'col3'])
pd_series = pd.Series([tuple(value) for value in df.values], index=df.index)

which generates:

0 (b, d, 7.5)
1 (M, P, 29)

Desired result:

[{'A':'a', 'B':'b','C':'c', 'val': '7.5','D':'d'}, 
 {'A':'1', 'B':'2','C':'3', 'val':'NA', 'D':'4'}, 
 {'A':'M','B':'N','C':'O','val':'29','D':'P'}]

I tried this, but I could not even get the match, so I could not proceed:

for i in pd_series:
    for x in lis_of_dicts:
        if [x[key] == i[0] in x and [x[key] == i[1] in x for key in x]:
            x.update({'val':'i[2]'})
        else:
            x.update({'val':'NA'})

I am unable to generate any result. Note that the order of each dict should remain the same, except that the added value should come before the last item in the dictionary.

I would go for a nested-loop-based solution:

>>> df_2 = pd.DataFrame(lis_of_dicts)
>>> df_2
   A  B  C  D
0  a  b  c  d
1  1  2  3  4
2  M  N  O  P
>>> cols = df_2.columns
>>> for ix, row in df_2.iterrows():
...     for item in pd_series:
...         if set(row[cols]) & set(item):
...             df_2.loc[ix, 'val'] = item[2]
...             break
...     else:
...         df_2.loc[ix, 'val'] = 'NA'

>>> df_2.to_dict('records')
[{'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd', 'val': '7.5'},
 {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'val': 'NA'},
 {'A': 'M', 'B': 'N', 'C': 'O', 'D': 'P', 'val': '29'}]

EDIT: It can be simplified as follows:

output = []
for d in lis_of_dicts:
    for item in pd_series:
        if set(d.values()) & set(item):
            d['val'] = item[2]
            break
    else:
        d['val'] = 'NA'
    output.append(d)

>>> output
[{'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd', 'val': '7.5'},
 {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'val': 'NA'},
 {'A': 'M', 'B': 'N', 'C': 'O', 'D': 'P', 'val': '29'}]
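
Note that `set(d.values()) & set(item)` is truthy as soon as a single value is shared, including the value in the third position of the tuple. If the intent is that both of the first two entries of a tuple must appear among the dict's values, as the attempt in the question suggests, a stricter check might look like the sketch below. The helper name `find_val` is hypothetical, and a plain list of tuples stands in for `pd_series` (iterating the series yields the same tuples) so that the snippet runs without pandas.

```python
lis_of_dicts = [{'A': 'a', 'B': 'b', 'C': 'c', 'D': 'd'},
                {'A': '1', 'B': '2', 'C': '3', 'D': '4'},
                {'A': 'M', 'B': 'N', 'C': 'O', 'D': 'P'}]

# Stand-in for pd_series (assumption): the same row tuples as a plain list.
rows = [('b', 'd', '7.5'), ('M', 'P', '29')]


def find_val(d, rows, default='NA'):
    """Return the third entry of the first row whose first two entries
    both occur among the values of d, or default if no row qualifies."""
    values = set(d.values())
    for first, second, val in rows:
        if first in values and second in values:
            return val
    return default


# Build new dicts so the input list is left unchanged.
output = [{**d, 'val': find_val(d, rows)} for d in lis_of_dicts]
print(output)
```

With this check, a dict that shares only one of the two key values with a row is treated as unmatched and receives 'NA'.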

EDIT 2:

NOTE: This only works reliably in Python 3.7 and later, because dicts are not guaranteed to preserve insertion order in earlier versions.

To place val as the second-to-last element:

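output = []
for d in lis_of_dicts:
    # Capture all values before removing the last item, so it still counts for matching
    values = set(d.values())
    last = d.popitem()
    for item in pd_series:
        if values & set(item):
            d['val'] = item[2]
            break
    else:
        d['val'] = 'NA'
    # Re-insert the last item so that 'val' ends up just before it
    d.update([last])
    output.append(d)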

>>> output
[{'A': 'a', 'B': 'b', 'C': 'c', 'val': '7.5', 'D': 'd'},
 {'A': '1', 'B': '2', 'C': '3', 'val': 'NA', 'D': '4'},
 {'A': 'M', 'B': 'N', 'C': 'O', 'val': '29', 'D': 'P'}]
