Merging multiple rows into one row of a dataframe column
My current dataframe looks like this:
0 1 2
0 HA-567034786 AB-1018724 None
1 AB-6348403 HA-7298656 None
Using apply(), I do this:
def make_dict(row):
    # keep only the non-empty values in the row
    s = set(x for x in row if x)
    # map each value to the other values in the same row
    return {x: list(s - {x}) for x in s}

result = df.apply(make_dict, axis=1).to_frame(name='duplicates')
duplicates
1 {'HA-567034786': ['AB-1018724'],'AB-1018724':['HA-567034786']}
2 {'AB-6348403': ['HA-7298656'],'HA-7298656':['AB-6348403']}
Now I am stuck on turning this into one single, flat dictionary, as shown below:
{
'HA-567034786': ['AB-1018724'],'AB-1018724':['HA-567034786'],
'AB-6348403': ['HA-7298656'],'HA-7298656':['AB-6348403']
}
Instead of apply, use a dictionary comprehension with flattening:
print (df)
0 1
0 HA-567034786 AB-1018724
1 AB-6348403 HA-7298656
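def make_dict(row):
    # keep only the non-empty values in the row
    s = set(x for x in row if x)
    # map each value to the other values in the same row
    return {x: list(s - {x}) for x in s}

# iterate over the rows and merge every per-row dictionary into one
result = {k: v for x in df.values for k, v in make_dict(x).items()}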
print (result)
{'HA-567034786': ['AB-1018724'],
'AB-1018724': ['HA-567034786'],
'HA-7298656': ['AB-6348403'],
'AB-6348403': ['HA-7298656']}
Another solution, with apply:
result = {k:v for x in df.apply(make_dict, axis=1) for k, v in x.items()}
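For readers who want to see what the flattening comprehension does without a dataframe at hand, here is a minimal sketch in plain Python. The `rows` list is a hypothetical stand-in for the rows of the sample frame (with pandas, they would come from iterating over `df.values`), and `sorted` replaces `list` only to make the value order deterministic.

```python
# Hypothetical stand-in for the rows of the sample dataframe;
# None marks the empty third column.
rows = [
    ["HA-567034786", "AB-1018724", None],
    ["AB-6348403", "HA-7298656", None],
]

def make_dict(row):
    # Keep only the non-empty values in the row.
    s = set(x for x in row if x)
    # Map each value to the other values in the same row.
    return {x: sorted(s - {x}) for x in s}

# The outer loop walks the rows, the inner loop walks the items of each
# per-row dictionary, so all pairs end up in one flat dictionary.
result = {k: v for row in rows for k, v in make_dict(row).items()}
print(result)
```

If the same ID appeared in more than one row, the later row's entry would overwrite the earlier one, since a dictionary keeps only one value per key.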
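You can also use collections.ChainMap() to combine all the dictionaries into one, as follows (here result is the dataframe with the 'duplicates' column from the question):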
from collections import ChainMap
res = dict(ChainMap(*result['duplicates']))
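A small sketch of the ChainMap approach on plain dictionaries may help show how it behaves. The `per_row` list is a hypothetical stand-in for the values of the 'duplicates' column, and the `a`/`b` pair is made up purely to illustrate what happens when a key repeats.

```python
from collections import ChainMap

# Hypothetical stand-in for the per-row dictionaries in the 'duplicates' column.
per_row = [
    {"HA-567034786": ["AB-1018724"], "AB-1018724": ["HA-567034786"]},
    {"AB-6348403": ["HA-7298656"], "HA-7298656": ["AB-6348403"]},
]

# ChainMap presents the dictionaries as one mapping; dict() copies it out.
res = dict(ChainMap(*per_row))

# Made-up example with a repeated key: ChainMap looks up keys in the
# first mapping first, while a dict comprehension keeps the last value seen.
a = {"k": [1]}
b = {"k": [2]}
first_wins = dict(ChainMap(a, b))
last_wins = {k: v for d in (a, b) for k, v in d.items()}
```

With the sample data the keys never repeat, so both approaches give the same dictionary; the difference only matters if an ID can occur in several rows.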