Suppose I have the following dataframe:
df = pd.DataFrame({'id': [1,2,3,3,3], 'v1': ['a', 'a', 'c', 'c', 'd'], 'v2': ['z', 'y', 'w', 'y', 'z']})
df
id v1 v2
1 a z
2 a y
3 c w
3 c y
3 d z
And I want to transform it to this format:
{1: [('a', 'z')], 2: [('a', 'y')], 3: [('c', 'w'), ('c', 'y'), ('d', 'z')]}
I basically want to create a dict where the keys are the ids and each value is a list of the (v1, v2) tuples for that id.
I tried using groupby on id:
df.groupby('id')[['v1', 'v2']].apply(list)
But this didn't work.
Create the tuples first, then pass them to groupby and aggregate with list:
d = df[['v1', 'v2']].agg(tuple, 1).groupby(df['id']).apply(list).to_dict()
print(d)
{1: [('a', 'z')], 2: [('a', 'y')], 3: [('c', 'w'), ('c', 'y'), ('d', 'z')]}
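As a side note on why the attempt in the question fails: iterating over a DataFrame yields its column labels, so list applied to each group returns ['v1', 'v2'] rather than the rows. Below is a minimal sketch of a variant that stays close to the original attempt, assuming the same df as in the question. The helper name rows_as_tuples is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 3, 3],
                   'v1': ['a', 'a', 'c', 'c', 'd'],
                   'v2': ['z', 'y', 'w', 'y', 'z']})

# Iterating a DataFrame gives its column labels, not its rows,
# which is what list(group) ends up doing inside apply(list).
cols = list(df[['v1', 'v2']])

def rows_as_tuples(group):
    # Hypothetical helper: turn each row of the group into a plain tuple.
    return list(group.itertuples(index=False, name=None))

d = df.groupby('id')[['v1', 'v2']].apply(rows_as_tuples).to_dict()
print(d)
```

Selecting ['v1', 'v2'] before apply keeps the id column out of the tuples.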
Another idea is to use a MultiIndex:
d = df.set_index(['v1', 'v2']).groupby('id').apply(lambda x: x.index.tolist()).to_dict()
You can use defaultdict from the collections module:
from collections import defaultdict
d = defaultdict(list)
for k, v, s in df.to_numpy():
    d[k].append((v, s))
defaultdict(list,
{1: [('a', 'z')],
2: [('a', 'y')],
3: [('c', 'w'), ('c', 'y'), ('d', 'z')]})
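The loop above relies on the column order of df (id, v1, v2). A small variation, offered as a sketch rather than a replacement, zips the columns by name so the order no longer matters, and converts the result to a plain dict at the end. The variable name result is an assumption for illustration.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 3, 3],
                   'v1': ['a', 'a', 'c', 'c', 'd'],
                   'v2': ['z', 'y', 'w', 'y', 'z']})

d = defaultdict(list)
# Pair the columns up by name instead of unpacking whole rows.
for k, v, s in zip(df['id'], df['v1'], df['v2']):
    d[k].append((v, s))

# Convert to a regular dict so missing keys raise KeyError again.
result = dict(d)
print(result)
```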
df['New'] = [tuple(x) for x in df[['v1','v2']].to_records(index=False)]
df=df[['id','New']]
df=df.set_index('id')
df.to_dict()
Output (only the last tuple for id 3 survives, because dictionary keys must be unique):
{'New': {1: ('a', 'z'), 2: ('a', 'y'), 3: ('d', 'z')}}
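To keep every tuple for a repeated id, one option is to group the New column by id and collect each group into a list before calling to_dict. A minimal sketch, assuming the original df from the question:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 3, 3],
                   'v1': ['a', 'a', 'c', 'c', 'd'],
                   'v2': ['z', 'y', 'w', 'y', 'z']})

# Build one (v1, v2) tuple per row.
df['New'] = list(zip(df['v1'], df['v2']))

# Collect the tuples of each id into a list, then convert to a dict.
d = df.groupby('id')['New'].agg(list).to_dict()
print(d)
```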