
How to group by a multikey dict in Python by one key?

I have a multikey dict here. I am trying to group the dict by the first key (A, B), convert the result to a transposed DataFrame, and write it to a CSV file.

>>> dic= { ('A',1): 4, ('A',1):2, ('B', 1): 2, ('A', 2): 5, ('B', 2):3}
>>> dic
{('A', 1): 2, ('B', 1): 2, ('A', 2): 5, ('B', 2): 3}

Here is what I have been doing so far:

>>> df = pd.DataFrame(dic.items()).groupby(0).sum()
>>> df
        1
0
(A, 1)  2
(A, 2)  5
(B, 1)  2
(B, 2)  3

>>> df_t = df.T
>>> df_t
0  (A, 1)  (A, 2)  (B, 1)  (B, 2)
1       2       5       2       3
>>> df_t.to_csv('./file.csv')

What I am looking to get is something like this:

    1     2
A   6     5
B   2     3  

First of all, a dictionary never contains duplicate keys: one key can map to many values (for example, a list of them), but the same key cannot appear more than once. Your dic repeats the key ('A', 1), so when the literal is evaluated only the most recent value is kept. If your data has duplicate keys, a possible solution is to put the values for each key inside a list. For example,

dic = { ('A',1): 4, ('A',1):2, ('B', 1): 2, ('A', 2): 5, ('B', 2):3}

should be,

dic = {('A',1):[4,2], ('B', 1): [2], ('A', 2): [5], ('B', 2):[3]}
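
If the data does not already come in that shape, one way to build the dict of lists is collections.defaultdict. This is only a minimal sketch, assuming the entries start out as a list of (key, value) pairs; the names pairs, literal and grouped are hypothetical and not from the question.

```python
from collections import defaultdict

# Hypothetical input: the same entries as in the question, kept as pairs
# so that the repeated key ('A', 1) is not lost.
pairs = [(('A', 1), 4), (('A', 1), 2), (('B', 1), 2), (('A', 2), 5), (('B', 2), 3)]

# A dict literal with a repeated key silently keeps only the last value.
literal = {('A', 1): 4, ('A', 1): 2}
print(literal)  # {('A', 1): 2}

# Collect every value seen for a key into a list instead.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

dic = dict(grouped)
print(dic)
```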

Now for the solution:

import pandas as pd

#data
dic = {('A',1):[4,2], ('B', 1): [2], ('A', 2): [5], ('B', 2):[3]}

#Converting dic to a dataframe object (column 0 holds the key tuples, column 1 the lists)
df = pd.DataFrame(dic.items())

#explode turns each list of values into one row per value
exp = df[1].explode().to_frame().reset_index()

#Merging df and exp to combine results
#(both frames have a column named 1, so the merge renames them to '1_x' and '1_y')
df = df.reset_index().merge(exp, on = 'index', how = 'left')

#Converting tuple of keys into separate columns
df[['i1','i2']] = df[0].apply(pd.Series)

#Summing up the result and then pivoting to get the desired result
res = df.groupby(['i1','i2'])['1_y'].sum().reset_index().pivot(index=['i1'],columns=['i2'],values=['1_y'])

#Renaming columns and index
res.columns = ['1','2']
res.index.names = ['']
res

Output:

      1 2
    
A     6 5
B     2 3
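
As an alternative to the explode/merge route above, the same table can probably be reached more directly. The sketch below is not part of the original answer. It assumes the dict of lists shown earlier and relies on pandas turning tuple keys into a two-level MultiIndex when a Series is built from a dict; the names totals and csv_text are made up for illustration.

```python
import pandas as pd

dic = {('A', 1): [4, 2], ('B', 1): [2], ('A', 2): [5], ('B', 2): [3]}

# Sum each list; the tuple keys become a two-level MultiIndex on the Series.
totals = pd.Series({key: sum(values) for key, values in dic.items()})

# Move the second key level (1, 2) into the columns, leaving A/B as the rows.
res = totals.unstack()
print(res)

# With no path argument, to_csv returns the CSV text instead of writing a file;
# pass a path such as './file.csv' to write to disk.
csv_text = res.to_csv()
```

Because the summing happens in plain Python before pandas is involved, this version avoids the intermediate '1_x' and '1_y' columns entirely.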
