How to group a multi-key dictionary by one key in Python?

Question

I have a multikey dict here. I am trying to group the dict by the first key (A, B), convert it to a transposed dataframe, and write it to a csv file.

>>> dic= { ('A',1): 4, ('A',1):2, ('B', 1): 2, ('A', 2): 5, ('B', 2):3}
>>> dic
{('A', 1): 2, ('B', 1): 2, ('A', 2): 5, ('B', 2): 3}
>>> df = pd.DataFrame(dic.items()).groupby(0).sum()
>>> df
        1
0
(A, 1)  2
(A, 2)  5
(B, 1)  2
(B, 2)  3

Here is what I have been doing so far:

>>> df = pd.DataFrame(dic.items()).groupby(0).sum()
>>> df
        1
0
(A, 1)  4
(A, 2)  5
(B, 1)  2
(B, 2)  3

>>> df_t = df.T
>>> df_t
0  (A, 1)  (A, 2)  (B, 1)  (B, 2)
1       4       5       2       3
>>> df_t.to_csv('./file.csv')

What I am looking to get is something like this:

    1     2
A   6     5
B   2     3

Answer 1

First of all, a dictionary never contains duplicate keys: each key maps to exactly one value, although that value can itself be a list holding several items. In the current scenario your dic contains a duplicate key, ('A', 1), so only the most recent value is kept when the literal is evaluated. If your data has duplicate keys, a possible solution is to put the values inside lists. Something like

dic = { ('A',1): 4, ('A',1):2, ('B', 1): 2, ('A', 2): 5, ('B', 2):3}

should be

dic = {('A',1):[4,2], ('B', 1): [2], ('A', 2): [5], ('B', 2):[3]}
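
To see the overwrite concretely, here is a minimal sketch. It also shows one way to build the list-valued dict, assuming the raw data is available as (key, value) pairs; the `pairs` list is hypothetical and not part of the question.

```python
from collections import defaultdict

# A dict literal with a repeated key keeps only the last value.
dic = {('A', 1): 4, ('A', 1): 2, ('B', 1): 2, ('A', 2): 5, ('B', 2): 3}
n_keys = len(dic)            # 4 keys, not 5
last_value = dic[('A', 1)]   # 2; the earlier 4 has been overwritten

# Assumption: the raw data arrives as (key, value) pairs (hypothetical input).
pairs = [(('A', 1), 4), (('A', 1), 2), (('B', 1), 2), (('A', 2), 5), (('B', 2), 3)]

# Collect every value seen for a key into a list instead of overwriting it.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

grouped = dict(grouped)
```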

Now the solution part:

import pandas as pd

#data
dic = {('A',1):[4,2], ('B', 1): [2], ('A', 2): [5], ('B', 2):[3]}

#Converting dic to dataframe object
df = pd.DataFrame(dic.items())

#Explode will convert list of values to row like structure 
exp = df[1].explode().to_frame().reset_index()

#Merging df and exp to combine results
df = df.reset_index().merge(exp, on = 'index', how = 'left')

#Converting tuple of keys into separate columns
df[['i1','i2']] = df[0].apply(pd.Series)

#Summing up the result and then pivoting them to get desired result
res = df.groupby(['i1','i2'])['1_y'].sum().reset_index().pivot(index='i1', columns='i2', values='1_y')

#Renaming columns and index
res.columns = ['1','2']
res.index.names = ['']
res

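Output: a table with rows A and B and columns 1 and 2, holding 6 and 5 for A and 2 and 3 for B, as requested in the question.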
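
As a shorter alternative (not the approach used above), the same table can probably be built by summing each list first and letting pandas turn the tuple keys into a two-level index. This is only a sketch: the names `totals` and `csv_text` are illustrative, and it assumes the list-valued `dic` from this answer.

```python
import pandas as pd

dic = {('A', 1): [4, 2], ('B', 1): [2], ('A', 2): [5], ('B', 2): [3]}

# Sum the values of each key so every key maps to a single number.
totals = {key: sum(values) for key, values in dic.items()}

# Tuple keys become a two-level MultiIndex on the Series.
s = pd.Series(totals)

# Move the second key level (1, 2) into the columns; rows are A and B.
res = s.unstack()

# to_csv() without a path returns the CSV text; pass a path to write a file.
csv_text = res.to_csv()
```

Calling `res.to_csv('./file.csv')` instead would write the file the question asks for.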
