Pandas: create key-value pairs from a grouped DataFrame

Question

I have a DataFrame with three columns, and I would like to build a dictionary after grouping by the first and second columns. I can do this with for loops, but is there a pandas way of doing it?

DataFrame:

Col X    Col Y    Sum
 A         a       3
 A         b       2
 A         c       1
 B         p       5
 B         q       6
 B         r       7

After grouping by Col X and Col Y with df.groupby(['Col X','Col Y']).sum():

                  Sum
Col X    Col Y    
 A         a       3
           b       2
           c       1
 B         p       5
           q       6
           r       7

Dictionary I want to create:

{'A':{'a':3,'b':2,'c':1}, 'B':{'p':5,'q':6,'r':7}}

Answer 1

Use a dictionary comprehension while iterating over a groupby object:

{name: dict(zip(g['Col Y'], g['Sum'])) for name, g in df.groupby('Col X')}

{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}
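For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is reconstructed from the table in the question, and the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# One inner dict per "Col X" group, mapping "Col Y" -> "Sum".
result = {
    name: dict(zip(g["Col Y"], g["Sum"]))
    for name, g in df.groupby("Col X")
}
print(result)
```

Note that groupby sorts the group keys by default, so the outer keys come out in sorted order rather than in order of first appearance.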

If you insist on using to_dict somewhere, you could do something like this:

s = df.set_index(['Col X', 'Col Y']).Sum
{k: s.xs(k).to_dict() for k in s.index.levels[0]}

{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}

Keep in mind that the to_dict method is just using a comprehension under the hood. If you have a special use case that requires more than the orient options provide, there is no shame in constructing your own comprehension.
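To make the point about orient concrete, the sketch below shows what two of the built-in options return for the sample data, then builds the nested form by hand. The sample DataFrame is reconstructed from the question, and the names `records`, `columns` and `nested` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# orient="records" gives one flat dict per row.
records = df.to_dict(orient="records")

# orient="list" gives one list per column.
columns = df.to_dict(orient="list")

# Neither is keyed by "Col X" and then "Col Y", so build that by hand.
nested = {}
for row in records:
    nested.setdefault(row["Col X"], {})[row["Col Y"]] = row["Sum"]
```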

Answer 2

You can iterate over the MultiIndex Series:

>>> s = df.set_index(['Col X', 'Col Y'])['Sum']
>>> {k: v.reset_index(level=0, drop=True).to_dict() for k, v in s.groupby(level=0)}
{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}
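The sample data has exactly one row per (Col X, Col Y) pair, so set_index alone is enough. If pairs can repeat, which is an assumption beyond the question's data, aggregating first keeps later rows from silently overwriting earlier ones. A minimal sketch with made-up rows, where `nested` is an illustrative name:

```python
import pandas as pd

# Hypothetical data in which the pair ("A", "a") appears twice.
df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B"],
    "Col Y": ["a", "a", "b", "p"],
    "Sum": [1, 2, 2, 5],
})

# Aggregate first: the result is a Series indexed by (Col X, Col Y).
s = df.groupby(["Col X", "Col Y"])["Sum"].sum()

# Walk the MultiIndex Series and fill the nested dict.
nested = {}
for (x, y), total in s.items():
    nested.setdefault(x, {})[y] = int(total)
```

Here the two ("A", "a") rows are added together before the dictionary is built.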

Answer 3

# A to_dict() solution

d = df.groupby(['Col X','Col Y']).sum().reset_index().pivot(columns='Col X',values='Sum').to_dict()

Out[70]: 
{'A': {0: 3.0, 1: 2.0, 2: 1.0, 3: nan, 4: nan, 5: nan},
 'B': {0: nan, 1: nan, 2: nan, 3: 5.0, 4: 6.0, 5: 7.0}}

# If you need to get rid of the NaNs:
{k1:{k2:v2 for k2,v2 in v1.items() if pd.notnull(v2)} for k1,v1 in d.items()}
Out[73]: {'A': {0: 3.0, 1: 2.0, 2: 1.0}, 'B': {3: 5.0, 4: 6.0, 5: 7.0}}
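In the output above the inner keys are row positions (0 to 5), not the Col Y labels the question asks for. One possible variant, offered as a sketch and not as the answerer's own method, passes index='Col Y' to pivot so that the labels become the inner keys. The sample DataFrame is reconstructed from the question, and the int cast assumes the sums are whole numbers.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# Rows are labelled by "Col Y", columns by "Col X"; missing cells are NaN.
wide = df.pivot(index="Col Y", columns="Col X", values="Sum")

# Drop the NaN cells and cast back to int (NaN forced the columns to float).
result = {
    x: {y: int(v) for y, v in inner.items() if pd.notnull(v)}
    for x, inner in wide.to_dict().items()
}
```

Unlike the groupby-based answers, pivot raises an error if a (Col Y, Col X) pair appears more than once, so this only suits data that has already been aggregated.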

Answer 1: score 1, accepted, 2017-05-23 03:45:26

Answer 2: score 1, 2017-05-23 04:23:29

Answer 3: score 0, 2017-05-23 04:08:43
