
pandas - create key-value pairs from a grouped data frame

I have a data frame with three columns, and I would like to create a dictionary after applying groupby to the first and second columns. I can do this with for loops, but is there a pandas way of doing it?

DataFrame:

Col X    Col Y    Sum
 A         a       3
 A         b       2
 A         c       1
 B         p       5
 B         q       6
 B         r       7  

After grouping by Col X and Col Y with df.groupby(['Col X','Col Y']).sum():

                  Sum
Col X    Col Y    
 A         a       3
           b       2
           c       1
 B         p       5
           q       6
           r       7 
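For anyone who wants to reproduce this, the sample frame and the grouped result can be rebuilt roughly as follows. This is a minimal sketch that assumes the column names are exactly 'Col X', 'Col Y' and 'Sum', as shown in the tables above.

```python
import pandas as pd

# Rebuild the sample data from the question (column names assumed from the tables above).
df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# Grouping on both columns yields a frame indexed by a two-level MultiIndex.
grouped = df.groupby(["Col X", "Col Y"]).sum()
print(grouped)
```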

The dictionary I want to create:

{'A':{'a':3,'b':2,'c':1}, 'B':{'p':5,'q':6,'r':7}}
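For comparison, the plain for-loop approach mentioned in the question might look something like the sketch below. It is only one possible baseline, and it assumes each (Col X, Col Y) pair appears once, as in the sample data; repeated pairs would overwrite each other rather than being summed.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# Walk the three columns in parallel and fill a nested dictionary.
result = {}
for x, y, total in zip(df["Col X"], df["Col Y"], df["Sum"]):
    # setdefault creates the inner dict the first time a Col X value is seen.
    result.setdefault(x, {})[y] = total

print(result)
```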

Use a dictionary comprehension while iterating over a groupby object:

{name: dict(zip(g['Col Y'], g['Sum'])) for name, g in df.groupby('Col X')}

{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}
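If the frame might contain repeated (Col X, Col Y) pairs, one option is to aggregate first and then apply the same comprehension. The helper below is a hypothetical wrapper (the name nested_dict and its parameters are not from the answer), sketched under the assumption that repeated pairs should be summed.

```python
import pandas as pd

def nested_dict(frame, outer, inner, value):
    """Hypothetical helper: build {outer: {inner: summed value}} from a frame."""
    # Sum first so that repeated (outer, inner) pairs are combined, not overwritten.
    totals = frame.groupby([outer, inner], as_index=False)[value].sum()
    return {
        name: dict(zip(group[inner], group[value]))
        for name, group in totals.groupby(outer)
    }

df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B", "A"],
    "Col Y": ["a", "b", "c", "p", "q", "r", "a"],  # ('A', 'a') appears twice
    "Sum": [3, 2, 1, 5, 6, 7, 10],
})

result = nested_dict(df, "Col X", "Col Y", "Sum")
print(result)
```

With unique pairs, as in the question, this gives the same dictionary as the one-line comprehension.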

If you insist on using to_dict somewhere, you could do something like this:

s = df.set_index(['Col X', 'Col Y']).Sum
{k: s.xs(k).to_dict() for k in s.index.levels[0]}

{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}

Keep in mind that the to_dict method is just using a comprehension under the hood. If you have a special use case that requires more than the orient options provide, there is no shame in constructing your own comprehension.
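To illustrate what the orient options cover, here is a small sketch on a made-up two-column frame (the column names x and y and the row labels r1 and r2 are illustrative only). None of these shapes groups one column's values under another column's keys, which is why a hand-written comprehension is needed for the nested dictionary in the question.

```python
import pandas as pd

# Illustrative frame; names are placeholders, not from the question.
small = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])

by_column = small.to_dict()                  # default orient: column -> {index -> value}
by_row = small.to_dict(orient="index")       # index -> {column -> value}
records = small.to_dict(orient="records")    # one dict per row
lists = small.to_dict(orient="list")         # column -> list of values

print(by_column)
print(by_row)
print(records)
print(lists)
```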

You can iterate over the MultiIndex series:

>>> s = df.set_index(['Col X', 'Col Y'])['Sum']
>>> {k: v.reset_index(level=0, drop=True).to_dict() for k, v in s.groupby(level=0)}
{'A': {'a': 3, 'b': 2, 'c': 1}, 'B': {'p': 5, 'q': 6, 'r': 7}}

# A to_dict() solution

d = df.groupby(['Col X','Col Y']).sum().reset_index().pivot(columns='Col X',values='Sum').to_dict()

Out[70]: 
{'A': {0: 3.0, 1: 2.0, 2: 1.0, 3: nan, 4: nan, 5: nan},
 'B': {0: nan, 1: nan, 2: nan, 3: 5.0, 4: 6.0, 5: 7.0}}

# If you need to get rid of the NaNs:
{k1:{k2:v2 for k2,v2 in v1.items() if pd.notnull(v2)} for k1,v1 in d.items()}
Out[73]: {'A': {0: 3.0, 1: 2.0, 2: 1.0}, 'B': {3: 5.0, 4: 6.0, 5: 7.0}}
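Note that the result above is keyed by row position (0 to 5) rather than by the Col Y values, because pivot was called without an index. A possible variation, sketched here and not part of the original answer, passes index='Col Y' and drops the NaNs per column, which yields the dictionary asked for in the question. It assumes each (Col X, Col Y) pair is unique, since pivot raises an error on duplicates.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["A", "A", "A", "B", "B", "B"],
    "Col Y": ["a", "b", "c", "p", "q", "r"],
    "Sum": [3, 2, 1, 5, 6, 7],
})

# One column per Col X value, one row per Col Y value; missing cells become NaN.
wide = df.pivot(index="Col Y", columns="Col X", values="Sum")

# Drop the NaNs column by column, then cast back to int (NaN forced a float dtype).
result = {col: wide[col].dropna().astype(int).to_dict() for col in wide.columns}
print(result)
```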

