Split a data frame into smaller data frames based on unique column values

Question

This is my data frame:

    Quantity     Code         Value       
0       1757     08951201     717.0
1       1100     08A85800       0.0
2       2500     08A85800       0.0
3        323     08951201       0.0
4        800     08A85800       0.0

I want to split this into smaller data frames based on the Code column. (E.g. this one should split into df1 with all 08951201 codes and df2 with all 08A85800 codes.)

Edit: I'd also love a way to merge them back into the original dataframe, in the same order, after some value calculations I'm going to perform.

Answer 1

Use groupby and apply your custom function to process each sub-dataframe:

groups = df.groupby('Code')
print(list(groups))

# Output:
[('08951201',    Quantity      Code  Value
0      1757  08951201  717.0
3       323  08951201    0.0),

('08A85800',    Quantity      Code  Value
1      1100  08A85800    0.0
2      2500  08A85800    0.0
4       800  08A85800    0.0)]

Now suppose you want to sum Value for each Code:

>>> df.groupby('Code')['Value'].sum()
Code
08951201    717.0
08A85800      0.0
Name: Value, dtype: float64
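The edit in the question asks how to put the pieces back together in the original order. One possible sketch, assuming the original index is unique and is not changed during the calculations: keep each sub-dataframe in a dict keyed by Code, modify the pieces, then concatenate them and sort by index. The `Share` column below is a hypothetical calculation added only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Quantity": [1757, 1100, 2500, 323, 800],
    "Code": ["08951201", "08A85800", "08A85800", "08951201", "08A85800"],
    "Value": [717.0, 0.0, 0.0, 0.0, 0.0],
})

# One sub-dataframe per unique Code; each piece keeps its original index labels.
# .copy() makes the pieces independent of df so they can be modified safely.
parts = {code: sub.copy() for code, sub in df.groupby("Code")}

# Hypothetical per-group calculation (illustration only):
# each row's share of the total Quantity within its Code.
for code, sub in parts.items():
    sub["Share"] = sub["Quantity"] / sub["Quantity"].sum()

# Stack the pieces and restore the original row order via the preserved index.
restored = pd.concat(list(parts.values())).sort_index()
print(restored)
```

If the calculation only adds or changes columns, `df.groupby('Code')[...].transform(...)` may avoid the split entirely, since it returns a result aligned with the original index.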

Answer 2

As suggested, you could use groupby() on your dataframe to separate the rows by the values of one column:

import pandas as pd

cols = ['Quantity', 'Code', 'Value']
data = [[1757,     '08951201',     717.0],
 [1100,     '08A85800',       0.0],
 [2500,     '08A85800',       0.0],
 [323,    '08951201',      0.0],
 [800,    '08A85800',       0.0]]

df = pd.DataFrame(data, columns=cols)

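# Pass the column name as a plain string so each group key is a string
# (a one-element list can produce tuple keys in recent pandas versions).
groups = df.groupby('Code')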

You can then get the row indices through groups.indices, which returns a dict with the 'Code' values as keys and arrays of row positions as values. Finally, if you want every sub-dataframe, you can call group_list = list(groups). I suggest doing the work in two steps (first group by, then call list), because that way you can still call other methods on the GroupBy object (groups).

EDIT

Then, if you want a particular dataframe, you could call
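As a rough illustration of groups.indices, here is a small sketch that rebuilds the sample data. It assumes the grouping is done on a single column name passed as a string, so the keys are plain strings. `get_group` is an alternative way to fetch one sub-dataframe by its key instead of by its position in a list.

```python
import pandas as pd

df = pd.DataFrame({
    "Quantity": [1757, 1100, 2500, 323, 800],
    "Code": ["08951201", "08A85800", "08A85800", "08951201", "08A85800"],
    "Value": [717.0, 0.0, 0.0, 0.0, 0.0],
})

groups = df.groupby("Code")

# groups.indices maps each Code to a NumPy array of integer row positions;
# convert the arrays to lists so they are easy to read and compare.
positions = {code: idx.tolist() for code, idx in groups.indices.items()}
print(positions)

# Fetch a single sub-dataframe by its key rather than by list position.
df_a = groups.get_group("08A85800")
print(df_a)
```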

 df_i = group_list[i][1]

group_list[i] is the i-th element of the list, but it's a tuple (group_val, group_df), where group_val is the value associated with this new dataframe ('08951201' or '08A85800') and group_df is the new dataframe itself.

Answer 1
0 2021-11-30 14:17:08

Answer 2
0 2021-11-30 14:27:33
