Splitting a data frame into smaller data frames based on unique column values

This is my data frame:

    Quantity     Code         Value       
0       1757     08951201     717.0
1       1100     08A85800       0.0
2       2500     08A85800       0.0
3        323     08951201       0.0
4        800     08A85800       0.0

I want to split this into smaller data frames based on the Code column. (E.g. this one should split into df1 with all the 08951201 codes and df2 with all the 08A85800 codes.)

Edit: I'd also love a way to merge them back into the original dataframe, in the same order, after some value calculations I'm going to perform.

Use groupby and apply your custom function to process each sub-dataframe:

groups = df.groupby('Code')
print(list(groups))

# Output:
[('08951201',    Quantity      Code  Value
0      1757  08951201  717.0
3       323  08951201    0.0),

('08A85800',    Quantity      Code  Value
1      1100  08A85800    0.0
2      2500  08A85800    0.0
4       800  08A85800    0.0)]

Now suppose you want to sum Value within each group:

>>> df.groupby('Code')['Value'].sum()
Code
08951201    717.0
08A85800      0.0
Name: Value, dtype: float64
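If the per-group result needs to land back on the original rows in their original order, transform may be enough on its own, since it returns one value per original row aligned with the original index. A minimal sketch, assuming the sample data from the question; the CodeTotal column name and the choice of summing Quantity are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Quantity': [1757, 1100, 2500, 323, 800],
    'Code': ['08951201', '08A85800', '08A85800', '08951201', '08A85800'],
    'Value': [717.0, 0.0, 0.0, 0.0, 0.0],
})

# transform computes the sum per Code, then broadcasts it back to every
# row of that group, keeping the original row order and index.
df['CodeTotal'] = df.groupby('Code')['Quantity'].transform('sum')
print(df)
```

With this approach there is nothing to merge back, because the frame is never physically split.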

As suggested, you can use groupby() on your dataframe to separate the rows by the values of one column:

import pandas as pd

cols = ['Quantity', 'Code', 'Value']
data = [[1757,     '08951201',     717.0],
 [1100,     '08A85800',       0.0],
 [2500,     '08A85800',       0.0],
 [323,    '08951201',      0.0],
 [800,    '08A85800',       0.0]]

df = pd.DataFrame(data, columns=cols)

groups = df.groupby('Code')

You can then get each group's rows from groups.indices , which returns a dict with the 'Code' values as keys and arrays of row positions as values. Finally, to get every sub-dataframe, call group_list = list(groups) . I suggest doing the work in two steps (first group, then call list), because that way you can still call other methods on the GroupBy object ( groups ).
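The calls mentioned above can be sketched as follows, assuming the sample data from the question. The positions and df_a names are only illustrative, and get_group is shown as one alternative way to pull out a single sub-dataframe by its key:

```python
import pandas as pd

df = pd.DataFrame({
    'Quantity': [1757, 1100, 2500, 323, 800],
    'Code': ['08951201', '08A85800', '08A85800', '08951201', '08A85800'],
    'Value': [717.0, 0.0, 0.0, 0.0, 0.0],
})

groups = df.groupby('Code')

# indices maps each Code to an array of row positions; convert the
# arrays to plain lists to make them easier to read.
positions = {code: idx.tolist() for code, idx in groups.indices.items()}
print(positions)

# get_group returns the sub-dataframe for one key.
df_a = groups.get_group('08A85800')
print(df_a)

# list(groups) yields (key, sub-dataframe) tuples, one per Code.
group_list = list(groups)
print(len(group_list))
```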


EDIT

Then, if you want a particular dataframe, you can call

 df_i = group_list[i][1]

group_list[i] is the i-th element of the list of groups, but it is a tuple (group_val, group_df) , where group_val is the value associated with this new dataframe ( '08951201' or '08A85800' ) and group_df is the new dataframe.
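For the second part of the question (merging the pieces back in the original order), one possible approach is to keep the sub-dataframes in a dict, modify them, then concatenate and sort by index. This is a sketch that assumes the original index is unique; the parts and merged names and the doubling calculation are hypothetical placeholders:

```python
import pandas as pd

df = pd.DataFrame({
    'Quantity': [1757, 1100, 2500, 323, 800],
    'Code': ['08951201', '08A85800', '08A85800', '08951201', '08A85800'],
    'Value': [717.0, 0.0, 0.0, 0.0, 0.0],
})

# One independent sub-dataframe per Code; copy() so that edits do not
# touch (or warn about) the original df.
parts = {code: sub.copy() for code, sub in df.groupby('Code')}

# Placeholder calculation on a single group.
parts['08951201']['Value'] = parts['08951201']['Quantity'] * 2.0

# Each part kept its original index labels, so sorting by index after
# concatenation restores the original row order.
merged = pd.concat(list(parts.values())).sort_index()
print(merged)
```

If the index is not unique, saving the original position in a helper column before splitting and sorting on that column afterwards would serve the same purpose.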

