
How to break a pandas dataframe into sub dataframes when a certain value is found in the dataframe column?

I have a dataframe that looks like this:

import pandas as pd

data = pd.DataFrame({"event": ["A", "B", "C", "A", "A", "E", "P", "S", "A", "Y", "A"]})
data.head(15)

    event
  0 A
  1 B
  2 C
  3 A
  4 A
  5 E
  6 P
  7 S
  8 A
  9 Y
 10 A

I want to break this dataframe into 5 smaller dataframes, starting a new one whenever the event "A" is found. The five dataframes I want to create would look like this in this case:

1)    event
    0   A
    1   B
    2   C

2)    event
    0   A

3)    event
    0   A
    1   E
    2   P
    3   S
    
4)    event
    0   A
    1   Y

5)    event
    0   A

Is there an elegant way to do this with Python pandas and also with PySpark?

With pandas, use groupby with a helper grouper built from data['event'].eq('A').cumsum(). The comparison flags each row where a new block starts, and the cumulative sum turns those flags into a running group number:

dfs = [g for _,g in data.groupby(data['event'].eq('A').cumsum())]

Or, to get a fresh index in each piece, add a reset_index:

dfs = [g.reset_index(drop=True)
       for _,g in data.groupby(data['event'].eq('A').cumsum())]

Output (without reset_index):

[  event
 0     A
 1     B
 2     C,
   event
 3     A,
   event
 4     A
 5     E
 6     P
 7     S,
   event
 8     A
 9     Y,
    event
 10     A]
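
To see why the grouper works, it can help to look at the intermediate values. The following is only a small sketch of the same approach; the variable names starts and key are illustrative and do not come from the answer above.

```python
import pandas as pd

data = pd.DataFrame({"event": ["A", "B", "C", "A", "A", "E", "P", "S", "A", "Y", "A"]})

# Boolean Series: True on every row that starts a new block
starts = data["event"].eq("A")

# Running count of block starts, so each row carries the number of its block
key = starts.cumsum()
print(key.tolist())  # [1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 5]

# Split on that key and give every piece its own 0-based index
dfs = [g.reset_index(drop=True) for _, g in data.groupby(key)]
print(len(dfs))                  # 5
print(dfs[2]["event"].tolist())  # ['A', 'E', 'P', 'S']
```

One thing to keep in mind: if the column does not start with "A", the rows before the first "A" get key 0 and come out as a separate leading piece.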

