Python dataframe - drop consecutive rows based on a column

I need to remove consecutive rows based on a column value. My dataframe looks like this:

import pandas as pd

df = pd.DataFrame({
    "CustID":
        ["c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2", "c2", "c2"],
    "saleValue":
        [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status":
        [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})

Printed, the dataframe looks like this:

  CustID    saleValue   Status
    c1            10    0
    c1            12    0
    c1            13    0
    c1             6    1
    c1             4    1
    c1             2    1
    c1            11    0
    c1            17    0
    c1             1    1
    c1             5    1
    c2             8    1
    c2             2    1
    c2            16    0
    c2            13    0
    c2             1    1
    c2             4    1
    

For each CustID, I need to drop consecutive rows only when the Status is 1, keeping the first row of each run of 1s. Can you please let me know the best way to do it?

The output should look like this:
 

CustID  saleValue   Status
    c1        10          0
    c1        12          0
    c1        13          0
    c1         6          1
    c1        11          0
    c1        17          0
    c1         1          1
    c2         8          1
    c2        16          0
    c2        13          0
    c2         1          1

Create a Boolean mask for the entire DataFrame.

This assumes the rows for each CustID are already contiguous in the DataFrame. Find the rows where Status is 1, the previous row's Status is also 1, and the CustID is the same as the CustID on the previous row. These are the rows to drop, so keep the rest.

to_drop = (df['Status'].eq(1) & df['Status'].shift().eq(1)  # Consecutive 1s
           & df['CustID'].eq(df['CustID'].shift()))         # Within same ID  

df[~to_drop]

   CustID  saleValue  Status
0      c1         10       0
1      c1         12       0
2      c1         13       0
3      c1          6       1
6      c1         11       0
7      c1         17       0
8      c1          1       1
10     c2          8       1
12     c2         16       0
13     c2         13       0
14     c2          1       1
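
If the rows for a customer are not guaranteed to be contiguous, one possible variant is to take the previous Status within each CustID group rather than from the physically preceding row. The sketch below is an alternative, not the method from the answer above. It assumes that "consecutive" means consecutive within a customer's own rows, and the names prev_status and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "CustID": ["c1"] * 10 + ["c2"] * 6,
    "saleValue": [10, 12, 13, 6, 4, 2, 11, 17, 1, 5, 8, 2, 16, 13, 1, 4],
    "Status": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1],
})

# Previous Status within each customer; the first row of every group gets NaN,
# so it can never be flagged as a repeat.
prev_status = df.groupby("CustID")["Status"].shift()

# Flag a row when it is a 1 that directly follows another 1 for the same customer.
to_drop = df["Status"].eq(1) & prev_status.eq(1)

# Keep everything that was not flagged; the original index labels are preserved.
result = df[~to_drop]
print(result)
```

On the sample data this keeps the same rows as the mask above. The two approaches only differ when rows of different customers are interleaved: the plain shift() compares physically adjacent rows, while the grouped shift compares each row with that customer's previous row.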
