根据列值从分组数据框中删除行

Question

I would like to remove from each subgroup in a data frame, the rows which satisfy certain conditions.我想从数据框中的每个子组中删除满足特定条件的行。 The subgroups are grouped based on the two columns Days & ID , here's my data frame:子组根据两列Days & ID进行分组，这是我的数据框：

df = pd.DataFrame({'Days':[5,5,5,5,6,6],
                   'ID':['A11','A11','A11','A11','B12','B12'],
                    'Country':['DE','DE','FR','US','US','US'],
                    'Car1':['BMW','Volvo','Audi','BMW','Mercedes','BMW'],
                     'Car2':['Volvo','Mercedes','BMW','Volvo','Volvo','Volvo'],
                    'Car3':['Mercedes',nan,'Volvo',nan,nan,nan]},
                       )

    Days  ID   Country    Car1      Car2      Car3
0     5  A11      DE       BMW     Volvo  Mercedes
1     5  A11      DE     Volvo  Mercedes       nan
2     5  A11      FR      Audi       BMW     Volvo
3     5  A11      US       BMW     Volvo       nan
4     6  B12      US  Mercedes     Volvo       nan
5     6  B12      US       BMW     Volvo       nan

I would like to remove the rows from each group satisfying the following three conditions:我想从满足以下三个条件的每个组中删除行：

1. Car3=nan
2. Car1=Car2 (from another row within the group)
3. Car2=Car3 (from another row within the group)

The expected data frame I would like to have:我想要的预期数据框：

   Days   ID Country      Car1   Car2      Car3
0     5  A11      DE       BMW  Volvo  Mercedes
1     5  A11      FR      Audi    BMW     Volvo
2     6  B12      US  Mercedes  Volvo       nan
3     6  B12      US       BMW  Volvo       nan

Answer 1

You can check membership per groups in lambda function with Series.isin and filter invert mask by ~ in boolean indexing :您可以使用Series.isin检查 lambda 函数中每个组的成员资格，并在boolean indexing中通过~过滤反转掩码：

m = (df.groupby(['Days','ID'], group_keys=False)
      .apply(lambda x: x['Car1'].isin(x['Car2']) & x['Car2'].isin(x['Car3'])) & 
       df['Car3'].isna())

df = df[~m].reset_index(drop=True)
print (df)
   Days   ID Country      Car1   Car2      Car3
0     5  A11      DE       BMW  Volvo  Mercedes
1     5  A11      FR      Audi    BMW     Volvo
2     6  B12      US  Mercedes  Volvo       NaN
3     6  B12      US       BMW  Volvo       NaN

根据列值从分组数据框中删除行

问题描述

1 个解决方案

解决方案1
2 已采纳 2022-12-22 06:28:59

根据列值从分组数据框中删除行

问题描述

1 个解决方案

解决方案1 2 已采纳 2022-12-22 06:28:59

解决方案1
2 已采纳 2022-12-22 06:28:59