Select rows from a Pandas DataFrame where one column has the same value and another column has only missing values

Question

In the following code, the groups foo and tog in column A have only missing values in column B. However, I can't simply use isna() to filter out every missing value, since one bar row also has a missing value in B.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'tog', 'bar', 'bar'],
                   'B' : [np.nan, 2, np.nan, 4, np.nan, 6, np.nan],
                   'C' : [2.0, 5., 8., 1., 2., 9., 3.]})

I've tried df.groupby('A').filter(df['B'] == 'NaN'), but that returns an error:

'Series' object is not callable.

How can I filter or select for foo and tog? Much appreciated!

Edit: I'm cleaning a dataset that has a few missing values, but they are spread out among a lot of rows. As such, I can't simply select by name the elements of column A (e.g. foo and tog).

In other words, I need the following:

    A   B   C
1   bar 2.0 5.0
3   bar 4.0 1.0
5   bar 6.0 9.0
6   bar NaN 3.0

Answer 1

filter expects a function, so you can pass one that checks whether not all of the values in B are NaN:

df.groupby("A").filter(lambda x: ~x.B.isna().all())

to get

     A    B    C
1  bar  2.0  5.0
3  bar  4.0  1.0
5  bar  6.0  9.0
6  bar  NaN  3.0

where foo and tog are filtered out, since every value in their B column is NaN.
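As a hedged variation on the same idea, the sketch below writes the group test as notna().any(), which is equivalent to "not all NaN" and avoids relying on ~ applied to a scalar. It also builds the same selection as a boolean mask with transform, so the all-missing groups can be picked out as well. The variable names kept, has_value, kept_by_mask and all_missing are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "tog", "bar", "bar"],
    "B": [np.nan, 2, np.nan, 4, np.nan, 6, np.nan],
    "C": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0, 3.0],
})

# Keep a group only if it has at least one non-missing value in B.
kept = df.groupby("A").filter(lambda g: g["B"].notna().any())

# Same test as a row-aligned boolean mask: the per-group "any" result
# is broadcast back to every row of that group.
has_value = df["B"].notna().groupby(df["A"]).transform("any")
kept_by_mask = df[has_value]

# Inverting the mask selects the groups with only missing values in B.
all_missing = df[~has_value]
```

The mask form can be convenient when both halves of the split are needed, since the group test is computed once and reused.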

0 Accepted 2021-04-23 14:38:23
