Value not returned in dataframe

Question

I am trying to drop rows with a 'count' value of less than 10 from my dataframe. My dataframe currently looks something like this:

  person          id  count
0     p1   760431192     20
1     p2   101663519      1
2     p3   325694288      2
3     p4   338468584      1
4     p5  2337087786     18

I merged the count column in with df.merge, joining on the id column:

df = df.merge(dframe, on='id', how='left')

When I try to drop rows with a count < 10, I get the following error:

df = df[df.count>=10]
KeyError: True

However, when I use the same code on any other column, say:

df = df[df.id==760431192]
df = df[df.person=='p2']

The code works perfectly, and I get the dataframe I was expecting. Any idea why the code is not working on the merged column 'count'?

Answer 1

df.count isn't the column, it's the method DataFrame.count. So you're not comparing a column against a number (which would give elementwise boolean results), you're comparing a method against a number, for which there's no rule. In Python 2, when there's no rule for a comparison, it falls back to a default "arbitrary but consistent" rule, which gives a single boolean answer. Here that answer is True, so df[True] is treated as a lookup of a column named True, which is why you see KeyError: True.

In Python 3, that default rule has been removed, and the error you get gives you a much better idea of what's going on:

>>> df.count >= 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: method() >= int()

In any case, the solution is to get that column as df['count'] instead:

>>> df[df['count'] >= 10]
  person          id  count
0     p1   760431192     20
4     p5  2337087786     18
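To make the name clash concrete, here is a minimal sketch that rebuilds the question's sample data by hand (the merge step is skipped, so the construction of df is an assumption) and checks what each form of access returns:

```python
import pandas as pd

# Sample data copied from the question; built directly instead of via merge.
df = pd.DataFrame({
    "person": ["p1", "p2", "p3", "p4", "p5"],
    "id": [760431192, 101663519, 325694288, 338468584, 2337087786],
    "count": [20, 1, 2, 1, 18],
})

# Attribute access finds the DataFrame.count method before any column.
is_method = callable(df.count)

# Bracket access always returns the column as a Series.
is_series = isinstance(df["count"], pd.Series)

# Build the boolean mask from the column and use it to filter rows.
filtered = df[df["count"] >= 10]
persons = filtered["person"].tolist()
```

The same caution applies to any column whose name matches an existing DataFrame attribute or method (for example sum, mean or index), so bracket access is the safer habit.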

Answer 2

Another way to add the number of rows per id back to the original DataFrame is to use groupby together with transform:

df['count'] = df.groupby('id')['id'].transform('count')

You can now filter out the rows with a count of less than ten:

df = df[df['count'] >= 10]
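As a rough sketch of this approach, the example below uses made-up data in which ids repeat (the question's sample has one row per id, so every count there would be 1) and a threshold of 2 instead of 10 to keep it short. Both the data and the threshold are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical data: one row per observation, so ids can repeat.
df = pd.DataFrame({
    "person": ["p1", "p1", "p1", "p2", "p3", "p3"],
    "id": [1, 1, 1, 2, 3, 3],
})

# Select a single column before transform so the result is a Series
# aligned with the original rows, holding each row's group size.
df["count"] = df.groupby("id")["id"].transform("count")

# Keep only rows whose id appears at least twice (illustrative threshold).
filtered = df[df["count"] >= 2]
```

Selecting one column before calling transform keeps the result a single Series; calling transform on the whole grouped frame returns one column per remaining column, which is harder to assign to df['count'] reliably.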

Answer 1
1 accepted 2015-07-30 00:39:37

Answer 2
0 2015-07-30 01:19:37
