Values not being returned in dataframe
I am trying to drop rows with a 'count' value of less than 10 from my dataframe. My dataframe currently looks something like this:
person id count
0 p1 760431192 20
1 p2 101663519 1
2 p3 325694288 2
3 p4 338468584 1
4 p5 2337087786 18
I added the count column with df.merge, joining on the id column:
df = df.merge(dframe, on='id', how='left')
When I try to drop rows with a count < 10, I get the following error:
df = df[df.count>=10]
KeyError: True
However, when I use the same code on any other column, for example:
df = df[df.id==760431192]
df = df[df.person=='p2']
The code works perfectly, and I get the dataframe I was expecting. Any idea why the code is not working on the merged column 'count'?
df.count isn't the column, it's the method DataFrame.count. So you're not comparing a column against a number (giving elementwise boolean results), you're comparing a method against a number, which there's no rule for. In Python 2, when there's no rule for a comparison, it falls back to a default "arbitrary but consistent" rule, which gives a single boolean answer. Here that answer is True, so df[True] looks for a column labelled True and raises KeyError: True.
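The lookup failure behind KeyError: True can be reproduced on Python 3 by passing True to the indexer directly. This is a minimal sketch with a made-up two-row frame, not the asker's data:

```python
import pandas as pd

# Made-up frame for illustration only.
df = pd.DataFrame({"person": ["p1", "p2"], "count": [20, 1]})

# A plain bool is not a row mask; pandas treats it as a column label
# and fails because no column is named True.
try:
    df[True]
    error = None
except KeyError as exc:
    error = exc

print(repr(error))
```

A boolean mask has to be a Series or array with one entry per row, which is what df['count'] >= 10 produces.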
In Python 3, that default rule has been removed, and the error you get gives you a much better idea of what's going on (the exact wording depends on the Python version):
>>> df.count >= 10
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: method() >= int()
In any case, the solution is to get that column as df['count'] instead:
>>> df[df['count'] >= 10]
person id count
0 p1 760431192 20
4 p5 2337087786 18
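For reference, here is a self-contained sketch that rebuilds the frame from the question and contrasts attribute access with bracket access. The frame is constructed by hand here rather than through the merge, which is an assumption made to keep the example runnable:

```python
import pandas as pd

# Frame rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "person": ["p1", "p2", "p3", "p4", "p5"],
    "id": [760431192, 101663519, 325694288, 338468584, 2337087786],
    "count": [20, 1, 2, 1, 18],
})

# Attribute access resolves to the DataFrame.count method, not the column.
is_method = callable(df.count)
print(is_method)

# Bracket access always returns the column, so the comparison is elementwise.
filtered = df[df["count"] >= 10]
print(filtered)
```

The same applies to any column whose name collides with a DataFrame attribute or method (for example sum, mean or index), so bracket access is the safer habit.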
Another way to add the count of each unique item back to the original DataFrame is to use groupby together with transform:
df['count'] = df.groupby('id')['id'].transform('count')
You can now filter out the rows with a count less than ten:
df = df[df['count'] >= 10]
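A small sketch of the groupby/transform approach end to end. The events frame, its ids and the threshold of 2 are hypothetical, chosen so that ids repeat and the result is easy to check by eye:

```python
import pandas as pd

# Hypothetical data: one row per event, so an id can appear several times.
events = pd.DataFrame({
    "person": ["p1", "p1", "p1", "p2", "p3", "p3"],
    "id": [11, 11, 11, 22, 33, 33],
})

# Selecting a single column before transform yields a Series aligned with
# the original rows; each row receives the size of its id group.
events["count"] = events.groupby("id")["id"].transform("count")

# Keep only rows whose id occurs at least twice (illustrative threshold).
frequent = events[events["count"] >= 2]
print(frequent)
```

Selecting ['id'] before transform matters: without it, transform returns one column per remaining column of the frame, which may not be assignable to a single 'count' column.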