Appending a Boolean Column to a pandas DataFrame

I am learning pandas and got stuck on this problem.

I created a DataFrame that tracks all users and the number of times they did something.

To illustrate the problem, I created this example:

import pandas as pd
data = [
    {'username': 'me',  'bought_apples': 2, 'bought_pears': 0},
    {'username': 'you', 'bought_apples': 1, 'bought_pears': 1}
]
df = pd.DataFrame(data)
df['bought_something'] = df['bought_apples'] > 0 or df['bought_pears'] > 0

In the last line I want to add a column that indicates whether the user has bought anything at all.

This error pops up:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I understand the point about ambiguity in a pandas Series (also explained here), but I could not relate it to my problem.

Interestingly, this works:

df['bought_something'] = df['bought_apples'] > 0

Can anyone help me?

You can call sum row-wise and check whether the result is greater than 0:

In [105]:
df['bought_something'] = df[['bought_apples','bought_pears']].sum(axis=1) > 0
df

Out[105]:
   bought_apples  bought_pears username bought_something
0              2             0       me             True
1              1             1      you             True
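
A possible variation on the same idea, offered as a sketch rather than as part of the original answer: compare each column with 0 first, then reduce each row with any(axis=1). This assumes the same two purchase columns as the question; the cols variable is only an illustrative name.

```python
import pandas as pd

df = pd.DataFrame([
    {'username': 'me',  'bought_apples': 2, 'bought_pears': 0},
    {'username': 'you', 'bought_apples': 1, 'bought_pears': 1},
])

# Illustrative name: the columns that count as "bought something".
cols = ['bought_apples', 'bought_pears']

# The comparison yields a boolean frame; any(axis=1) collapses each row
# to True if at least one of its values is True.
df['bought_something'] = (df[cols] > 0).any(axis=1)
print(df)
```

Unlike the sum, this tests every column on its own, so it would still behave sensibly if a column could ever hold negative numbers.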

Regarding your original attempt, the error message is telling you that the truth value of a whole Series is ambiguous: Python's or needs a single True or False from each operand, and a Series holds many values. If you want to or boolean conditions element-wise, you need to use the bitwise operator | and wrap the conditions in parentheses, because | binds more tightly than >:

In [111]:
df['bought_something'] = ((df['bought_apples'] > 0) | (df['bought_pears'] > 0))
df

Out[111]:
   bought_apples  bought_pears username bought_something
0              2             0       me             True
1              1             1      you             True
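
To see why the parentheses matter, here is a small sketch that assumes the same two columns as the question. Without parentheses Python evaluates 0 | df['bought_pears'] first, which turns the line into a chained comparison and brings back the same ValueError.

```python
import pandas as pd

df = pd.DataFrame({'bought_apples': [2, 1], 'bought_pears': [0, 1]})

try:
    # Parsed as: df['bought_apples'] > (0 | df['bought_pears']) > 0
    # A chained comparison uses an implicit `and`, which needs bool(Series).
    df['bought_apples'] > 0 | df['bought_pears'] > 0
    raised = False
except ValueError:
    raised = True

# With parentheses each comparison is evaluated first, then combined
# element-wise by |.
with_parens = (df['bought_apples'] > 0) | (df['bought_pears'] > 0)

print(raised)
print(with_parens.tolist())
```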

The reason for the error is that you are using or to join two boolean vectors instead of two boolean scalars. That's why it says it is ambiguous.
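
A minimal sketch of that difference, using two made-up boolean Series (left and right are illustrative names, not from the question): or asks Python for the truth value of the whole left operand, while | combines the two Series element by element.

```python
import pandas as pd

left = pd.Series([True, False, False])
right = pd.Series([False, False, True])

try:
    # `or` first evaluates bool(left), which pandas refuses to guess.
    left or right
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# | is applied position by position and returns a new boolean Series.
combined = left | right

print(error_message)
print(combined.tolist())
```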
