Create a new pandas column based on a Series condition

Question

I'm coming from R to Python and can't figure out a simple case: creating a new column by conditionally checking other columns.

# In R, create a 'z' column based on values in x and y columns
df <- data.frame(x=rnorm(100),y=rnorm(100))
df$z <- ifelse(df$x > 1.0 | df$y < -1.0, 'outlier', 'normal')
table(df$z)
# output below
normal outlier 
     66      34

My attempt at the equivalent statement in Python:

import numpy as np
import pandas as pd
df = pd.DataFrame({'x': np.random.standard_normal(100), 'y': np.random.standard_normal(100)})
df['z'] = 'outlier' if df.x > 1.0 or df.y < -1.0 else 'normal'

However, the following exception is thrown: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the Pythonic way of achieving this? Many thanks :)

Answer 1

Try this:

df['z'] = np.where((df.x > 1.0) | (df.y < -1.0), 'outlier', 'normal')
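
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same np.where approach. The seed, the `rng` generator, and the intermediate names `is_outlier` and `counts` are my own additions for illustration and are not part of the original answer. The counts will not match the R output above, since the random draws differ.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the run is reproducible (an assumption, not from the question).
rng = np.random.default_rng(0)
df = pd.DataFrame({'x': rng.standard_normal(100),
                   'y': rng.standard_normal(100)})

# Element-wise OR of two boolean Series. The parentheses are required
# because | binds more tightly than the comparison operators.
is_outlier = (df['x'] > 1.0) | (df['y'] < -1.0)

# Pick 'outlier' where the mask is True, 'normal' elsewhere.
df['z'] = np.where(is_outlier, 'outlier', 'normal')

# Rough equivalent of R's table(df$z).
counts = df['z'].value_counts()
print(counts)
```

Here np.where plays the role of R's vectorized ifelse, and value_counts stands in for table.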

Answer 2

If you want to do element-wise operations on columns, you can't address your columns like this. Use numpy.where instead.
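
To make the point about element-wise operations concrete, here is a small sketch on a hand-written four-row frame (the data and the names `raised` and `mask` are made up for illustration). It shows that Python's `or` fails on Series while `|` works row by row, and it also shows a `.loc` assignment as one possible alternative to numpy.where.

```python
import pandas as pd

# Tiny hand-written data set, chosen only for illustration.
df = pd.DataFrame({'x': [0.5, 1.5, 0.0, 2.0],
                   'y': [0.0, 0.0, -2.0, -3.0]})

# `or` asks Python for a single True/False from a whole Series,
# which pandas refuses to guess, so it raises ValueError.
try:
    (df['x'] > 1.0) or (df['y'] < -1.0)
    raised = False
except ValueError:
    raised = True

# `|` compares row by row and returns a boolean Series.
mask = (df['x'] > 1.0) | (df['y'] < -1.0)

# Alternative to np.where: fill in a default, then overwrite the matching rows.
df['z'] = 'normal'
df.loc[mask, 'z'] = 'outlier'
print(df)
```

The `.loc` form can be easier to extend when more than two labels are needed, since each extra condition is just another assignment.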

2 answers

Answer 1: score 3, accepted, 2017-06-23 19:25:28
Answer 2: score 1, 2017-06-23 19:27:00
