Pandas dataframe: add a new column based on whether other columns have data
I have a pandas dataframe like the one below:
x y z
1 2 3
na 1 4
na na 5
Now I want to add another column, a, whose value depends on x, y and z. If x is available, a should be "yes". If x is na, y is checked: if y is available, a should be "no". Otherwise a should be the same as z, or 0 if z is not available either.
I have the following function in R:
cur_sta <- function(data){
sta <- ifelse(!is.na(data$x),"yes",
ifelse(!is.na(data$y),"no",
ifelse(!is.na(data$z),data$z,0)))
}
How can I achieve the same in Python?
EDIT:
I tried the following:
conditions = [
(not pd.isnull(data["x"].item())),
(not pd.isnull(data["y"].item())),
(not pd.isnull(data["z"].item()))]
choices = ['yes', 'no', data["z"]]
data['col_sta'] = np.select(conditions, choices, default='0')
but I am getting the following error:
ValueError: can only convert an array of size 1 to a Python scalar
How can I fix this?
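The error can be reproduced in isolation, which may help explain it. The following is a minimal sketch, assuming a frame shaped like the one in the question: Series.item() only returns a value for a one-element Series, whereas notna() works element-wise on the whole column. The names item_error and x_present are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame assumed to match the one in the question
data = pd.DataFrame({"x": [1, np.nan, np.nan],
                     "y": [2, 1, np.nan],
                     "z": [3, 4, 5]})

# Series.item() is meant for a Series holding exactly one element;
# on a three-row column it raises ValueError
try:
    data["x"].item()
    item_error = None
except ValueError as exc:
    item_error = str(exc)

# notna() returns one boolean per row, which is the shape np.select expects
x_present = data["x"].notna()
print(item_error)
print(x_present.tolist())
```

So each condition needs to be a boolean Series covering every row, not a single True or False.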
Use Series.notna to test for non-missing values:
conditions = [data["x"].notna(),
data["y"].notna(),
data["z"].notna()]
choices = ['yes', 'no', data["z"]]
data['col_sta'] = np.select(conditions, choices, default='0')
print (data)
x y z col_sta
0 1.0 2.0 3 yes
1 NaN 1.0 4 no
2 NaN NaN 5 5
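The sample data never reaches the default branch, so here is a self-contained sketch that adds a fourth row where x, y and z are all missing. Several things here are assumptions and not part of the original answer: the helper name cur_sta (borrowed from the R function), the nullable Int64 dtype for z (chosen so the integers are not turned into floats by the missing value), and the cast of z to strings (so every choice has the same type, since mixing strings and integers in choices may not be accepted by every NumPy version).

```python
import numpy as np
import pandas as pd

# Question's sample plus one extra row with every value missing (assumed)
data = pd.DataFrame({
    "x": [1, np.nan, np.nan, np.nan],
    "y": [2, 1, np.nan, np.nan],
    "z": pd.array([3, 4, 5, None], dtype="Int64"),
})

def cur_sta(df):
    """Hypothetical helper mirroring the nested ifelse from the R function."""
    # Cast z to text so all choices are strings
    z_text = df["z"].astype(str)
    # Conditions are checked in order; the first True one wins for each row
    conditions = [df["x"].notna(), df["y"].notna(), df["z"].notna()]
    choices = ["yes", "no", z_text]
    # Rows where no condition holds get the default
    return np.select(conditions, choices, default="0")

data["col_sta"] = cur_sta(data)
print(data)
```

The resulting column holds strings throughout, as in the R version, where ifelse coerces the numeric z to character once it is combined with "yes" and "no".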