
pandas - change value in column based on another column

Say I have a DataFrame all_data like this:

Id  Zone        Neighb
1   NaN         IDOTRR
2   RL          Veenker
3   NaN         IDOTRR
4   RM          Crawfor
5   NaN         Mitchel

I want to impute the missing values in the Zone column so that where Neighb is "IDOTRR" Zone is set to "RM", and where Neighb is "Mitchel" it is set to "RL".

all_data.loc[all_data.MSZoning.isnull() 
             & all_data.Neighborhood == "IDOTRR", "MSZoning"] = "RM"
all_data.loc[all_data.MSZoning.isnull() 
             & all_data.Neighborhood == "Mitchel", "MSZoning"] = "RL"

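I get:

TypeError: invalid type comparison

C:\Users\pprun\Anaconda3\lib\site-packages\pandas\core\ops.py:798: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  result = getattr(x, name)(y)

I'm sure this should be simple, but I've been messing with it for too long. Please help.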

Use np.select, i.e.:

df['Zone'] = np.select([df['Neighb'] == 'IDOTRR',df['Neighb'] == 'Mitchel'],['RM','RL'],df['Zone'])
   Id Zone   Neighb
0   1   RM   IDOTRR
1   2   RL  Veenker
2   3   RM   IDOTRR
3   4   RM  Crawfor
4   5   RL  Mitchel
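
For reference, a self-contained version of the one-liner above might look like the following. It is only a sketch: the frame is rebuilt by hand from the sample in the question, and the sixth row (Zone "FV", Neighb "IDOTRR") is a made-up addition, included to show that this form sets Zone for every matching Neighb, not only where Zone is missing.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; row 6 is a hypothetical addition.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "Zone": [np.nan, "RL", np.nan, "RM", np.nan, "FV"],
    "Neighb": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel", "IDOTRR"],
})

conditions = [df["Neighb"] == "IDOTRR", df["Neighb"] == "Mitchel"]
choices = ["RM", "RL"]

# Rows matching no condition keep their current Zone (the default argument).
df["Zone"] = np.select(conditions, choices, default=df["Zone"])

print(df)
```

In this sketch row 6 ends up as "RM" even though it already had a value, so this form is only safe when every matching row is meant to be overwritten.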

If you also want the missing-value check as part of the conditions, you can use:

# Boolean mask of condition 1 
m1 = (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "IDOTRR")
# Boolean mask of condition 2
m2 = (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "Mitchel")

all_data["MSZoning"] = np.select([m1, m2], ['RM', 'RL'], all_data["MSZoning"])

Alternatively, with fillna and replace:

df.Zone = df.Zone.fillna(df.Neighb.replace({'IDOTRR': 'RM', 'Mitchel': 'RL'}))
df
Out[784]: 
   Id Zone   Neighb
0   1   RM   IDOTRR
1   2   RL  Veenker
2   3   RM   IDOTRR
3   4   RM  Crawfor
4   5   RL  Mitchel
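
One caveat with replace: values of Neighb that are not in the dictionary pass through unchanged, so a missing Zone in some other neighbourhood would be filled with the neighbourhood name itself. A variant using Series.map, which returns NaN for unmapped keys, avoids this. The sketch below assumes the sample frame from the question plus one hypothetical row (Id 6) with a missing Zone and an unmapped Neighb; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; row 6 is a hypothetical addition.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "Zone": [np.nan, "RL", np.nan, "RM", np.nan, np.nan],
    "Neighb": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel", "Veenker"],
})

zone_by_neighb = {"IDOTRR": "RM", "Mitchel": "RL"}

# replace() leaves unmapped names as they are, so row 6 would get "Veenker".
filled_with_replace = df["Zone"].fillna(df["Neighb"].replace(zone_by_neighb))

# map() yields NaN for unmapped names, so row 6 stays missing.
filled_with_map = df["Zone"].fillna(df["Neighb"].map(zone_by_neighb))

print(filled_with_replace.tolist())
print(filled_with_map.tolist())
```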

In Python, & has higher precedence than ==

http://www.annedawson.net/Python_Precedence.htm

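So when you write all_data.MSZoning.isnull() & all_data.Neighborhood == "Mitchel", it is interpreted as (all_data.MSZoning.isnull() & all_data.Neighborhood) == "Mitchel". Python then tries to AND a boolean Series with a str Series and compare the result with the single str "Mitchel".

The solution is to wrap each test in parentheses: (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "Mitchel").

Sometimes, if I have a lot of selectors, I assign them to variables and then AND them together, for example: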

null_zoning = all_data.MSZoning.isnull()
Mitchel_neighb = all_data.Neighborhood == "Mitchel"
all_data.loc[null_zoning & Mitchel_neighb, "MSZoning"] = "RL"

Not only does this take care of the order-of-operations problem, it also means that all_data.loc[null_zoning & Mitchel_neighb, "MSZoning"] = "RL" fits on one line.
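
To check the grouping described above without relying on a precedence table, the expression can be parsed with the standard ast module. The sketch below does that and then applies the parenthesized fix to both neighbourhoods. The frame is a hand-built stand-in for all_data that uses the column names from the code, and the fill_values dictionary is an illustrative name, not something from the original post.

```python
import ast

import numpy as np
import pandas as pd

# How Python groups "a & b == c": the comparison is the outer node,
# and its left operand is the "&" expression.
tree = ast.parse("a & b == c", mode="eval").body
print(type(tree).__name__)       # Compare
print(type(tree.left).__name__)  # BinOp

# Hand-built stand-in for all_data, using the column names from the code.
all_data = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "MSZoning": [np.nan, "RL", np.nan, "RM", np.nan],
    "Neighborhood": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel"],
})

null_zoning = all_data["MSZoning"].isnull()
fill_values = {"IDOTRR": "RM", "Mitchel": "RL"}

for neighborhood, zone in fill_values.items():
    # The comparison is parenthesized, so it is evaluated before the & is applied.
    in_neighborhood = (all_data["Neighborhood"] == neighborhood)
    all_data.loc[null_zoning & in_neighborhood, "MSZoning"] = zone

print(all_data)
```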
