Pandas - change value in column based on another column

Question

Say I have a DataFrame all_data like this:

Id  Zone        Neighb
1   NaN         IDOTRR
2   RL          Veenker
3   NaN         IDOTRR
4   RM          Crawfor
5   NaN         Mitchel

I want to fill in the missing values in the Zone column so that where Neighb is "IDOTRR" I set Zone to "RM", and where Neighb is "Mitchel" I set it to "RL".

all_data.loc[all_data.MSZoning.isnull() 
             & all_data.Neighborhood == "IDOTRR", "MSZoning"] = "RM"
all_data.loc[all_data.MSZoning.isnull() 
             & all_data.Neighborhood == "Mitchel", "MSZoning"] = "RL"

I get:

TypeError: invalid type comparison

C:\Users\pprun\Anaconda3\lib\site-packages\pandas\core\ops.py:798: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  result = getattr(x, name)(y)

I'm sure this should be simple, but I've been messing with it for too long. Please help.

Answer 1

Use np.select, i.e.:

df['Zone'] = np.select([df['Neighb'] == 'IDOTRR',df['Neighb'] == 'Mitchel'],['RM','RL'],df['Zone'])

Id Zone   Neighb
0   1   RM   IDOTRR
1   2   RL  Veenker
2   3   RM   IDOTRR
3   4   RM  Crawfor
4   5   RL  Mitchel

If you also need your own conditions (for example, only filling rows where the value is missing), you can use:

# Boolean mask of condition 1 
m1 = (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "IDOTRR")
# Boolean mask of condition 2
m2 = (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "Mitchel")

# Rows matching neither mask keep their current MSZoning value
all_data["MSZoning"] = np.select([m1, m2], ['RM', 'RL'], all_data["MSZoning"])
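
For reference, here is a self-contained sketch of the masked np.select approach, run on the sample frame from the question. It assumes the question's table columns Zone and Neighb stand in for MSZoning and Neighborhood; the frame itself is rebuilt by hand from the table.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (Zone has missing values).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Zone": [np.nan, "RL", np.nan, "RM", np.nan],
    "Neighb": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel"],
})

# Each mask is True only where Zone is missing AND the neighborhood matches.
m1 = df["Zone"].isnull() & (df["Neighb"] == "IDOTRR")
m2 = df["Zone"].isnull() & (df["Neighb"] == "Mitchel")

# Rows matching no mask fall through to the default, i.e. keep their Zone.
df["Zone"] = np.select([m1, m2], ["RM", "RL"], default=df["Zone"])
print(df)
```

Because the existing column is passed as the default, rows that already have a Zone are left untouched.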

Answer 2

df.Zone=df.Zone.fillna(df.Neighb.replace({'IDOTRR':'RM','Mitchel':'RL'}))
df
Out[784]: 
   Id Zone   Neighb
0   1   RM   IDOTRR
1   2   RL  Veenker
2   3   RM   IDOTRR
3   4   RM  Crawfor
4   5   RL  Mitchel
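
One thing to watch with the replace-based fill: a row whose Zone is missing and whose Neighb is not in the dictionary would be filled with the neighborhood name itself, since replace leaves unmapped values unchanged. A possible variant, sketched below, uses Series.map so that unmapped neighborhoods stay NaN. The sixth row and the name zone_by_neighb are hypothetical, added only to show the difference.

```python
import numpy as np
import pandas as pd

# Question's sample data plus one hypothetical row (Id 6):
# Zone is missing and the neighborhood is not in the mapping.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "Zone": [np.nan, "RL", np.nan, "RM", np.nan, np.nan],
    "Neighb": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel", "Veenker"],
})

# Hypothetical lookup table: neighborhood -> zone to fill in.
zone_by_neighb = {"IDOTRR": "RM", "Mitchel": "RL"}

# map() returns NaN for neighborhoods missing from the dict,
# so fillna() has nothing to fill with for those rows.
df["Zone"] = df["Zone"].fillna(df["Neighb"].map(zone_by_neighb))
print(df)
```

With this variant the added row keeps a missing Zone instead of becoming "Veenker".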

Answer 3

In Python, & takes precedence over ==:

http://www.annedawson.net/Python_Precedence.htm

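So when you write all_data.MSZoning.isnull() & all_data.Neighborhood == "Mitchel", it is interpreted as (all_data.MSZoning.isnull() & all_data.Neighborhood) == "Mitchel". Python then tries to AND a boolean Series with a str Series and check whether the result equals the single str "Mitchel". The solution is to wrap the tests in parentheses: (all_data.MSZoning.isnull()) & (all_data.Neighborhood == "Mitchel"). Sometimes, if I have a lot of selectors, I assign them to variables and then AND them, e.g.: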

null_zoning = all_data.MSZoning.isnull()
Mitchel_neighb = all_data.Neighborhood == "Mitchel"
all_data.loc[null_zoning & Mitchel_neighb, "MSZoning"] = "RL"

Not only does this solve the order-of-operations problem, it also means that all_data.loc[null_zoning & Mitchel_neighb, "MSZoning"] = "RL" fits on one line.
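
The precedence rule and the fix can be checked with a short sketch. The frame below is rebuilt by hand from the question's table under the MSZoning and Neighborhood names, and the fills dictionary is a hypothetical helper used to apply both assignments in a loop.

```python
import numpy as np
import pandas as pd

# Plain integers show the rule: & binds tighter than ==.
unparenthesized = 1 & 2 == 2       # parsed as (1 & 2) == 2
parenthesized = 1 & (2 == 2)       # comparison evaluated first
print(unparenthesized, parenthesized)

# Sample data rebuilt from the question's table.
all_data = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "MSZoning": [np.nan, "RL", np.nan, "RM", np.nan],
    "Neighborhood": ["IDOTRR", "Veenker", "IDOTRR", "Crawfor", "Mitchel"],
})

# Compute the missing-value mask once, before any assignment.
null_zoning = all_data["MSZoning"].isnull()

# Hypothetical helper: neighborhood -> zone to assign where MSZoning is missing.
fills = {"IDOTRR": "RM", "Mitchel": "RL"}
for neighb, zone in fills.items():
    # The comparison is parenthesized so it is evaluated before &.
    mask = null_zoning & (all_data["Neighborhood"] == neighb)
    all_data.loc[mask, "MSZoning"] = zone

print(all_data)
```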


3 answers

Answer 1
3 2017-11-02 15:51:04

Answer 2
2 2017-11-02 16:28:01

Answer 3
1 Accepted 2017-11-02 16:21:56
