Replace values in a column based on another column

Question

I'm trying to replace values in one column based on another existing column.

The two columns look like this:

id_30       DeviceInfoShort

Android     SAMSUNG
iOS         iOS
None        Windows
None        None
Mac         MacOS
Windows     Windows
None        None

The id_30 column contains "None" values, which do not show in the picture. For every "None" in id_30, I want to check whether the value in DeviceInfoShort is "Windows": if so, replace the "None" in id_30 with "Windows"; otherwise replace it with "Android".

The code below is what I have. It works, but it took 10 minutes to run. I think I could use map/apply here to make it faster. Is there a more elegant way of doing this with pandas?

%%time
for r in train_all_data.index:
    if train_all_data.loc[r, 'id_30'] == 'None':
        if train_all_data.loc[r, 'DeviceInfoShort'] == 'Windows':
            train_all_data.loc[r, 'id_30'] = 'Windows'
        else:
            train_all_data.loc[r, 'id_30'] = 'Android'

Answer 1

Using pandas / NumPy where:

import numpy as np
# 'None' is a string in the question, so compare with != rather than notna().
df['id_30'] = df['id_30'].where(
    df['id_30'] != 'None',
    np.where(df['DeviceInfoShort'] == 'Windows', 'Windows', 'Android'))
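
To check the where approach against the sample rows from the question, here is a minimal sketch. It assumes the missing entries are the literal string 'None' (as the question's loop compares against) rather than real NaN/None values; the small DataFrame is rebuilt by hand from the table in the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; 'None' is assumed to be a literal string.
df = pd.DataFrame({
    "id_30": ["Android", "iOS", "None", "None", "Mac", "Windows", "None"],
    "DeviceInfoShort": ["SAMSUNG", "iOS", "Windows", "None", "MacOS", "Windows", "None"],
})

# Fallback value for every row, computed in one vectorized pass.
fallback = np.where(df["DeviceInfoShort"] == "Windows", "Windows", "Android")

# Keep id_30 where it is not 'None'; otherwise take the fallback.
df["id_30"] = df["id_30"].where(df["id_30"] != "None", fallback)

print(df["id_30"].tolist())
# ['Android', 'iOS', 'Windows', 'Android', 'Mac', 'Windows', 'Android']
```

If the column held real missing values instead of the string, df["id_30"].notna() would be the condition to pass to where.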

Answer 2

# Build both boolean masks before modifying id_30, so the second
# assignment still sees which rows were originally 'None'.
is_none = train_all_data['id_30'] == 'None'
is_windows = train_all_data['DeviceInfoShort'] == 'Windows'
train_all_data.loc[is_none & is_windows, 'id_30'] = 'Windows'
train_all_data.loc[is_none & ~is_windows, 'id_30'] = 'Android'

Answer 3

Maybe this will be faster:

keep_or_fill = lambda x: x.id_30 if x.id_30 != "None" else ("Windows" if x.DeviceInfoShort == "Windows" else "Android")
df['id_30'] = df.apply(keep_or_fill, axis=1)  # axis=1 passes each row to the lambda

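In my experience, apply() is usually faster than looping over the index with .loc. Note that with axis=1 it still calls the Python function once per row, so it will generally be slower than a fully vectorized approach.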
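
One way to compare row-wise apply() with a vectorized version is to time both on the same data. The sketch below is illustrative only: the synthetic DataFrame, its size, and the helper names fill_with_apply and fill_vectorized are assumptions, not part of the original post, and the timings will vary with machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "id_30": rng.choice(["Android", "iOS", "Mac", "Windows", "None"], size=n),
    "DeviceInfoShort": rng.choice(["SAMSUNG", "iOS", "MacOS", "Windows", "None"], size=n),
})


def fill_with_apply(frame):
    # Row-wise: the lambda runs once per row in Python.
    return frame.apply(
        lambda row: row["id_30"] if row["id_30"] != "None"
        else ("Windows" if row["DeviceInfoShort"] == "Windows" else "Android"),
        axis=1,
    )


def fill_vectorized(frame):
    # Column-wise: the comparisons and the selection run on whole arrays.
    fallback = np.where(frame["DeviceInfoShort"] == "Windows", "Windows", "Android")
    return frame["id_30"].where(frame["id_30"] != "None", fallback)


by_apply = fill_with_apply(df)
by_vector = fill_vectorized(df)
same_result = by_apply.tolist() == by_vector.tolist()

apply_seconds = timeit.timeit(lambda: fill_with_apply(df), number=1)
vector_seconds = timeit.timeit(lambda: fill_vectorized(df), number=1)
print(f"same result: {same_result}")
print(f"apply: {apply_seconds:.4f}s, vectorized: {vector_seconds:.4f}s")
```

Both helpers return a new Series and leave df unchanged, so they can be timed repeatedly on the same input.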

3 answers

Answer 1: 1, accepted, 2019-08-13 20:31:13
Answer 2: 0, 2019-08-13 20:28:11
Answer 3: 0, 2019-08-13 20:29:00
