Replacing a value in a column based on another column
I am trying to replace the values in one column based on another existing column.
The two columns look like this:
id_30     DeviceInfoShort
Android   SAMSUNG
iOS       iOS
None      Windows
None      None
Mac       MacOS
Windows   Windows
None      None
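The "None" values in the id_30 column are not shown in the picture. What I want is this: for every "None" value in id_30, check whether the value in DeviceInfoShort is "Windows". If it is, replace the "None" in id_30 with "Windows"; otherwise replace it with "Android".
The code below is what I have. It works, but it takes 10 minutes to run. I think I could use map / apply here to make it faster... Is there a more elegant way to do this with pandas?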
%%time
for r in train_all_data.index:
    if train_all_data.loc[r, 'id_30'] == 'None':
        if train_all_data.loc[r, 'DeviceInfoShort'] == 'Windows':
            train_all_data.loc[r, 'id_30'] = 'Windows'
        else:
            train_all_data.loc[r, 'id_30'] = 'Android'
Use pandas / NumPy where:
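import numpy as np
# notna() only catches real NaN/None; for the string 'None', use df['id_30'] != 'None' instead
df['id_30'] = df['id_30'].where(
    df['id_30'].notna(),
    np.where(df['DeviceInfoShort'] == 'Windows', 'Windows', 'Android'))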
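For reference, here is a minimal runnable sketch of the same where idea on the sample rows from the question. It assumes the missing entries are the literal string 'None' (as the question's loop suggests), so the condition compares against that string rather than calling notna(). The names df and fallback are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; 'None' is assumed to be a plain string
df = pd.DataFrame({
    "id_30": ["Android", "iOS", "None", "None", "Mac", "Windows", "None"],
    "DeviceInfoShort": ["SAMSUNG", "iOS", "Windows", "None", "MacOS", "Windows", "None"],
})

# Replacement value for every row, computed in one vectorized pass
fallback = np.where(df["DeviceInfoShort"] == "Windows", "Windows", "Android")

# Keep id_30 where it is not 'None'; take the fallback everywhere else
df["id_30"] = df["id_30"].where(df["id_30"] != "None", fallback)

print(df["id_30"].tolist())
```

Rows that already hold a real value (Android, iOS, Mac, Windows) are left untouched; only the 'None' rows pick up the fallback.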
# Build both boolean masks first, before any assignment changes id_30
is_none = train_all_data['id_30'] == 'None'
is_windows = train_all_data['DeviceInfoShort'] == 'Windows'
# .loc needs a boolean mask here, not a filtered DataFrame
train_all_data.loc[is_none & is_windows, 'id_30'] = 'Windows'
train_all_data.loc[is_none & ~is_windows, 'id_30'] = 'Android'
Maybe this will be faster:
df['id_30'] = df.apply(lambda x: x.id_30 if x.id_30 != "None"
                       else ("Windows" if x.DeviceInfoShort == "Windows" else "Android"), axis=1)
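In my experience, apply() is usually faster than looping over the index with .loc, although a row-wise apply still calls a Python function once per row, so it is typically slower than a vectorized approach.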
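One way to check the speed claim on your own machine is a rough timing sketch like the one below. The synthetic data, the row count n, and the helper names fill_with_apply and fill_vectorized are all assumptions made up for illustration; actual timings depend on the data size and the pandas version.

```python
import time

import numpy as np
import pandas as pd

# Synthetic data: the size and the value choices are illustrative assumptions
rng = np.random.default_rng(0)
n = 20_000
df = pd.DataFrame({
    "id_30": rng.choice(["Android", "iOS", "None"], size=n),
    "DeviceInfoShort": rng.choice(["Windows", "SAMSUNG", "None"], size=n),
})


def fill_with_apply(frame):
    # Row-wise apply: one Python function call per row
    return frame.apply(
        lambda row: row["id_30"] if row["id_30"] != "None"
        else ("Windows" if row["DeviceInfoShort"] == "Windows" else "Android"),
        axis=1,
    )


def fill_vectorized(frame):
    # Vectorized: whole-column operations, no per-row Python calls
    fallback = np.where(frame["DeviceInfoShort"] == "Windows", "Windows", "Android")
    return frame["id_30"].where(frame["id_30"] != "None", fallback)


start = time.perf_counter()
by_apply = fill_with_apply(df)
apply_seconds = time.perf_counter() - start

start = time.perf_counter()
by_where = fill_vectorized(df)
where_seconds = time.perf_counter() - start

print(f"apply: {apply_seconds:.4f}s, where: {where_seconds:.4f}s")
print("same result:", by_apply.tolist() == by_where.tolist())
```

Both helpers return a new Series instead of modifying the frame, so they can be timed against the same input and compared directly.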