Replace multiple strings and numbers from multiple columns with NaN in Pandas
I have the following dataframe and want to clean it by replacing several strings and numbers with NaN: namely 68, Tardeo Road and 0 in state, 567 in dept, and #ERROR! and 123 in phonenumber:
   id  state                                dept                          phonenumber
0  1   Abu Dhabi                            {Marketing}                   123
1  2   MO                                   {Other}                       5635888000
2  3   68, Tardeo Road                      {"Human Resources"}           18006708450
3  4   National Capital Territory of Delhi  {"Human Resources"}           #ERROR!
4  5   Aargau Canton                        {Marketing}                   12032722596
5  6   Aargau Canton                        567                           18003928343
6  18  NB                                   {"Finance & Administration"}  NaN
7  19  0                                    {Sales}                       #ERROR!
8  20  Abu Dhabi                            {"Human Resources"}           NaN
9  21  Aargau                               {"Finance & Administration"}  NaN
I tried the following code:
Solution 1:
mask = (df.state == '0') | (df.state == '68, Tardeo Road')
df.loc[mask, ['state']] = np.nan
Solution 2:
df.loc[(df.state == '68, Tardeo Road') | (df.state == 0), 'state'] = np.nan
Solution 3:
df.loc[df.state == '0', 'state'] = np.nan
df.loc[df.state == '68, Tardeo Road', 'state'] = np.nan
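They all work, but they get rather long once I apply them to multiple columns.
Is there a way to make this more concise and efficient, for example by using str.replace? Thanks.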
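For reference, the mask-based attempts above can be generalized to several columns with a loop. This is only a sketch: the small frame is a stand-in for the one in the question, it assumes the unwanted values are stored as strings, and the name `bad_values` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the frame in the question (assumption: values are strings).
df = pd.DataFrame({
    "state": ["Abu Dhabi", "68, Tardeo Road", "0", "NB"],
    "dept": ["{Marketing}", "567", "{Sales}", "{Other}"],
    "phonenumber": ["123", "5635888000", "#ERROR!", np.nan],
})

# Hypothetical mapping: column name -> values to treat as missing.
bad_values = {
    "state": ["68, Tardeo Road", "0"],
    "dept": ["567"],
    "phonenumber": ["#ERROR!", "123"],
}

for col, values in bad_values.items():
    # Same idea as df.loc[mask, col] = np.nan, with isin() building the mask.
    # Series.mask() puts NaN wherever the condition is True.
    df[col] = df[col].mask(df[col].isin(values))

print(df)
```

Each column is handled in one line, so adding another column only means adding an entry to the mapping.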
You can do it with a single replace:
df = df.replace({'state': ['68, Tardeo Road', '0'],
                 'dept': ['567'],
                 'phonenumber': ['#ERROR!', '123']}, np.nan)
Output:
      id  state                                dept                            phonenumber
--  ----  -----------------------------------  ----------------------------  -------------
 0     1  Abu Dhabi                            {Marketing}                             nan
 1     2  MO                                   {Other}                          5635888000
 2     3  nan                                  {"Human Resources"}             18006708450
 3     4  National Capital Territory of Delhi  {"Human Resources"}                     nan
 4     5  Aargau Canton                        {Marketing}                     12032722596
 5     6  Aargau Canton                        nan                             18003928343
 6    18  NB                                   {"Finance & Administration"}            nan
 7    19  nan                                  {Sales}                                 nan
 8    20  Abu Dhabi                            {"Human Resources"}                     nan
 9    21  Aargau                               {"Finance & Administration"}            nan
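To try the replace call end to end, here is a self-contained sketch that rebuilds the frame from the question. It assumes every value in state, dept and phonenumber is stored as a string (the question does not show the dtypes), and the names `to_replace` and `cleaned` are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; string storage of the values is an assumption.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 18, 19, 20, 21],
    "state": ["Abu Dhabi", "MO", "68, Tardeo Road",
              "National Capital Territory of Delhi", "Aargau Canton",
              "Aargau Canton", "NB", "0", "Abu Dhabi", "Aargau"],
    "dept": ["{Marketing}", "{Other}", '{"Human Resources"}',
             '{"Human Resources"}', "{Marketing}", "567",
             '{"Finance & Administration"}', "{Sales}",
             '{"Human Resources"}', '{"Finance & Administration"}'],
    "phonenumber": ["123", "5635888000", "18006708450", "#ERROR!",
                    "12032722596", "18003928343", np.nan, "#ERROR!",
                    np.nan, np.nan],
})

# Column name -> list of exact cell values to turn into NaN.
to_replace = {
    "state": ["68, Tardeo Road", "0"],
    "dept": ["567"],
    "phonenumber": ["#ERROR!", "123"],
}

# Each list is applied only to its own column; other columns are untouched.
cleaned = df.replace(to_replace, np.nan)
print(cleaned)
```

Note that DataFrame.replace matches whole cell values here, whereas Series.str.replace substitutes substrings inside each string, so it is not a natural fit for this task. If some of the unwanted values are actually numeric (for example an integer 0 rather than the string '0'), the numeric form would have to be listed as well.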