Applying a function to columns in a dataframe whose column headings contain a specific string
I have a dataframe named passenger_details, shown below:
Passenger   Age  Gender  Commute_to_work   Commute_mode  Commute_time  ...
Passenger1  32   Male    I drive to work   car           1 hour
Passenger2  26   Female  I take the metro  train         NaN           ...
Passenger3  33   Female  NaN               NaN           30 mins       ...
Passenger4  29   Female  I take the metro  train         NaN           ...
...
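I want to apply an if-style function to the columns whose headings contain the string "Commute", turning missing values (NaN) into 0 and present values into 1.
This is basically what I want to achieve: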
Passenger   Age  Gender  Commute_to_work  Commute_mode  Commute_time  ...
Passenger1  32   Male    1                1             1
Passenger2  26   Female  1                1             0             ...
Passenger3  33   Female  0                0             1             ...
Passenger4  29   Female  1                1             0             ...
...
However, I am struggling with how to write the code. This is what I have tried:
passenger_details = passenger_details.filter(regex = 'Location_', axis = 1).apply(lambda value: str(value).replace('value', '1', 'NaN','0'))
But I get a TypeError:
'replace() takes at most 3 arguments (4 given)'
Any help would be much appreciated.
Select the columns with Index.str.contains, test for non-missing values with DataFrame.notna, and finally cast the resulting True/False values to the integers 1/0:
c = df.columns.str.contains('Commute')
df.loc[:, c] = df.loc[:, c].notna().astype(int)
print (df)
    Passenger  Age  Gender  Commute_to_work  Commute_mode  Commute_time
0  Passenger1   32    Male                1             1             1
1  Passenger2   26  Female                1             1             0
2  Passenger3   33  Female                0             0             1
3  Passenger4   29  Female                1             1             0
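For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the table in the question (only the columns shown there; the elided "..." columns are left out), and the name commute_cols is just an illustrative choice. It assigns by column labels rather than through .loc with a boolean mask, which is a small variation on the snippet above, not a different technique. As a side note on the original error: Python's str.replace accepts only old, new and an optional count, so passing four arguments raises the TypeError quoted in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (assumption: only these columns).
passenger_details = pd.DataFrame({
    "Passenger": ["Passenger1", "Passenger2", "Passenger3", "Passenger4"],
    "Age": [32, 26, 33, 29],
    "Gender": ["Male", "Female", "Female", "Female"],
    "Commute_to_work": ["I drive to work", "I take the metro", np.nan, "I take the metro"],
    "Commute_mode": ["car", "train", np.nan, "train"],
    "Commute_time": ["1 hour", np.nan, "30 mins", np.nan],
})

# Boolean mask over the column labels, then the matching labels as a plain list.
mask = passenger_details.columns.str.contains("Commute")
commute_cols = passenger_details.columns[mask].tolist()

# notna() is True where a value is present; astype(int) maps True/False to 1/0.
passenger_details[commute_cols] = passenger_details[commute_cols].notna().astype(int)

print(passenger_details)
```

Columns whose names do not contain "Commute" (Passenger, Age, Gender) are left untouched.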