Applying a function to columns in a dataframe whose column headings contain a specific string
I have a dataframe named passenger_details, shown below:
Passenger   Age  Gender  Commute_to_work   Commute_mode  Commute_time  ...
Passenger1  32   Male    I drive to work   car           1 hour
Passenger2  26   Female  I take the metro  train         NaN           ...
Passenger3  33   Female  NaN               NaN           30 mins       ...
Passenger4  29   Female  I take the metro  train         NaN           ...
...
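I want to apply an if-style function to the columns whose headings contain the string "Commute", turning missing values (NaN) into 0 and present values into 1.
This is essentially what I want to achieve: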
Passenger Age Gender Commute_to_work Commute_mode Commute_time ...
Passenger1 32 Male 1 1 1
Passenger2 26 Female 1 1 0 ...
Passenger3 33 Female 0 0 1 ...
Passenger4 29 Female 1 1 0 ...
...
However, I am struggling with how to write the code. This is what I tried:
passenger_details = passenger_details.filter(regex = 'Location_', axis = 1).apply(lambda value: str(value).replace('value', '1', 'NaN','0'))
But I get a TypeError:
'replace() takes at most 3 arguments (4 given)'
Any help would be appreciated.
Select the columns with Index.str.contains, test for non-missing values with DataFrame.notna, and finally cast to integers to map True/False to 1/0:
c = df.columns.str.contains('Commute')
df.loc[:, c] = df.loc[:, c].notna().astype(int)
print (df)
Passenger Age Gender Commute_to_work Commute_mode Commute_time
0 Passenger1 32 Male 1 1 1
1 Passenger2 26 Female 1 1 0
2 Passenger3 33 Female 0 0 1
3 Passenger4 29 Female 1 1 0
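For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data consists of only the columns visible in the question (the elided "..." columns are left out), and the name commute_cols is introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; only the visible columns are included.
passenger_details = pd.DataFrame({
    "Passenger": ["Passenger1", "Passenger2", "Passenger3", "Passenger4"],
    "Age": [32, 26, 33, 29],
    "Gender": ["Male", "Female", "Female", "Female"],
    "Commute_to_work": ["I drive to work", "I take the metro", np.nan, "I take the metro"],
    "Commute_mode": ["car", "train", np.nan, "train"],
    "Commute_time": ["1 hour", np.nan, "30 mins", np.nan],
})

# Labels of the columns whose heading contains "Commute" (illustrative name).
commute_cols = passenger_details.columns[passenger_details.columns.str.contains("Commute")]

# notna() is True for present values and False for NaN; astype(int) maps that to 1/0.
passenger_details[commute_cols] = passenger_details[commute_cols].notna().astype(int)

print(passenger_details)
```

As for the original error: the lambda calls Python's built-in str.replace, which accepts at most three arguments (old, new, and an optional count), so passing two old/new pairs in a single call raises the TypeError. Testing for missing values with notna avoids string replacement altogether.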