Applying a function to columns in a dataframe whose column headings contain a specific string
I have a dataframe called passenger_details, which is shown below:
Passenger Age Gender Commute_to_work Commute_mode Commute_time ...
Passenger1 32 Male I drive to work car 1 hour
Passenger2 26 Female I take the metro train NaN ...
Passenger3 33 Female NaN NaN 30 mins ...
Passenger4 29 Female I take the metro train NaN ...
...
I want to apply an if function that turns missing values (NaN) into 0 and present values into 1, for the columns whose headings contain the string 'Commute'.
This is basically what I'm trying to achieve:
Passenger Age Gender Commute_to_work Commute_mode Commute_time ...
Passenger1 32 Male 1 1 1
Passenger2 26 Female 1 1 0 ...
Passenger3 33 Female 0 0 1 ...
Passenger4 29 Female 1 1 0 ...
...
However, I'm struggling with how to phrase my code. This is what I have done:
passenger_details = passenger_details.filter(regex = 'Location_', axis = 1).apply(lambda value: str(value).replace('value', '1', 'NaN','0'))
But I get a TypeError:
'replace() takes at most 3 arguments (4 given)'
Any help would be appreciated.
Select the columns with Index.str.contains, test for non-missing values with DataFrame.notna, and finally cast to integer to map True/False to 1/0:
c = df.columns.str.contains('Commute')
df.loc[:, c] = df.loc[:, c].notna().astype(int)
print (df)
Passenger Age Gender Commute_to_work Commute_mode Commute_time
0 Passenger1 32 Male 1 1 1
1 Passenger2 26 Female 1 1 0
2 Passenger3 33 Female 0 0 1
3 Passenger4 29 Female 1 1 0
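For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample values are only an illustrative reconstruction of the table in the question, and the name commute_cols is a hypothetical helper, not something from the original post. As a side note on the original attempt: str.replace accepts only old, new and an optional count, which is why passing four arguments raises the TypeError, and DataFrame.apply hands each whole column to the lambda, so str(value) would stringify the entire column rather than a single cell.

```python
import numpy as np
import pandas as pd

# Illustrative sample data reconstructed from the table in the question.
passenger_details = pd.DataFrame({
    "Passenger": ["Passenger1", "Passenger2", "Passenger3", "Passenger4"],
    "Age": [32, 26, 33, 29],
    "Gender": ["Male", "Female", "Female", "Female"],
    "Commute_to_work": ["I drive to work", "I take the metro", np.nan, "I take the metro"],
    "Commute_mode": ["car", "train", np.nan, "train"],
    "Commute_time": ["1 hour", np.nan, "30 mins", np.nan],
})

# Hypothetical helper name: the headings that contain the string 'Commute'.
commute_cols = passenger_details.columns[
    passenger_details.columns.str.contains("Commute")
].tolist()

# notna() yields True for present values and False for NaN;
# astype(int) maps True/False to 1/0.
passenger_details[commute_cols] = passenger_details[commute_cols].notna().astype(int)

print(passenger_details)
```

The other columns (Passenger, Age, Gender) are left untouched because only the selected column names appear on the left-hand side of the assignment.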