
Using lambda conditional and pandas str.contains to lump strings

Trying to learn some stuff, I'm messing around with the Global Shark Attack database on Kaggle, and I'm trying to find the best way to lump strings using a lambda function and str.contains.

Basically, anywhere a string in the data['Activity'] column contains the phrase skin diving, e.g. 'skin diving for abalone', I want to replace the activity with skin diving. (There are 92 variations of skin diving, hence trying to use a lambda function.)

I can return a boolean Series using

data['Activity'].str.contains('skin diving')

But I'm unsure how to change the value if this condition is true.

My lambda attempt is data.apply(lambda x: 'free diving' if x.str.contains('free diving)) but I'm getting a syntax error, and I'm not familiar enough with lambda functions and pandas to get it right. Any help would be appreciated.

You can use the in operator inside the lambda to test for the substring, instead of using a Series.str method:

data['activity'] = data['activity'].apply(lambda x: 'skin diving' if 'skin diving' in x else x)
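One thing to watch with the one-liner above: if the column holds missing values, `'skin diving' in x` raises a TypeError on a non-string entry. The following is a minimal sketch of a guarded variant, not the answerer's exact code. The sample rows and the helper name `lump` are made up for illustration, and the real Kaggle column may look different.

```python
import pandas as pd

# Hypothetical sample rows; the actual Kaggle data is not reproduced here.
data = pd.DataFrame(
    {"activity": ["skin diving for abalone", "surfing", None, "skin diving"]}
)

def lump(value, phrase="skin diving"):
    """Return `phrase` when it occurs in `value`; otherwise return `value` unchanged."""
    # Non-string entries (e.g. missing values) are passed through untouched,
    # because the `in` test only makes sense for strings.
    if isinstance(value, str) and phrase in value:
        return phrase
    return value

data["activity"] = data["activity"].apply(lump)
print(data)
```

A conditional expression in a lambda always needs an `else` branch, which is why the attempt in the question fails to parse; the named function here is just the same logic written out with the extra type check.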

You could use the str.contains method with np.where:

In [141]: df
Out[141]:
         activity
0  free diving ok
1              ok

In [142]: df.activity = np.where(df.activity.str.contains('free diving'),
                                 'free diving', df.activity)

In [143]: df
Out[143]:
      activity
0  free diving
1           ok
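For a script rather than an interactive session, the same idea might look like the sketch below. It assumes the column can contain missing values and mixed letter case, so it passes `na=False` and `case=False` to `str.contains`; both are assumptions about the data, not something stated in the question, and the sample rows are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows, including a missing value and mixed case.
df = pd.DataFrame(
    {"activity": ["free diving ok", "ok", np.nan, "Free diving for fish"]}
)

# na=False counts missing values as non-matches, so the mask is purely boolean;
# case=False makes the substring test case-insensitive.
mask = df["activity"].str.contains("free diving", case=False, na=False)

# Where the mask is True use the lumped label, otherwise keep the original value.
df["activity"] = np.where(mask, "free diving", df["activity"])
print(df)
```

An alternative with the same effect is `df.loc[mask, "activity"] = "free diving"`, which only touches the matching rows.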
