How to use the .apply function on a pandas DataFrame that has been filtered by regex?
I have a pandas DataFrame with data scraped from a couple of Wiki tables. The DataFrame has a column for names, and some of these names are followed by "\\r\\n(head coach)". I would like to remove that suffix, so I tried this:
df['name'][df.name.str.contains(r'coach')] =\
df['name'][df.name.str.contains(r'coach')].apply(lambda x: x[0:-14])
When this runs, I get a SettingWithCopyWarning. I tried using .loc as suggested in this SO Q&A:
mask = df.loc[:,'name'] == df['name'].str.contains(r'coach')
But every value returns as False, and so I get an empty Series when I use this with my DataFrame.
I'm not sure where I am going wrong with this. Any pointers?
You can try this:
mask = df.name.str.contains(r'coach')
df.loc[mask, 'name'] = df.loc[mask, 'name'].str[:-14]
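For reference, here is a minimal, self-contained sketch of the same approach. The sample names are made up for illustration and are not from the original data; the sketch also assumes the suffix is a real carriage return and newline followed by "(head coach)", which is 14 characters long. The mask in the question most likely came back all False because it compared the name strings themselves against the booleans returned by str.contains, and a string never equals True or False. The boolean Series from str.contains is already the mask.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame comes from scraped Wiki tables.
df = pd.DataFrame(
    {"name": ["Alice Smith\r\n(head coach)", "Bob Jones", "Carol Lee\r\n(head coach)"]}
)

# str.contains already returns a boolean Series, so use it directly as the mask.
mask = df["name"].str.contains(r"coach")

# Read and write through a single .loc call each, instead of chained indexing
# (df['name'][...] = ...), which is what triggers SettingWithCopyWarning.
df.loc[mask, "name"] = df.loc[mask, "name"].str[:-14]

print(df)
```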
Or, as @piRSquared commented, this simpler line should also work:
df.loc[mask, 'name'] = df.name.str[:-14]
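The shorter form works because pandas aligns the right-hand Series on the index when assigning through .loc, so only the rows selected by mask are overwritten. If the hard-coded 14 feels fragile, one possible alternative (not part of the original answer) is a regex replacement on the whole column. The sketch below uses made-up sample names and assumes the suffix is whitespace followed by "(head coach)" at the end of the string.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"name": ["Alice Smith\r\n(head coach)", "Bob Jones"]})

# Remove a trailing "(head coach)" together with any whitespace before it.
# Rows without the suffix are left unchanged, so no mask is needed.
df["name"] = df["name"].str.replace(r"\s*\(head coach\)$", "", regex=True)

print(df)
```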