Filtering special characters in a dataframe

Question

I have the following dataframe called data:

    metrics    artists

0    0.21    ['ZhanÃ©']
2    0.14    ['Mose Allison']
3    0.87    ['水柳仙']
4    0.25    ['Shel Silverstein']

Some records in the "artists" column contain special characters. I want to build another df from the records that have special characters, giving the following output:

data:

     metrics    artists

0    0.14    ['Mose Allison']
1    0.25    ['Shel Silverstein']

data2:

     metrics    artists

0    0.21    ['ZhanÃ©']
1    0.87    ['水柳仙']

I used:

 data2=data.artists[data.artists.str.contains("[^a-zA-Z0-9]")]

but I get the original df.

I also tried with:

data2 = []
for x in data['artists']:
    if x is not "[^a-zA-Z0-9 ]":
         data2[x]=data[x]
    print(data2)

but it gives me the error:

KeyError: "['ZhanÃ©']"

and with:

if x is "[^ a-zA-Z0-9]"

it returns empty records.

Answer 1

I used:

 data2=data.artists[data.artists.str.contains("[^a-zA-Z0-9]")]

but I get the original df.

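You're missing a space in "[^a-zA-Z0-9]", which is why you're getting the original df. A space is itself a non-alphanumeric character, so any name that contains one, such as 'Mose Allison', matches the pattern and is treated as having special characters. Adding a space to the character class, "[^a-zA-Z0-9 ]", stops those rows from matching. Tested with Python3 in a Jupyter notebook.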
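A minimal sketch of the split, under one assumption that the question does not confirm: the "artists" values are stored as plain strings such as "['Mose Allison']", brackets and quotes included. If that is the case, the characters [ ] ' also have to be allowed alongside the space, otherwise every row still matches. The variable names pattern and has_special are illustrative only.

```python
import pandas as pd

# Sample data mirroring the question. Assumption: each artists value is a
# plain string that includes the list punctuation, e.g. "['Mose Allison']".
data = pd.DataFrame(
    {
        "metrics": [0.21, 0.14, 0.87, 0.25],
        "artists": [
            "['ZhanÃ©']",
            "['Mose Allison']",
            "['水柳仙']",
            "['Shel Silverstein']",
        ],
    }
)

# Match any character that is NOT an ASCII letter, a digit, a space, or
# (by the assumption above) one of the characters [ ] '
pattern = r"[^a-zA-Z0-9 \[\]']"

# Boolean mask: True where the artists string contains a special character
has_special = data["artists"].str.contains(pattern, regex=True)

# Rows with special characters go to data2; the rest stay in data
data2 = data[has_special].reset_index(drop=True)
data = data[~has_special].reset_index(drop=True)

print(data)
print(data2)
```

Indexing the whole dataframe with the mask (data[has_special]) keeps the metrics column, whereas data.artists[...] returns only the artists Series. The loop attempts in the question fail for a separate reason: `is` and `is not` compare object identity, so they never test the string against the regular expression.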

Answer 1 (accepted, 2021-05-25 03:19:20)
