
Regular expression to remove non-alphanumeric characters is not working

I converted a column of a Pandas DataFrame to a list and then lowercased every element in the list. Now I want to keep only alphabetic characters in each element. I wrote a regular expression for that, but it is not working.

import re
import pandas as pd

df_smer_orig = pd.read_csv('sample.csv', engine='python')
df_smer = df_smer_orig['Item'].tolist()
df_smer = [x.lower() for x in df_smer]

for x in df_smer:
    print(x)
    regex = re.compile('[^a-zA-Z]')
    regex.sub('', x)
    print(x)

print(df_smer)

Partial output of the code, showing that the regex had no effect:

agarbathi / incense sticks
agarbathi / incense sticks
worcestershire sauce- 295ml
worcestershire sauce- 295ml

Is this what you are looking for?

text = re.sub(r'[^a-zA-Z]', '', text)

demo: http://tpcg.io/ZADE7f
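
As a quick check of what that pattern does, here is a small sketch run against the two sample strings from the question's output. The helper name keep_letters and its keep_spaces flag are made up for illustration. Note that the pattern as written also strips spaces, so words run together; adding a space to the character class is one way to keep them apart.

```python
import re

def keep_letters(text, keep_spaces=False):
    # Hypothetical helper, not from the original answer.
    # [^a-zA-Z] matches every character that is NOT an ASCII letter.
    pattern = r'[^a-zA-Z ]' if keep_spaces else r'[^a-zA-Z]'
    # re.sub returns a new string; the input is left unchanged.
    return re.sub(pattern, '', text)

samples = ["agarbathi / incense sticks", "worcestershire sauce- 295ml"]
for s in samples:
    print(keep_letters(s))                    # letters only, spaces removed
    print(keep_letters(s, keep_spaces=True))  # letters and spaces kept
```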

Your regex is correct, but regex.sub returns a new string instead of modifying x in place, so you have to assign the result back to the variable to get the desired output.

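import re
import pandas as pd

df_smer_orig = pd.read_csv('sample.csv', engine='python')
df_smer = df_smer_orig['Item'].tolist()
df_smer = [x.lower() for x in df_smer]

# Compile the pattern once, outside the loop
regex = re.compile('[^a-zA-Z]')

for i, x in enumerate(df_smer):
    print(x)
    x = regex.sub('', x)  # sub() returns a new string, so assign it back
    df_smer[i] = x        # store it in the list so the final print reflects it
    print(x)

print(df_smer)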
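
The assignment matters because Python strings are immutable: regex.sub never changes its argument, it only returns a new string. Rebinding the loop variable x also leaves the list itself untouched, so building a new list is a common way to keep the cleaned values. A minimal sketch, assuming a small hard-coded list in place of the CSV column (the sample values and the names NON_LETTERS, items and cleaned are illustrative, not from the question):

```python
import re

# Compiled once and reused for every element.
NON_LETTERS = re.compile(r'[^a-zA-Z]')

# Assumed stand-in for df_smer_orig['Item'].tolist()
items = ["Agarbathi / Incense Sticks", "Worcestershire Sauce- 295ml"]

# Calling sub() without using its return value changes nothing.
original = items[0].lower()
NON_LETTERS.sub('', original)
print(original)  # still contains the spaces and the slash

# Lowercase and strip non-letters in one pass, producing a new list.
cleaned = [NON_LETTERS.sub('', x.lower()) for x in items]
print(cleaned)
```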
