Regular expression to remove non-alphanumeric characters is not working
I converted a column of a pandas DataFrame to a list, then lowercased every element in the list. Now I want to keep only alphabetic characters in each element. I wrote a regular expression for that, but it is not working.
import re
import pandas as pd

df_smer_orig = pd.read_csv('sample.csv', engine='python')
df_smer = df_smer_orig['Item'].tolist()
df_smer = [x.lower() for x in df_smer]
for x in df_smer:
    print(x)
    regex = re.compile('[^a-zA-Z]')
    regex.sub('', x)
    print(x)
print(df_smer)
Partial output of the code, showing that the regex had no effect:
agarbathi / incense sticks
agarbathi / incense sticks
worcestershire sauce- 295ml
worcestershire sauce- 295ml
Is this what you are after?
text = re.sub(r'[^a-zA-Z]', '', text)
Demo: http://tpcg.io/ZADE7f
Your pattern is correct, but regex.sub() returns a new string rather than modifying x in place, so you have to assign the result back to the variable to get the desired output.
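import re
import pandas as pd

df_smer_orig = pd.read_csv('sample.csv', engine='python')
df_smer = df_smer_orig['Item'].tolist()
df_smer = [x.lower() for x in df_smer]
regex = re.compile('[^a-zA-Z]')  # compile once, outside the loop
for x in df_smer:
    print(x)
    x = regex.sub('', x)  # sub() returns a new string; assign it back
    print(x)
# Note: rebinding x does not change df_smer, so this still prints the original strings.
print(df_smer)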
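If the goal is for the list itself to hold the cleaned strings, one option is to build a new list rather than rebinding the loop variable. The following is a minimal sketch, not the original poster's code: the names items, non_letters and cleaned are illustrative, and the two sample strings are taken from the output shown in the question in place of the CSV column.

```python
import re

# Sample values copied from the output in the question (stand-in for df_smer).
items = ["agarbathi / incense sticks", "worcestershire sauce- 295ml"]

# Matches every character that is not an ASCII letter.
non_letters = re.compile(r"[^a-zA-Z]")

# Strings are immutable, so sub() returns a new string each time;
# collect those results in a new list instead of discarding them.
cleaned = [non_letters.sub("", item.lower()) for item in items]

print(cleaned)  # ['agarbathiincensesticks', 'worcestershiresauceml']
```

Note that this pattern also strips spaces and digits. If spaces should survive, a pattern such as [^a-zA-Z ] would keep them, and [^a-zA-Z0-9] would keep digits, depending on which characters are actually wanted.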