
Remove special characters in Scrapy (Python)

I am trying to remove the special characters from the following text:

sample_sample_sample_2.18.14

I tried the following pattern to remove those special characters:

item['xxxx'] = item['aaaa'].replace('_' '' ,'-' '')

With this I am able to remove only the _ characters.

I want to remove all characters such as: . , _ , - , ( , )

From what I understand, you want to remove non-alphanumeric characters from the string. In this case, it makes more sense to list the characters you want to keep instead of trying to specify every "special" character that you want to remove.

You can use re.sub():

>>> import re
>>> s = "sample_sample_sample_2.18.14"
>>> re.sub(r'[^a-zA-Z0-9]', '', s)
'samplesamplesample21814'
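
As a rough sketch of how this could be applied to the item fields from the question: the helper name strip_non_alnum is made up for illustration, and a plain dict stands in for the Scrapy item, with 'aaaa' and 'xxxx' being the placeholder field names from the question.

```python
import re

# Precompiled pattern: anything that is NOT an ASCII letter or digit.
NON_ALNUM = re.compile(r'[^a-zA-Z0-9]')

def strip_non_alnum(value):
    """Return value with every non-alphanumeric character removed."""
    return NON_ALNUM.sub('', value)

# A plain dict stands in for a Scrapy item here (assumption for illustration);
# 'aaaa' and 'xxxx' are the placeholder field names from the question.
item = {'aaaa': 'sample_sample_sample_2.18.14'}
item['xxxx'] = strip_non_alnum(item['aaaa'])
print(item['xxxx'])  # samplesamplesample21814
```

Note that this pattern also strips spaces and any non-ASCII letters, which may or may not be what you want.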

Here is a solution for removing specific characters:

>>> text = 'sample_sample_sample_2.18.14'
>>> ''.join(c for c in text if c not in '._-()')
'samplesamplesample21814'

Another solution is to keep only certain characters, but it depends on what you want to do.
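
A minimal sketch of the "keep certain characters" approach, assuming you only want ASCII letters and digits. The names ALLOWED and keep_allowed are illustrative, not part of any library.

```python
import string

# Characters to keep: ASCII letters and digits. Extend this set as needed.
ALLOWED = set(string.ascii_letters + string.digits)

def keep_allowed(text, allowed=ALLOWED):
    """Return text with every character outside `allowed` dropped."""
    return ''.join(c for c in text if c in allowed)

text = 'sample_sample_sample_2.18.14'
print(keep_allowed(text))                   # samplesamplesample21814
# Keep the dots as well, e.g. to preserve the version number.
print(keep_allowed(text, ALLOWED | {'.'}))  # samplesamplesample2.18.14
```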


A faster equivalent:

Python 2 (works for str only; unicode.translate() does not accept a second argument):

>>> text.translate(None, '._-()')
'samplesamplesample21814'

Python 3:

>>> text.translate(str.maketrans('', '', '._-()'))
'samplesamplesample21814'
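
If many values need cleaning, the Python 3 translation table can be built once and reused. This is only a sketch: DELETE_TABLE and clean_all are hypothetical names, and the second input string is made up for the example.

```python
# Build the deletion table once; the third argument of str.maketrans
# lists the characters that translate() should delete.
DELETE_TABLE = str.maketrans('', '', '._-()')

def clean_all(values):
    """Strip the listed characters from every string in `values`."""
    return [v.translate(DELETE_TABLE) for v in values]

print(clean_all(['sample_sample_sample_2.18.14', 'a-(b)_c.d']))
# ['samplesamplesample21814', 'abcd']
```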
