
Remove special characters in Scrapy (Python)

I am trying to remove the special characters from the following text:

sample_sample_sample_2.18.14

I tried the following pattern to remove those special characters:

item['xxxx'] = item['aaaa'].replace('_' '' ,'-' '')

With this I am only able to remove the _ characters.

I want to remove all characters like: . _ - ( )

From what I understand, you want to remove non-alphanumeric chars from the string. In this case, it makes more sense to list the characters you want to leave instead of trying to specify every "special" character that you want to remove.

You can use re.sub():

>>> import re
>>> s = "sample_sample_sample_2.18.14"
>>> re.sub(r'[^a-zA-Z0-9]', '', s)
'samplesamplesample21814'
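
To connect this back to the item assignment in the question, here is a minimal sketch. It assumes the field values are plain strings; a plain dict stands in for the Scrapy item, and the helper name strip_special is made up for illustration.

```python
import re

# Precompile the pattern: anything that is not an ASCII letter or digit.
NON_ALNUM = re.compile(r'[^a-zA-Z0-9]')


def strip_special(value):
    """Return value with every non-alphanumeric character removed."""
    return NON_ALNUM.sub('', value)


# A plain dict stands in for the Scrapy item (assumption for this sketch).
item = {'aaaa': 'sample_sample_sample_2.18.14'}
item['xxxx'] = strip_special(item['aaaa'])
print(item['xxxx'])  # samplesamplesample21814
```

Precompiling the pattern only matters if the same substitution runs on many items; for a one-off call, re.sub() as shown above is enough.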

Here is a solution that removes only specific characters:

>>> text = 'sample_sample_sample_2.18.14'
>>> ''.join(c for c in text if c not in '._-()')
'samplesamplesample21814'

Another option is to list the characters you want to keep instead; which approach fits better depends on what you want to do.
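
A rough sketch of the "keep certain characters" variant, in case it helps. The whitelist of ASCII letters, digits and the dot is only an example choice, not something the question asks for.

```python
import string

text = 'sample_sample_sample_2.18.14'

# Keep letters and digits only. Note that str.isalnum() also accepts
# non-ASCII letters and digits, unlike the [a-zA-Z0-9] character class.
kept = ''.join(c for c in text if c.isalnum())
print(kept)  # samplesamplesample21814

# Keep an explicit whitelist: ASCII alphanumerics plus the dot (example choice).
allowed = set(string.ascii_letters + string.digits + '.')
kept_with_dots = ''.join(c for c in text if c in allowed)
print(kept_with_dots)  # samplesamplesample2.18.14
```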


A faster equivalent uses str.translate():

Python 2:

>>> text.translate(None, '._-()')
'samplesamplesample21814'

Python 3:

>>> text.translate(str.maketrans('', '', '._-()'))
'samplesamplesample21814'
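
If the cleanup should run on every scraped item, one possibility is to build the translation table once and reuse it. The sketch below is Python 3 only and makes several assumptions: the class name is hypothetical, the field names 'aaaa' and 'xxxx' are taken from the question, and a plain dict stands in for the item. The process_item(self, item, spider) method follows the shape of a Scrapy item pipeline, but the code does not import Scrapy.

```python
# Build the deletion table once; the third argument of str.maketrans()
# lists the characters that translate() should drop.
DELETE_TABLE = str.maketrans('', '', '._-()')


class StripSpecialCharsPipeline:
    """Hypothetical pipeline that cleans one field on each item."""

    def process_item(self, item, spider):
        # Field names are the placeholders used in the question.
        item['xxxx'] = item['aaaa'].translate(DELETE_TABLE)
        return item


# Standalone usage with a dict in place of a real Scrapy item.
pipeline = StripSpecialCharsPipeline()
item = pipeline.process_item({'aaaa': 'sample_(sample)-sample_2.18.14'}, spider=None)
print(item['xxxx'])  # samplesamplesample21814
```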
