
How to remove escape characters from a string in Python?

I have a string that looks like this: text = u'\xd7\nRecord has been added successfully, record id: 92'. I want to remove the escape characters \xd7 and \n from it so that I can use the string for another purpose.

I tried str(text), but it cannot handle the character \xd7 and raises:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xd7' in position 0: ordinal not in range(128)

Is there a way to remove escape characters like these from a string? Thanks.

You can try the following, using replace:

text = u'\xd7\nRecord has been added successfully, record id: 92'
# Unicode literals, so replace() also works on a unicode string in Python 2
bad_chars = [u'\xd7', u'\n', u'\x99m', u'\xf0']
for ch in bad_chars:
    text = text.replace(ch, u'')
print(text)
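The loop above can be wrapped in a small reusable helper. This is only a minimal sketch, assuming Python 3; the name remove_chars is hypothetical and not part of the original answer.

```python
def remove_chars(text, bad_chars):
    """Return text with every string in bad_chars removed."""
    for bad in bad_chars:
        text = text.replace(bad, '')
    return text

text = '\xd7\nRecord has been added successfully, record id: 92'
cleaned = remove_chars(text, ['\xd7', '\n'])
print(cleaned)
```

Note that this removes the listed characters everywhere in the string, not only at the start.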

You could do it by 'slicing' the string:

string = u'\xd7\nRecord has been added successfully, record id: 92'
# Skip the first two characters (\xd7 and \n); only valid if they are always there
text = string[2:]
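Slicing assumes the unwanted prefix is always exactly two characters long. If its length can vary, one alternative (not what the answer above uses) is str.lstrip, which removes any leading run of the given characters. A sketch, assuming Python 3:

```python
text = '\xd7\nRecord has been added successfully, record id: 92'
# lstrip treats its argument as a set of characters and removes
# any leading run of them, however long that run is.
cleaned = text.lstrip('\xd7\n')
print(cleaned)
```

Characters in the middle or at the end of the string are left alone.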

It seems you have a unicode string, as in Python 2.x:

inp_str = u'\xd7\nRecord has been added successfully, record id: 92'

If you want to remove the escape characters, which here mostly means special (non-ASCII) characters, this is one way to keep only the ASCII characters without using a regex or hardcoding anything.

inp_str = u'\xd7\nRecord has been added successfully, record id: 92'
# Drop non-ASCII characters, decode back to text, then strip the newline
print(inp_str.encode('ascii', errors='ignore').decode('ascii').strip(u'\n'))

Result: Record has been added successfully, record id: 92

I encode first because the string is already unicode; with errors='ignore', any character outside the ASCII range is dropped during encoding. Then you just strip the '\n'.

Hope this helps you :)
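One thing to keep in mind with this approach is that errors='ignore' silently drops every non-ASCII character, including ones you may want to keep. The sketch below assumes Python 3; the helper name to_ascii is hypothetical, and it uses strip() with no argument, which trims all surrounding whitespace rather than only '\n'.

```python
def to_ascii(text):
    """Drop every non-ASCII character, then trim surrounding whitespace."""
    return text.encode('ascii', errors='ignore').decode('ascii').strip()

record = to_ascii('\xd7\nRecord has been added successfully, record id: 92')
print(record)

# Accented letters are dropped as well, not transliterated.
cafe = to_ascii('caf\xe9')
print(cafe)
```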

I believe a regex can help:

import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
res = re.sub('[^A-Za-z0-9]+', ' ', text).strip()

Result:

'Record has been added successfully record id 92'

You could use the built-in regex library.

import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
result = re.sub('[^A-Za-z0-9]+', ' ', text)

print(result)

That prints Record has been added successfully record id 92, with a leading space where \xd7\n used to be (add .strip() to drop it).

This seems to pass your test case if you can live without the punctuation.
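If the punctuation matters, a variation is to remove only what falls outside printable ASCII instead of everything that is not a letter or digit. This is a sketch assuming Python 3 and assuming that "printable ASCII" (space through tilde) is the set you want to keep:

```python
import re

text = '\xd7\nRecord has been added successfully, record id: 92'
# Keep printable ASCII (space through tilde); remove everything else,
# which includes \xd7 and the newline.
cleaned = re.sub(r'[^\x20-\x7e]+', '', text)
print(cleaned)
```

Because the match is replaced with an empty string, the comma and colon survive and no extra spaces are introduced.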

Try regex:

import re

def escape_ansi(line):
    # Despite the name, this removes only \xd7 and newlines, not ANSI escape codes
    ansi_escape = re.compile(r'(\xd7|\n)')
    return ansi_escape.sub('', line)

text = u'\xd7\nRecord has been added successfully, record id: 92'
print(escape_ansi(text))
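The pattern above hardcodes the two characters from the question. A possible generalization is to build the pattern from a list of unwanted strings. This is a sketch assuming Python 3; make_remover and remove_unwanted are hypothetical names.

```python
import re

def make_remover(unwanted):
    """Build a function that deletes every string listed in unwanted."""
    # re.escape guards against characters that are special in regex syntax.
    pattern = re.compile('|'.join(re.escape(s) for s in unwanted))
    return lambda line: pattern.sub('', line)

remove_unwanted = make_remover(['\xd7', '\n'])
text = '\xd7\nRecord has been added successfully, record id: 92'
cleaned = remove_unwanted(text)
print(cleaned)
```

Compiling the pattern once and reusing the returned function avoids rebuilding the regex for every line.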
