
How to remove escape characters from string in python?

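I have a string that looks like this: text = u'\xd7\nRecord has been added successfully, record id: 92'. I am trying to remove the escape characters \xd7\n from the string so that I can use it for other purposes.

I tried str(text), but it cannot handle the character \xd7:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xd7' in position 0: ordinal not in range(128)

Is there any way to remove escape characters like these from the string? Thanks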

You can try the following, using replace:

text = u'\xd7\nRecord has been added successfully, record id: 92'
# Unicode literals, so replace() never mixes byte and unicode strings on Python 2
bad_chars = [u'\xd7', u'\n', u'\x99m', u'\xf0']
for ch in bad_chars:
    text = text.replace(ch, u'')
print(text)
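
As a variation on the loop above, the same deletions can be done in a single pass with translate. This is only a minimal sketch, assuming the text is a unicode string (str in Python 3); the names delete_table and cleaned are illustrative and not part of the original answer.

```python
# Illustrative sketch: delete several single characters in one pass.
text = u'\xd7\nRecord has been added successfully, record id: 92'

# Assumed list of unwanted characters; adjust it to your own data.
bad_chars = [u'\xd7', u'\n']

# Mapping a code point to None tells translate() to drop that character.
delete_table = {ord(ch): None for ch in bad_chars}
cleaned = text.translate(delete_table)

print(cleaned)
```

Note that translate works on single characters only, so a multi-character entry such as u'\x99m' would still need replace.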

You can do this by slicing the string:

string = u'\xd7\nRecord has been added successfully, record id: 92'
# Drop the first two characters (u'\xd7' and u'\n'); assumes they are always present
text = string[2:]
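
If the number of unwanted leading characters is not fixed, lstrip may be a safer alternative to a hard-coded slice. The following is a sketch under that assumption; the variable name cleaned is illustrative.

```python
# Illustrative sketch: strip unwanted characters from the start of the string.
string = u'\xd7\nRecord has been added successfully, record id: 92'

# lstrip() treats its argument as a set of characters, not as a prefix,
# so it removes any leading run made up of u'\xd7' and u'\n'.
cleaned = string.lstrip(u'\xd7\n')

print(cleaned)
```

Unlike string[2:], this leaves the text untouched when it does not start with those characters.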

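It looks like you have a unicode string, as in Python 2.x, something like

inp_str = u'\xd7\nRecord has been added successfully, record id: 92'

If you want to remove the escape characters, which here essentially means the special (non-ASCII) characters, this is one way to keep only the ASCII characters without any regex or hard-coding.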

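inp_str = u'\xd7\nRecord has been added successfully, record id: 92'
# Encode to ASCII, dropping anything that cannot be encoded, then decode back
# to text so that strip() works on both Python 2 and Python 3
print(inp_str.encode('ascii', errors='ignore').decode('ascii').strip('\n'))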

Result: Record has been added successfully, record id: 92

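First I encode the string, since it is already unicode. When encoding to ASCII with errors='ignore', any character outside the ASCII range is dropped. After that you only need to strip the '\n'.

Hope this helps :)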
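
To make the role of the errors argument concrete, here is a small sketch comparing the standard 'strict', 'ignore', and 'replace' handlers on the same string. The variable names are illustrative only.

```python
# Illustrative sketch: how the encode() error handlers treat u'\xd7'.
text = u'\xd7\nRecord has been added successfully, record id: 92'

# 'strict' (the default) raises on the first non-ASCII character.
try:
    text.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'ignore' silently drops characters that cannot be encoded.
ignored = text.encode('ascii', 'ignore').decode('ascii')

# 'replace' substitutes '?' for each of them instead.
replaced = text.encode('ascii', 'replace').decode('ascii')

print(strict_failed)
print(repr(ignored))
print(repr(replaced))
```

The newline survives in both cases because it is a valid ASCII character, which is why the answer strips it separately.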

I believe a regular expression can help:

import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
res = re.sub('[^A-Za-z0-9]+', ' ', text).strip()

Result:

'Record has been added successfully record id 92'

You can use the built-in regular expression library.

import re
text = u'\xd7\nRecord has been added successfully, record id: 92'
result = re.sub('[^A-Za-z0-9]+', ' ', text)

print(result)

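This prints Record has been added successfully record id 92 (preceded by a single space, because the leading \xd7\n run is replaced by one space).

If you can live without punctuation, this seems to pass your test case.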
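
If the punctuation should be kept, one option is to remove only the characters outside the printable ASCII range. This is a sketch, not the answerer's code, and it assumes that anything other than space through tilde counts as unwanted.

```python
import re

# Illustrative sketch: keep printable ASCII, including punctuation.
text = u'\xd7\nRecord has been added successfully, record id: 92'

# The class [^\x20-\x7e] matches anything outside space (0x20) to tilde (0x7e),
# which covers both u'\xd7' and the newline.
cleaned = re.sub(r'[^\x20-\x7e]+', '', text)

print(cleaned)
```

Because the unwanted run is replaced with an empty string, no leading space is left behind and the comma and colon are preserved.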

Try regex:


import re
def escape_ansi(line):
    ansi_escape = re.compile(r'(\xd7|\n)')
    return ansi_escape.sub('', line)

text = u'\xd7\nRecord has been added successfully, record id: 92'
print(escape_ansi(text))
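
The same idea can be generalized so that the unwanted strings are passed in rather than hard-coded in the pattern. The helper below is a hypothetical sketch (remove_sequences is not from the original answer); it relies on re.escape so that each entry is matched literally.

```python
import re

def remove_sequences(line, unwanted):
    """Remove every occurrence of each string in `unwanted` from `line`."""
    # re.escape() makes each entry match literally, even if it contains
    # regex metacharacters such as '.' or '('.
    pattern = re.compile(u'|'.join(re.escape(item) for item in unwanted))
    return pattern.sub(u'', line)

text = u'\xd7\nRecord has been added successfully, record id: 92'
result = remove_sequences(text, [u'\xd7', u'\n'])
print(result)
```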

