How do I get rid of weird characters in a Python string?

Question

I have lines that contain some pesky control characters:

[image from the original post; not preserved]

When I tried to read the file and then do a str.replace(), these control characters didn't get replaced. I've tried this, but they're still sticking around.

import io

with io.open('infile', 'r', encoding='utf8') as fin:
    for line in fin:
        line = line.replace(u'\u0094', '"').replace(u'\u0093', '"').replace(u'\u0092', "'").replace(u'\u0096', '"').replace(u'\u0084', '"')

How do I get these strings replaced? Is there a canonical way to replace these strings (they look like quotation marks / whitespace of various kinds)?

What are these characters anyway? What is u'\' ?

Answer 1

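The last time I ran into this problem, it was because I was getting characters from outside the ASCII range, which caused an out-of-range error.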
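
To check which non-ASCII characters are actually in a line, and to replace them in one pass, something like the following may help. This is only a sketch under an assumption the question does not confirm: that the code points U+0080 to U+009F in the text are Windows-1252 punctuation (curly quotes, dashes) that was decoded as Latin-1 somewhere upstream. The function names `describe_non_ascii`, `build_c1_table` and `fix_c1_controls` are made up for illustration.

```python
import unicodedata


def describe_non_ascii(text):
    """List (repr, code point, Unicode category) for each non-ASCII character."""
    return [(repr(ch), "U+%04X" % ord(ch), unicodedata.category(ch))
            for ch in text if ord(ch) > 127]


def build_c1_table():
    """Map each C1 control code point to the character Windows-1252 assigns
    to the same byte value (assumption: the text is cp1252 misread as Latin-1)."""
    table = {}
    for cp in range(0x80, 0xA0):
        try:
            table[cp] = bytes([cp]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values are undefined in cp1252; drop those characters.
            table[cp] = None
    return table


C1_TABLE = build_c1_table()


def fix_c1_controls(text):
    """Replace C1 control characters using the table built above."""
    return text.translate(C1_TABLE)


sample = "\u0093quoted\u0094 it\u0092s \u0096 ok"
print(describe_non_ascii(sample)[0])   # ("'\\x93'", 'U+0093', 'Cc')
print(ascii(fix_c1_controls(sample)))  # '\u201cquoted\u201d it\u2019s \u2013 ok'
```

Category `Cc` means "control character", which is why these show up as odd boxes rather than quotes. If plain ASCII quotes are wanted instead of curly ones, the values in the table can be changed to `'"'` and `"'"` as in the question's replace() chain.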

Solution 1
0 2015-04-29 14:32:18
