Remove ASCII control characters from a text file in Python

Question

I have a text file from which I have to read a lot of numbers (doubles). It contains ASCII control characters such as DLE and NUL, which are visible in the text file, so when I read a line to extract only the doubles/ints, I get errors like "invalid literal \x10". The first two lines of my file are shown below.

DLE NUL NUL NUL [1, 167, 133, 6]DLE NUL NUL   
YS FS NUL[0.0, 4.3025989e-07, 1.5446712e-06, 3.1393029e-06, 5.0430463e-06, 7.1382601e-06

How do I remove all of these control characters from a text file at once, using Python? I want this done before I parse the file into numbers.

Any help is appreciated!

Answer 1

Use string.printable. On Python 3, filter() returns an iterator, so join the result back into a string:

>>> import string
>>> ''.join(filter(string.printable.__contains__, '\x00\x01XYZ\x00\x10'))
'XYZ'
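
The same idea can be wrapped in a small helper and applied to a whole line. This is only a sketch: the name `strip_nonprintable` is hypothetical, and it assumes the file really contains the raw control bytes (the sample in the question shows them by name). Note that `string.printable` includes whitespace such as tab and newline, so line breaks survive.

```python
import string

# A set makes the membership test cheaper than scanning the
# string.printable string once per character.
PRINTABLE = set(string.printable)

def strip_nonprintable(text):
    """Return text with every character outside string.printable removed."""
    return ''.join(ch for ch in text if ch in PRINTABLE)

# First sample line from the question, with DLE = \x10 and NUL = \x00.
cleaned = strip_nonprintable('\x10\x00\x00\x00[1, 167, 133, 6]\x10\x00\x00')
print(cleaned)  # [1, 167, 133, 6]
```

Because `string.printable` is ASCII-only, this also drops any non-ASCII characters, which should be harmless for a file that holds only numbers.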

Answer 2

I know this is a very old post, but I am answering because I think it could help others.

I did it as follows. It replaces all ASCII control characters with an empty string. Note that the range \x00-\x1F also covers tab, line feed, and carriage return, so those are removed as well.

import re
line = re.sub(r'[\x00-\x1F]+', '', line)
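
To go from a cleaned line straight to numbers, something like the following may work. It is a sketch under assumptions: the helper `parse_numbers` and both patterns are illustrative rather than taken from the answer, the control-character class deliberately keeps tab, line feed, and carriage return, and the numbers are assumed to be plain decimal or scientific-notation literals like those in the question.

```python
import re

# Control characters except tab (\x09), line feed (\x0A) and
# carriage return (\x0D), plus DEL (\x7F). Keeping line breaks is a
# choice made for this sketch, not part of the original answer.
CONTROL_CHARS = re.compile(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]')

# Integers and floats, including scientific notation (4.3025989e-07).
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def parse_numbers(line):
    """Strip control characters from line, then return its numbers as floats."""
    cleaned = CONTROL_CHARS.sub('', line)
    return [float(tok) for tok in NUMBER.findall(cleaned)]

# First sample line from the question, with DLE = \x10 and NUL = \x00.
raw = '\x10\x00\x00\x00[1, 167, 133, 6]\x10\x00\x00\n'
print(parse_numbers(raw))  # [1.0, 167.0, 133.0, 6.0]
```

Extracting numbers with a pattern rather than evaluating the bracketed list also copes with lines that are cut off before the closing bracket, as the second sample line is.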

Ref: ASCII (American Standard Code for Information Interchange) Code

Ref: Python re.sub()

Answer 1: score 2, accepted, 2013-07-05 03:39:38

Answer 2: score 0, 2017-04-20 13:54:33
