Python - Remove square symbols from a text string

Question

a='ÿþ"[]B[]a[]l[]a[]n[]c[]e'

NOTE: Each pair of open and close square brackets stands in for a square symbol. I cannot copy and paste the square symbol here to show you exactly what I'm looking at.

The characters in 'a' are the beginning of a file I've downloaded. It is a Unicode CSV file. How do I remove these unwanted characters? I would just like to recover the word 'balance' from a.

The code I've used, simplified for this example:

fi = open(path+fn, 'r')
data = fi.read()
fi.close()
print(data)

Where fn is a CSV file.

Tried:

data=data.encode()
d=data.replace('\x00','')

which produced the error:

TypeError: expected bytes, bytearray or buffer compatible object

Answer 1

You need to specify the right encoding when opening the file. Try:

open(path+fn, 'r', encoding="utf-16")

(I'm guessing utf-16 because ASCII characters seem to be encoded in two bytes in the sample string.)
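A minimal sketch of the idea, assuming the downloaded file really is UTF-16 with a byte order mark (the leading ÿþ in the sample is what the bytes FF FE look like when read as Latin-1). The temporary file and its contents are stand-ins for the real CSV, not taken from the question:

```python
import os
import tempfile

# Build a small UTF-16 sample file (with BOM) that stands in for the downloaded CSV.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write('"Balance",100\n'.encode("utf-16"))
    sample_path = tmp.name

# Reading with a single-byte codec reproduces the stray characters from the question.
with open(sample_path, "r", encoding="latin-1") as fi:
    garbled = fi.read()

# Reading as UTF-16 consumes the BOM and decodes each two-byte unit properly.
with open(sample_path, "r", encoding="utf-16") as fi:
    data = fi.read()

os.remove(sample_path)
print(repr(garbled[:8]))
print(repr(data))
```

With the encoding given to open(), no cleanup step should be needed afterwards, since the NUL bytes were never part of the text in the first place.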

Answer 2

If you don't want to mess with encodings, string.printable is a string of 'printable' characters, which may be what you're looking for.

>>> from string import printable
>>> best_string_ever = ''.join(filter(lambda x: x in printable, a))
>>> best_string_ever
'"Balance'

Answer 3

If you can show the character value, then you can use the strip(u'\uxxx') method,

or use the replace() method:

newstring = textstring.replace(u'\uxxx', '')

In this case, pass in the actual code of the character you want to remove.
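As a concrete sketch of the replace() approach, assuming the square symbols are NUL characters (suggested by the '\x00' in the question's own attempt) and the two leading characters are U+00FF and U+00FE:

```python
# The sample from the question, with each square symbol written as a NUL character
# (an assumption; the original post could not show the actual character).
a = '\u00ff\u00fe"\x00B\x00a\x00l\x00a\x00n\x00c\x00e'

# Remove each unwanted character by its code point.
cleaned = a
for unwanted in ('\u00ff', '\u00fe', '\x00'):
    cleaned = cleaned.replace(unwanted, '')

print(cleaned)  # "Balance
```

Note that strip() only trims characters from the two ends of a string, so replace() is the one that removes characters interleaved between the letters.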


Answer 1: score 2, accepted, 2014-02-26 17:29:34

Answer 2: score 0, 2014-02-26 17:29:54

Answer 3: score 0, 2014-02-26 17:33:07
