Python UnicodeDecodeError
I am writing a Python program to read the output of a DOS tree command that was saved to a text file. When I reach the 533rd iteration of the loop, Eclipse gives an error:
Traceback (most recent call last):
  File "E:\Peter\Documents\Eclipse Workspace\MusicManagement\InputTest.py", line 24, in <module>
    input = myfile.readline()
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3551: character maps to <undefined>
I have read other posts, and setting the encoding to latin-1 does not resolve this issue, as it returns a UnicodeDecodeError on another character; the same happens when I try utf-8.
The following is the code:
import os
from Album import *
os.system("tree F:\\Music > tree.txt")
myfile = open('tree.txt')
myfile.readline()
myfile.readline()
myfile.readline()
albums = []
x = 0
while x < 533:
    if not input: break
    input = myfile.readline()
    if len(input) < 14:
        artist = input[4:-1]
    elif input[13] != '-':
        artist = input[4:-1]
    else:
        albums.append(Album(artist, input[15:-1], input[8:12]))
    x += 1
for x in albums:
    print(x.artist + ' - ' + x.title + ' (' + str(x.year) + ')')
You need to figure out what encoding tree.com used; according to this post, that could be any of the MS-DOS codepages.
You could go through each of the MS-DOS encodings; most of those have a codec in the Python standard library. I'd try cp437 and cp850 first; the latter is the Western European MS-DOS code page and the closest DOS counterpart of cp1252, I think. (cp500 is an EBCDIC encoding, so it is not a useful candidate here.)
Pass the encoding to open():
myfile = open('tree.txt', encoding='cp437')
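If it is not obvious which code page was used, one way to narrow it down is to decode the same raw bytes with a few candidates and compare the results by eye. The following is only a sketch: the helper name preview_decodings, the candidate list, and the sample bytes are assumptions made for illustration, not something taken from the question.

```python
def preview_decodings(raw, candidates=("cp437", "cp850", "cp1252")):
    """Map each candidate encoding to the decoded text, or None if decoding fails."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This codec has no character assigned to at least one byte in raw.
            results[name] = None
    return results


# Hypothetical sample: a folder name containing the bytes 0x94 and 0x81,
# which are 'ö' and 'ü' in the DOS code pages cp437 and cp850.
sample = b"M\x94tley Cr\x81e"
results = preview_decodings(sample)
# results["cp1252"] is None here, because Python's cp1252 codec leaves
# byte 0x81 unmapped, which is the failure shown in the traceback.
```

With the real file, the bytes could come from open('tree.txt', 'rb').read(). A codec that decodes without raising is not necessarily the right one, so the decoded names still need to be checked against the actual folder names.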
You really should look into using os.walk() instead of tree.com for this task, though; it will at least save you from having to deal with issues like these.
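As a rough sketch of the os.walk() approach: the slicing in the question suggests a layout of artist folders that contain album folders named "YYYY - Title", and the code below assumes exactly that. The function name collect_albums and the tuple result are illustrative choices; the question's own Album class is not used because its definition is not shown.

```python
import os


def collect_albums(root):
    """Return (artist, title, year) tuples for folders laid out as root/Artist/YYYY - Title."""
    albums = []
    for dirpath, dirnames, _filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == os.curdir:
            # Top level: the subfolders are artists. Sorting in place makes
            # os.walk visit them in a predictable order.
            dirnames.sort()
            continue
        artist = rel
        for name in sorted(dirnames):
            year, sep, title = name.partition(' - ')
            # Only folders that look like "1999 - Some Title" count as albums.
            if sep and len(year) == 4 and year.isdigit():
                albums.append((artist, title, int(year)))
        # Clearing the list in place stops os.walk from descending into album folders.
        dirnames[:] = []
    return albums
```

Calling collect_albums('F:\\Music') would then replace both the tree.com call and the line parsing. The folder names arrive as Python strings, so no code page has to be guessed.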
In this line: 在这一行:
myfile = open('tree.txt')
you should specify the encoding of your file. On Windows, try:
myfile = open('tree.txt', encoding='cp1250')