
Python UnicodeDecodeError


I am writing a Python program to read the output of a DOS tree command that has been redirected to a text file. When I reach the 533rd iteration of the loop, Eclipse gives an error:

Traceback (most recent call last):
  File "E:\Peter\Documents\Eclipse Workspace\MusicManagement\InputTest.py", line 24, in <module>
    input = myfile.readline()
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3551: character maps to <undefined>

I have read other posts, and setting the encoding to latin-1 does not resolve this issue, as it raises a UnicodeDecodeError on another character; the same happens when I try utf-8.

The following is the code:

import os
from Album import *

os.system("tree F:\\Music > tree.txt")

myfile = open('tree.txt')
myfile.readline()
myfile.readline()
myfile.readline()

albums = []
x = 0

while x < 533:
    if not input: break
    input = myfile.readline()
    if len(input) < 14:
        artist = input[4:-1]
    elif input[13] != '-':
        artist = input[4:-1]
    else:
        albums.append(Album(artist, input[15:-1], input[8:12]))
    x += 1

for x in albums:
    print(x.artist + ' - ' + x.title + ' (' + str(x.year) + ')')

You need to figure out what encoding tree.com used; according to this post, that could be any of the MS-DOS codepages.

You could go through each of the MS-DOS encodings; most of those have a codec in the Python standard library. I'd try cp437 and cp850 first; the latter is the MS-DOS predecessor of cp1252, I think.
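As a rough sketch of "going through" the candidates, the helper below decodes the same bytes with several codecs so the results can be compared by eye. The function name `preview_decodings`, the candidate list, and the sample bytes are illustrative assumptions, not part of the original program. Note that a decode that succeeds is not proof of the right encoding: cp437 maps every byte to some character, so the text still has to be checked against the real folder names.

```python
def preview_decodings(data, candidates=("cp437", "cp850", "cp1252")):
    """Decode the same bytes with each candidate codec for side-by-side comparison."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError as exc:
            # Record the failure instead of stopping, so every candidate is reported.
            results[name] = "<failed: {}>".format(exc.reason)
    return results


if __name__ == "__main__":
    # Hypothetical sample: the byte 0x81 from the traceback, embedded in ASCII text.
    sample = b"M\x81nchen"
    for name, text in preview_decodings(sample).items():
        # ascii() keeps the printout safe on consoles that cannot show the character.
        print(name, ascii(text))
```

For the real file, the bytes could be read with `open('tree.txt', 'rb').read()` and passed to the helper.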

Pass the encoding to open():

myfile = open('tree.txt', encoding='cp437')

You really should look into using os.walk() instead of tree.com for this task, though; at the very least it will save you from having to deal with issues like these.
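A minimal sketch of the os.walk() approach, assuming the music folder is laid out as root/artist/album (the question does not confirm this layout, and `iter_albums` is a hypothetical name). Because the names come straight from the file system as str, no text file or codec guessing is involved.

```python
import os


def iter_albums(root):
    """Yield (artist, album) pairs, assuming a root/artist/album directory layout."""
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort makes the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        parts = rel.split(os.sep)
        if rel != os.curdir and len(parts) == 2:
            yield parts[0], parts[1]
            dirnames[:] = []  # prune: do not descend below the album level


if __name__ == "__main__":
    # "F:\\Music" is the root used in the question; adjust as needed.
    for artist, album in iter_albums("F:\\Music"):
        print(artist + " - " + album)
```

Splitting the album folder name into a year and a title would follow the same slicing idea as the original loop, applied to `album` instead of a line of tree output.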

In this line:

myfile = open('tree.txt')

you should specify the encoding of your file. On Windows, try:

myfile = open('tree.txt', encoding='cp1250')
