UnicodeDecodeError: can't decode byte, solved by writing to and reading from a file (using pandas)

Question

I have an Excel-like data structure made up of bytes that I was not able to decode.

It is a list that looks like this:

my_object = [b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1..., ........, b'\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff']

(Note that the last element of my_object is a real one and is shown here in full.)

If I try to decode the elements individually, I get:

my_object[-1].decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 3: invalid start byte

(Note that I tried several different codecs, including 'utf8', 'ascii', 'ISO-8859-2', 'gbk', 'latin_1', ...)

However, if I first save my_object to a file, using:

with open('test.xls', 'wb') as f:
    for chunk in my_object:  # write each bytes chunk in order
        f.write(chunk)

and then open it with pandas like this:

import pandas as pd
pd.read_excel('test.xls')

I get the expected result:

     Time (s)  Acceleration x (m/s^2)  Acceleration y (m/s^2)  \
0    0.000000                0.863679                0.196953   
1    0.002500                0.892268                0.206483   
2    0.005001                0.844621                0.196953  
......

This is a nice workaround, but I would really like to avoid writing to and reading from disk just to perform this operation.

Can anyone help?

Thank you in advance.

Answer 1

If you just want pandas to read an Excel file whose raw bytes you already have in memory, you can use the io module to turn a string (io.StringIO) or bytes (io.BytesIO) into a readable in-memory file:

import io
import pandas as pd

# Join the chunks into one bytes object and wrap it in an in-memory file
file_bytes = b''.join(my_object)
df = pd.read_excel(io.BytesIO(file_bytes))
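
To see why this is equivalent to the write-then-read workaround, here is a minimal stdlib-only sketch. The `chunks` list is a made-up stand-in for my_object (only its first 8 bytes and last few bytes are modelled on the question), and the variable names are just for illustration. Those first 8 bytes are the signature of the OLE2 compound file container that legacy .xls files use, which is why the data is binary rather than text and why no text codec gives a meaningful result.

```python
import io
import os
import tempfile

# Hypothetical stand-in for my_object: a list of bytes chunks. The first chunk
# begins with the same 8 bytes as the data in the question (OLE2 signature).
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"
chunks = [OLE2_SIGNATURE + b"\x00\x01", b"\x02\x03", b"\x00\x00\x00\xff\xff"]

# Disk route (as in the question): write every chunk, then read the file back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.xls")
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    with open(path, "rb") as f:
        from_disk = f.read()

# In-memory route: join the chunks and wrap them in a file-like object.
buffer = io.BytesIO(b"".join(chunks))
from_memory = buffer.read()

same_bytes = from_disk == from_memory
looks_like_ole2 = from_memory.startswith(OLE2_SIGNATURE)
print(same_bytes, looks_like_ole2)  # True True
```

Both routes hand the reader exactly the same byte stream, so anything that accepts an open binary file should also accept the io.BytesIO object; the disk round trip adds nothing except I/O.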

Answer 1: 0 votes, accepted, 2019-09-19 21:51:57
