Error decoding byte string in Python3 [TypeError: must be str, not bytes]

I'm trying to run a piece of code written for Python 2.7 under Python 3.6, and I'm having trouble with the differences in how byte strings are handled. The code is meant to read a .dat file that existed before I wrote my code. Running the unmodified Python 2.7 script under Python 3.6 raises the following error:

import numpy as np

buff = ''
dt = np.dtype([('var1', np.uint32, 1), ('var2', np.uint8, 1)])

with open(filename, 'rb') as f:
    for line in f:
        dat = line
--->    buff += dat

    data = np.frombuffer(buffer=buff, dtype=dt)

TypeError: must be str, not bytes

If I understand correctly, Python 2 concatenates the bytes it reads onto the string buff without complaint, whereas Python 3 distinguishes between bytes and strings. Converting line with str(line) instead raises the following error:

    for line in f:
        dat = str(line)
        buff += dat
->  data = np.frombuffer(buffer=buff, dtype=dt)

AttributeError: 'str' object has no attribute '__buffer__'

How should I go about this? What type should buff be? Is there a solution that works for both Python 2.7 and Python 3.6?

EDIT

It turns out the data in filename.dat is not made of Unicode strings at all. I've edited the question to remove my mistaken assumption, and I've added lines of code that I had omitted while trying to show a minimal example and that I now realize are relevant. Sorry for the confusion.

Use io.BytesIO for your buffer. It works on both Python 2 and Python 3, and for large datasets it is preferable to repeated str / bytes concatenation.

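import io

import numpy as np


# Accumulate raw bytes in an in-memory binary buffer instead of a str.
buff = io.BytesIO()
dt = np.dtype([('var1', np.uint32, 1), ('var2', np.uint8, 1)])

# filename is the path to the .dat file, as in the question.
with open(filename, 'rb') as f:
    for line in f:
        buff.write(line)

# getvalue() returns the whole buffer as bytes, whatever the current position.
data = np.frombuffer(buffer=buff.getvalue(), dtype=dt)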
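
If you would rather keep the shape of the original loop, a smaller change is to make the buffer a bytes object from the start. Below is a minimal, self-contained sketch of that idea. The read_records helper, the temporary file, and the three sample records are hypothetical and exist only so the snippet runs on its own. The dtype also omits the explicit ", 1" shape from the question, on the assumption that scalar fields are intended.

```python
import os
import tempfile

import numpy as np

# Same two-field layout as in the question: uint32 + uint8 = 5 bytes per record.
# Assumption: scalar fields are intended, so the explicit ", 1" shape is omitted.
dt = np.dtype([('var1', np.uint32), ('var2', np.uint8)])


def read_records(path):
    """Hypothetical helper: read a binary file and view it as records."""
    chunks = []
    with open(path, 'rb') as f:
        for line in f:           # in binary mode every line is a bytes object
            chunks.append(line)
    # b''.join gives bytes on Python 3 and a plain str on Python 2.
    buff = b''.join(chunks)
    return np.frombuffer(buff, dtype=dt)


# Made-up sample data. The value 10 is the newline byte, so the file is split
# into "lines" at arbitrary points; joining the chunks restores the raw bytes.
sample = np.array([(1, 2), (10, 10), (300, 7)], dtype=dt)

fd, path = tempfile.mkstemp(suffix='.dat')
os.close(fd)
try:
    sample.tofile(path)          # write the raw record bytes to disk
    data = read_records(path)
finally:
    os.remove(path)

print(data['var1'], data['var2'])
```

Wrapping each line in str() does not help on Python 3, because str(b'ab') produces the text "b'ab'" rather than the underlying bytes. Since the loop only reassembles the file contents, f.read() would give the same bytes in a single call.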

