Python 3 TypeError: a bytes-like object is required, not 'str'

Question

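I am trying to follow this OpenCV exercise, http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html, but I am stuck at the step that runs mergevec.py (I am using the Python version rather than the .cpp one). I have Python 3, not the Python 2.x used in the article.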

The source of this file is https://github.com/wulfebw/mergevec/blob/master/mergevec.py

The error I get is:

Traceback (most recent call last):
  File "./tools/mergevec1.py", line 96, in <module>
    merge_vec_files(vec_directory, output_filename)
  File "./tools/mergevec1.py", line 45, in merge_vec_files
    val = struct.unpack('<iihh', content[:12])
TypeError: a bytes-like object is required, not 'str'

I tried to follow the question "python 3.5: TypeError: a bytes-like object is required, not 'str' when write to a file" and used open(f, 'r', encoding='utf-8', errors='ignore'), but still no luck.

My modified code is below:

import sys
import glob
import struct
import argparse
import traceback


def exception_response(e):
    exc_type, exc_value, exc_traceback = sys.exc_info()
    lines = traceback.format_exception(exc_type, exc_value, exc_traceback)
    for line in lines:
        print(line)

def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-v', dest='vec_directory')
    parser.add_argument('-o', dest='output_filename')
    args = parser.parse_args()
    return (args.vec_directory, args.output_filename)

def merge_vec_files(vec_directory, output_vec_file):


    # Check that the .vec directory does not end in '/' and if it does, remove it.
    if vec_directory.endswith('/'):
        vec_directory = vec_directory[:-1]
    # Get .vec files
    files = glob.glob('{0}/*.vec'.format(vec_directory))

    # Check to make sure there are .vec files in the directory
    if len(files) <= 0:
        print('Vec files to be mereged could not be found from directory: {0}'.format(vec_directory))
        sys.exit(1)
    # Check to make sure there are more than one .vec files
    if len(files) == 1:
        print('Only 1 vec file was found in directory: {0}. Cannot merge a single file.'.format(vec_directory))
        sys.exit(1)


    # Get the value for the first image size
    prev_image_size = 0
    try:
        with open(files[0], 'r', encoding='utf-8', errors='ignore') as vecfile:
            content = ''.join(str(line) for line in vecfile.readlines())
            val = struct.unpack('<iihh', content[:12])
            prev_image_size = val[1]
    except IOError as e:
        f = None
        print('An IO error occured while processing the file: {0}'.format(f))
        exception_response(e)


    # Get the total number of images
    total_num_images = 0
    for f in files:
        try:
            with open(f, 'r', encoding='utf-8', errors='ignore') as vecfile:
                content = ''.join(str(line) for line in vecfile.readlines())
                val = struct.unpack('<iihh', content[:12])
                num_images = val[0]
                image_size = val[1]
                if image_size != prev_image_size:
                    err_msg = """The image sizes in the .vec files differ. These values must be the same. \n The image size of file {0}: {1}\n 
                        The image size of previous files: {0}""".format(f, image_size, prev_image_size)
                    sys.exit(err_msg)

                total_num_images += num_images
        except IOError as e:
            print('An IO error occured while processing the file: {0}'.format(f))
            exception_response(e)


    # Iterate through the .vec files, writing their data (not the header) to the output file
    # '<iihh' means 'little endian, int, int, short, short'
    header = struct.pack('<iihh', total_num_images, image_size, 0, 0)
    try:
        with open(output_vec_file, 'wb') as outputfile:
            outputfile.write(header)

            for f in files:
                with open(f, 'w', encoding='utf-8', errors='ignore') as vecfile:
                    content = ''.join(str(line) for line in vecfile.readlines())
                    data = content[12:]
                    outputfile.write(data)
    except Exception as e:
        exception_response(e)


if __name__ == '__main__':
    vec_directory, output_filename = get_args()
    if not vec_directory:
        sys.exit('mergvec requires a directory of vec files. Call mergevec.py with -v /your_vec_directory')
    if not output_filename:
        sys.exit('mergevec requires an output filename. Call mergevec.py with -o your_output_filename')

    merge_vec_files(vec_directory, output_filename)

Do you have any idea what I am doing wrong? Thanks.

Update 1

I did this:

content = b''.join(str(line) for line in vecfile.readlines())

I basically added a "b" in front. However, now I get a different error:

Traceback (most recent call last):
  File "./tools/mergevec1.py", line 97, in <module>
    merge_vec_files(vec_directory, output_filename)
  File "./tools/mergevec1.py", line 44, in merge_vec_files
    content = b''.join(str(line) for line in vecfile.readlines())
TypeError: sequence item 0: expected a bytes-like object, str found

Answer 1

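As the OP explained, the file contains binary data. To work with binary data:

The file should be opened in binary mode, using 'rb' as the mode in the open call.
After opening the file, use .read() rather than .readlines() to read the data. This avoids possible corruption of the data caused by the way .readlines() handles line-ending characters.
Avoid operations such as .join() that convert the array of bytes into an array of characters (a string).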
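
As a minimal sketch of why the mode matters: struct.unpack accepts only bytes-like input, which is what a file opened with 'rb' returns. The header values below (5 images, image size 576) are made up for illustration and do not come from a real .vec file.

```python
import struct

# A made-up 12-byte header in the '<iihh' layout used by mergevec.py:
# little-endian int, int, short, short.
header = struct.pack('<iihh', 5, 576, 0, 0)

# Bytes, as returned by a file opened with 'rb', unpack cleanly.
num_images, image_size, _, _ = struct.unpack('<iihh', header[:12])

# A str, as returned by a file opened with 'r', is rejected with TypeError.
try:
    struct.unpack('<iihh', header.decode('latin-1')[:12])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(num_images, image_size)
print(error_message)
```

The second call fails in the same way as the traceback in the question, because the text-mode read hands struct.unpack a str.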

For the code given in the question, the section that reads the images should be:

for f in files:
    try:
        with open(f, 'rb') as vecfile:
            content = vecfile.read()
            val = struct.unpack('<iihh', content[:12])
            num_images = val[0]
            image_size = val[1]
            if image_size != prev_image_size:
                err_msg = """The image sizes in the .vec files differ. These values must be the same. \n The image size of file {0}: {1}\n 
                    The image size of previous files: {2}""".format(f, image_size, prev_image_size)
                sys.exit(err_msg)

            total_num_images += num_images
    except IOError as e:
        print('An IO error occured while processing the file: {0}'.format(f))
        exception_response(e)
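
The loop that writes the output file presumably needs the same treatment, since it also reads each .vec file. The following is a rough sketch of the whole merge done in binary mode, not the original script: merge_vec_bytes, HEADER_FORMAT and HEADER_SIZE are hypothetical names introduced here, and the sketch keeps every payload in memory, which assumes the files are reasonably small.

```python
import struct

HEADER_FORMAT = '<iihh'  # little-endian int, int, short, short
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


def merge_vec_bytes(input_paths, output_path):
    """Hypothetical helper: join .vec payloads under one new header."""
    if not input_paths:
        raise ValueError('no .vec files given')

    total_num_images = 0
    image_size = None
    payloads = []
    for path in input_paths:
        # 'rb' returns bytes, which struct.unpack requires.
        with open(path, 'rb') as vecfile:
            content = vecfile.read()
        num_images, size, _, _ = struct.unpack(
            HEADER_FORMAT, content[:HEADER_SIZE])
        if image_size is not None and size != image_size:
            raise ValueError('image sizes differ: {0} has {1}, expected {2}'
                             .format(path, size, image_size))
        image_size = size
        total_num_images += num_images
        # Keep everything after the header untouched.
        payloads.append(content[HEADER_SIZE:])

    with open(output_path, 'wb') as outputfile:
        outputfile.write(
            struct.pack(HEADER_FORMAT, total_num_images, image_size, 0, 0))
        for payload in payloads:
            outputfile.write(payload)
    return total_num_images
```

Called as merge_vec_bytes(files, output_vec_file), it returns the total image count written to the new header.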

Answer 2

I was able to solve my problem when I changed this:

        for f in files:
            with open(f, 'rb') as vecfile:
                content = ''.join(str(line) for line in vecfile.readlines())
                data = content[12:]
                outputfile.write(data)
except Exception as e:
    exception_response(e)

to this:

        for f in files:
            with open(f, 'rb') as vecfile:
                content = b''.join((line) for line in vecfile.readlines())
                outputfile.write(bytearray(content[12:]))
except Exception as e:
    exception_response(e)

Likewise, earlier in the script, I changed this line:

content = ''.join(str(line) for line in vecfile.readlines())

to this:

content = b''.join((line) for line in vecfile.readlines())

The original call was expecting str items; the new one can receive the binary data we need.

You keep getting the error because you are using:

content = b''.join(str(line) for line in vecfile.readlines())

You have to use:

content = b''.join((line) for line in vecfile.readlines())

That is, without the str cast.
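
A small sketch of the difference, using made-up sample bytes rather than real .vec content: in 'rb' mode readlines() yields bytes objects, which b''.join accepts directly, while str() turns each one into the text of its repr.

```python
# What readlines() might yield for a file opened with 'rb' (sample data).
lines = [b'\x05\x00\n', b'\xff\x10']

# Joining bytes with a bytes separator works.
joined = b''.join(lines)

# str() on a bytes object gives its repr as text, not the original data.
as_text = str(lines[0])

# Mixing the two reproduces the error from Update 1.
try:
    b''.join(str(line) for line in lines)
    join_error = None
except TypeError as exc:
    join_error = str(exc)

print(joined)
print(as_text)
print(join_error)
```

Since the items are already bytes, b''.join(vecfile.readlines()) or simply vecfile.read() should give the same result.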
