Fastest way to pack and unpack binary data into list

I am writing a script that reads 32 bytes of data from each of several thousand files. The 32 bytes consist of 8 pairs of 16-bit integers, and I want to unpack them into Python integers to build a list of averages. I would then like to print the list to the user running the script, both as a hex string (packed the same way it was unpacked) and as the list object itself.

My current code looks like this, and it's slower than I'd like it to be (even considering the heavy I/O load):

import os
import sys
import struct
import binascii

def list_str(list):
    return str(list)

def list_s16be_hex(list):
    i = 0
    bytes = b""
    while i < len(list):
        bytes += struct.pack(">h", list[i])
        i += 1
    return binascii.hexlify(bytes).decode("ascii")

def main():
    averages = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    root = os.path.dirname(__file__)
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            with open(os.path.join(dirpath, filename), "rb") as f:
                f.seek(0x10)
                tmp = f.read(32)

            i = 0
            while i < 32:
                averages[i//2] = (averages[i//2] + struct.unpack(">h", tmp[i:i+2])[0]) // 2
                i += 2

    print("Updated averages (hex): " + list_s16be_hex(averages))
    print("Updated averages (list): " + list_str(averages))

    return 0

if __name__=="__main__":
    main()

Is there a more efficient way of doing this?

You can unpack all 16 integers at once with struct.unpack(">16h", tmp), which should be significantly faster for the computational part. Beyond that, I'd expect your program's runtime to be dominated by I/O, which you can check by measuring its runtime with the averaging computation removed. There is not much you can do about the I/O itself.
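As a rough sketch of that suggestion, the per-file loop and the hex helper could each be reduced to a single struct call. The names RECORD, update_averages and averages_hex are made up for this illustration and are not from the question; the averaging rule (old + new) // 2 is copied from the question's code unchanged.

```python
import binascii
import struct

# Hypothetical helper names, for illustration only.
# ">16h" = 16 big-endian signed 16-bit integers, i.e. exactly 32 bytes.
# Precompiling the format avoids re-parsing the format string per file.
RECORD = struct.Struct(">16h")


def update_averages(averages, chunk):
    """Fold one 32-byte chunk into the 16 averages in one unpack call."""
    values = RECORD.unpack(chunk)  # raises struct.error if len(chunk) != 32
    return [(a + v) // 2 for a, v in zip(averages, values)]


def averages_hex(averages):
    """Pack all 16 values at once and return them as a hex string."""
    return binascii.hexlify(RECORD.pack(*averages)).decode("ascii")
```

Inside the os.walk loop this would be used as averages = update_averages(averages, tmp), starting from averages = [0] * 16. Files shorter than 0x10 + 32 bytes would make unpack raise struct.error, so a length check on tmp may be needed.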
