Converting a little-endian 24-bit file to an ASCII array

Question

I have a .raw file containing a 52-line HTML header followed by the data itself. The file is encoded as little-endian 24-bit SIGNED integers, and I want to convert the data to integers in an ASCII file. I use Python 3.

I tried to 'unpack' the entire file with the following code, found in this post:

import sys
import chunk
import struct

f1 = open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode = 'rb')
data = struct.unpack('<i', chunk + ('\0' if chunk[2] < 128 else '\xff'))

But I get this error message:

TypeError: 'module' object is not subscriptable

EDIT

It seems this is better:

data = struct.unpack('<i','\0'+ bytes)[0] >> 8

But I still get an error message:

TypeError: must be str, not type

Easy to fix, I presume?

Answer 1

That's not a nice file to process in Python! Python is great for processing text files, because it reads them in big chunks into an internal buffer and then iterates over the lines, but you cannot easily access binary data that comes after text read that way. Additionally, the struct module has no support for 24-bit values.

The only way I can think of is to read the file one byte at a time: first skip 52 end-of-line characters, then read the bytes 3 at a time, pad each group into a 4-byte byte string and unpack it.

Possible code could be:

import struct

eol = b'\n'          # or whatever is the end of line in your file
nlines = 52          # number of lines to skip

with open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode = 'rb') as f1:

    for i in range(nlines):       # process nlines lines
        t = b''                   # to store the content of each line
        while True:
            x = f1.read(1)        # one byte at a time
            if x == eol:          # ok we have one full line
                break
            else:
                t += x            # else concatenate into current line
        print(t)                  # to control the initial 52 lines

    while True:
        t = bytes((0,))               # struct only knows how to process 4 bytes int
        for i in range(3):            # so build one starting with a null byte
            t += f1.read(1)
        # print(t)
        if(len(t) == 1): break        # reached end of file
        if(len(t) < 4):               # reached end of file with an incomplete value
            print("Remaining bytes at end of file", t)
            break
        # the trick is that the integer division by 256 skips the initial 0 byte and keeps the sign
        i = struct.unpack('<i', t)[0]//256   # // for Python 3, only / for Python 2
        print(i, hex(i))                     # or any other more useful processing
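
The padding trick used in the loop above can be checked on a few hand-picked byte triples. The sketch below is only an illustration: the function names `int24_via_struct` and `int24_via_from_bytes` are hypothetical, and the second one uses the built-in `int.from_bytes`, which is a different technique from the one in the answer but should give the same values.

```python
import struct

def int24_via_struct(b3: bytes) -> int:
    """Decode 3 little-endian bytes as a signed 24-bit integer with struct."""
    # Prepend a zero byte so the 3 data bytes become the upper 3 bytes of a
    # 32-bit little-endian int, then floor-divide by 256 to drop the padding
    # byte while keeping the sign.
    return struct.unpack('<i', b'\x00' + b3)[0] // 256

def int24_via_from_bytes(b3: bytes) -> int:
    """Same decoding using the built-in int.from_bytes (no padding needed)."""
    return int.from_bytes(b3, byteorder='little', signed=True)

# Smallest positive value, largest positive, most negative, and -1
samples = [b'\x01\x00\x00', b'\xff\xff\x7f', b'\x00\x00\x80', b'\xff\xff\xff']
for s in samples:
    print(s.hex(), int24_via_struct(s), int24_via_from_bytes(s))
```

Both functions should agree on every 3-byte input; `int.from_bytes` may be the simpler choice when struct is not otherwise needed.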

Remark: the above code assumes that your description of 52 lines (each terminated by an end of line) is accurate, but the image shown suggests that the last line is not terminated that way. In that case, you should first count 51 lines and then skip the content of the last line.

def skiplines(fd, nlines, eol):
    for i in range(nlines):       # process nlines lines
        t = b''                   # to store the content of each line
        while True:
            x = fd.read(1)        # one byte at a time
            if x == eol:          # ok we have one full line
                break
            else:
                t += x            # else concatenate into current line
        # print(t)                  # to control the initial 52 lines

with open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode = 'rb') as f1:
    skiplines(f1, 51, b'\n')     # skip 51 lines terminated with a \n
    skiplines(f1, 1, b'>')       # skip last line assuming it ends at the >

    ...
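
For the final step the question asks for, writing the integers to an ASCII file, a minimal end-to-end sketch could look like the following. It is an assumption-laden illustration rather than the answer's own code: the helper names `skip_lines` and `int24_to_ascii` are hypothetical, it decodes with `int.from_bytes` instead of struct, it writes one decimal integer per line, and it silently ignores a trailing partial sample. The demonstration uses in-memory streams with a fake 2-line header so it runs without the real file.

```python
import io

def skip_lines(stream, nlines, eol=b'\n'):
    """Consume nlines line terminators from a binary stream, byte by byte."""
    for _ in range(nlines):
        while True:
            x = stream.read(1)
            if x == b'':
                raise EOFError('file ended before the header was fully skipped')
            if x == eol:
                break

def int24_to_ascii(src, dst, nlines=52, eol=b'\n'):
    """Skip the header in src, then write one integer per line to dst.

    src is a binary stream, dst a text stream. Returns the number of
    values written.
    """
    skip_lines(src, nlines, eol)
    count = 0
    while True:
        chunk = src.read(3)
        if len(chunk) < 3:
            break  # end of data (a trailing partial sample is ignored)
        value = int.from_bytes(chunk, byteorder='little', signed=True)
        dst.write('%d\n' % value)
        count += 1
    return count

# In-memory demonstration: a fake 2-line header followed by two samples
src = io.BytesIO(b'<html>\n</html>\n' + b'\x01\x00\x00' + b'\xff\xff\xff')
dst = io.StringIO()
n = int24_to_ascii(src, dst, nlines=2)
print(n, repr(dst.getvalue()))
```

With a real file, `src` would be the file opened in 'rb' mode and `dst` an output file opened in 'w' mode; if the last header line does not end with a newline, the header would need to be skipped in two steps as described above.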

Solution 1: accepted, 2017-07-27 10:02:46