Converting a little-endian 24-bit file to an ASCII array

Question

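I have a .raw file that contains a 52-line HTML header followed by the data itself. The file is encoded as little-endian 24-bit SIGNED, and I want to convert the data to integers in an ASCII file. I am using Python 3.

I tried to "unpack" the whole file with the following code, which I found in this post: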

import sys
import chunk
import struct

f1 = open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode = 'rb')
data = struct.unpack('<i', chunk + ('\0' if chunk[2] < 128 else '\xff'))

But I get this error message:

TypeError: 'module' object is not subscriptable

Edit

This seems to be better:

data = struct.unpack('<i','\0'+ bytes)[0] >> 8

But I still get an error message:

TypeError: must be str, not type

I guess this is easy to fix?

Answer 1

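That is not a nice file to process in Python! Python is great at processing text files, because it reads them in large chunks into an internal buffer and then iterates over the lines, but you cannot easily get at binary data that comes after text read that way. In addition, the struct module has no support for 24-bit values.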
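As a side note on the 24-bit limitation: a single 3-byte group can also be decoded without struct, using the built-in `int.from_bytes` (available since Python 3.2). The sketch below only illustrates that call on hand-written bytes; the helper name `int24_from_le` is made up for this example and is not part of any library.

```python
def int24_from_le(chunk: bytes) -> int:
    """Decode exactly 3 bytes as a signed little-endian 24-bit integer."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    # signed=True makes Python interpret the top bit of the last byte as the sign
    return int.from_bytes(chunk, byteorder="little", signed=True)


print(int24_from_le(b"\x01\x00\x00"))  # smallest positive value
print(int24_from_le(b"\xff\xff\xff"))  # all bits set
print(int24_from_le(b"\x00\x00\x80"))  # only the sign bit set
```

This does not solve the problem of skipping the header, which still has to be handled before the first 3-byte group is read.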

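The only way I can imagine is to read the file one byte at a time: first skip 52 end-of-line bytes, then read 3 bytes at a time, concatenate them into a 4-byte byte string, and unpack that.

Possible code could be: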

import struct

eol = b'\n'          # or whatever the end-of-line byte is in your file
nlines = 52          # number of header lines to skip

with open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode='rb') as f1:

    for i in range(nlines):       # process nlines lines
        t = b''                   # to store the content of each line
        while True:
            x = f1.read(1)        # one byte at a time
            if x == eol or not x: # one full line read, or end of file reached
                break
            else:
                t += x            # else concatenate into current line
        print(t)                  # to check the initial 52 lines

    while True:
        t = bytes((0,))               # struct only knows how to process 4-byte ints
        for i in range(3):            # so build one starting with a null byte
            t += f1.read(1)
        # print(t)
        if len(t) == 1: break         # reached end of file
        if len(t) < 4:                # reached end of file with an incomplete value
            print("Remaining bytes at end of file", t)
            break
        # the trick is that the integer division by 256 drops the initial 0 byte and keeps the sign
        i = struct.unpack('<i', t)[0] // 256   # // for Python 3, only / for Python 2
        print(i, hex(i))                       # or any other more useful processing
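To see why prepending a null byte and dividing by 256 keeps the sign, it may help to run the trick on a few hand-written 3-byte groups. This is only a small check under the assumption that the data really is little-endian signed 24-bit; the helper name `unpack_int24` is invented for the example.

```python
import struct


def unpack_int24(chunk):
    """Decode 3 little-endian bytes by padding them to a 4-byte int."""
    # The null byte becomes the LOWEST byte, so the 24-bit value ends up
    # shifted left by 8 bits and its sign bit lands on the 32-bit sign bit.
    padded = b"\x00" + chunk
    # Floor division by 256 undoes the shift and preserves the sign.
    return struct.unpack("<i", padded)[0] // 256


for raw in (b"\x01\x00\x00", b"\xff\xff\x7f", b"\xff\xff\xff", b"\x00\x00\x80"):
    print(raw.hex(), unpack_int24(raw))
```

The four inputs cover the smallest positive value, the largest positive value, minus one, and the most negative value of a 24-bit signed integer.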

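Remark: the code above assumes that your description of 52 lines (each terminated by an end of line) is correct, but the image shown suggests that the last line is not terminated that way. In that case, you should first count 51 lines and then skip the content of the last line.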

def skiplines(fd, nlines, eol):
    for i in range(nlines):       # process nlines lines
        t = b''                   # to store the content of each line
        while True:
            x = fd.read(1)        # one byte at a time
            if x == eol or not x: # one full line read, or end of file reached
                break
            else:
                t += x            # else concatenate into current line
        # print(t)                  # to check the initial 52 lines

with open('/Users/anais/Documents/CR_lab/Lab_files/labtest.raw', mode = 'rb') as f1:
    skiplines(f1, 51, b'\n')     # skip 51 lines terminated with a \n
    skiplines(f1, 1, b'>')       # skip last line assuming it ends at the >

    ...
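Since the goal of the question is an ASCII file of integers, the pieces above could be combined roughly as follows. This is a sketch, not the answer's original code: the names `skip_lines` and `write_samples` are hypothetical, the header layout (lines ending in `\n`, last line ending at `>`) is the assumption from the remark above, and in-memory streams stand in for the real files so the example runs on its own.

```python
import io
import struct


def skip_lines(stream, count, terminator=b"\n"):
    """Consume `count` runs of bytes, each ending with `terminator` (or at EOF)."""
    for _ in range(count):
        while True:
            byte = stream.read(1)
            if byte == terminator or not byte:
                break


def write_samples(stream, out):
    """Write each signed 24-bit little-endian sample as one decimal line.

    Returns the number of leftover bytes (0, 1 or 2) at the end of the stream.
    """
    while True:
        chunk = stream.read(3)
        if len(chunk) < 3:
            return len(chunk)
        # Pad to 4 bytes with a low null byte, then drop it again with // 256
        value = struct.unpack("<i", b"\x00" + chunk)[0] // 256
        out.write(f"{value}\n")


# Made-up data: two header "lines" followed by three samples
fake_file = io.BytesIO(b"<html>\n<body>" + b"\x01\x00\x00\xff\xff\xff\x00\x00\x80")
ascii_out = io.StringIO()

skip_lines(fake_file, 1, b"\n")   # header lines terminated by a newline
skip_lines(fake_file, 1, b">")    # last header line assumed to end at '>'
leftover = write_samples(fake_file, ascii_out)

print(ascii_out.getvalue(), end="")
print("leftover bytes:", leftover)
```

With a real file, `fake_file` would be replaced by the file opened with `mode='rb'` and `ascii_out` by an output file opened in text mode, with the line counts adjusted to the actual header.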
