
Reading a complex binary file in Python

I am rather new to Python programming, so please keep your answer a bit simple.

I have a .raw file in 2b/2b complex short int format. It's actually a 2-D raster file. I want to read it and separate the real and imaginary parts. Let's say the raster is [MxN] in size.

Please let me know if the question is not clear.

Cheers N

You could do it with the struct module. Here's a simple example based on the file formatting information you mentioned in a comment:

import struct

def read_complex_array(filename, M, N):
    # "=" prefix: native byte order, standard sizes; "h" is a 2-byte signed short
    row_fmt = '={}h'.format(N)
    row_len = struct.calcsize(row_fmt)
    result = []
    with open(filename, "rb") as f:  # "f" avoids shadowing the built-in input()
        for row in range(M):
            reals = struct.unpack(row_fmt, f.read(row_len))  # N real parts
            imags = struct.unpack(row_fmt, f.read(row_len))  # N imaginary parts
            cmplx = [complex(r, i) for r, i in zip(reals, imags)]
            result.append(cmplx)
    return result

This will return a list of lists of complex numbers, as can be seen in this output from a trivial test I ran:

[
  [  0.0+  1.0j    1.0+  2.0j    2.0+  3.0j    3.0+  4.0j],
  [256.0+257.0j  257.0+258.0j  258.0+259.0j  259.0+260.0j],
  [512.0+513.0j  513.0+514.0j  514.0+515.0j  515.0+516.0j]
]
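
The test file behind that output is not shown. As a rough sketch, one way to build a file that reproduces it is below, assuming each row stores N 16-bit real parts followed by N 16-bit imaginary parts (the layout the reader above expects). The names write_test_file, read_rows, and demo_round_trip are made up for this illustration.

```python
import os
import struct
import tempfile

def write_test_file(filename, M, N):
    """Write M rows; each row is N int16 reals followed by N int16 imaginaries."""
    row_fmt = '={}h'.format(N)
    with open(filename, 'wb') as f:
        for row in range(M):
            reals = [row * 256 + k for k in range(N)]  # 0,1,2,... then 256,257,...
            imags = [r + 1 for r in reals]             # imaginary = real + 1
            f.write(struct.pack(row_fmt, *reals))
            f.write(struct.pack(row_fmt, *imags))

def read_rows(filename, M, N):
    """Read the file back into a list of M lists of N complex numbers."""
    row_fmt = '={}h'.format(N)
    row_len = struct.calcsize(row_fmt)
    result = []
    with open(filename, 'rb') as f:
        for row in range(M):
            reals = struct.unpack(row_fmt, f.read(row_len))
            imags = struct.unpack(row_fmt, f.read(row_len))
            result.append([complex(r, i) for r, i in zip(reals, imags)])
    return result

def demo_round_trip(M=3, N=4):
    """Write a temporary test file, read it back, and clean up."""
    fd, path = tempfile.mkstemp(suffix='.raw')
    os.close(fd)
    try:
        write_test_file(path, M, N)
        return read_rows(path, M, N)
    finally:
        os.remove(path)

if __name__ == '__main__':
    for row in demo_round_trip():
        print(row)
```

If your real file interleaves the values differently (for example real, imaginary, real, imaginary within a row), the reader has to be adjusted to match.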

In Python, a complex number is usually represented as a pair of machine-level double-precision floating-point numbers, one for the real part and one for the imaginary part. That is why the 16-bit integers from the file show up as floats in the output above.
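
Since the question asks for the real and imaginary parts separately, here is a small sketch of how to split a list of complex-number lists using the .real and .imag attributes. The function name split_real_imag and the sample data are only for illustration.

```python
def split_real_imag(rows):
    """Return two lists of lists: the real parts and the imaginary parts."""
    reals = [[z.real for z in row] for row in rows]
    imags = [[z.imag for z in row] for row in rows]
    return reals, imags

sample = [
    [complex(0, 1), complex(1, 2)],
    [complex(256, 257), complex(257, 258)],
]
reals, imags = split_real_imag(sample)
print(reals)  # [[0.0, 1.0], [256.0, 257.0]]
print(imags)  # [[1.0, 2.0], [257.0, 258.0]]
```

The parts come back as floats. If you need integers again, int(z.real) and int(z.imag) recover the original 16-bit values exactly, because every short int fits in a double without loss.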

You could also use the array module. Here's the same thing done with it:

import array

def read_complex_array2(filename, M, N):
    result = []
    with open(filename, "rb") as f:  # "f" avoids shadowing the built-in input()
        for row in range(M):
            reals = array.array('h')  # 'h' is a signed short
            reals.fromfile(f, N)      # read N real parts
            # reals.byteswap()  # if necessary
            imags = array.array('h')
            imags.fromfile(f, N)      # read N imaginary parts
            # imags.byteswap()  # if necessary
            cmplx = [complex(r, i) for r, i in zip(reals, imags)]
            result.append(cmplx)
    return result

As you can see, they're very similar, so it's not clear there's a big advantage to using one over the other. I suspect the array-based version might be faster, but you would have to time both with some real data to say so with any certainty.
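
To check this on your own machine, a timing sketch along these lines could be used. It writes a throwaway file of zeros and times both approaches with timeit. The names read_with_struct, read_with_array, and time_readers, as well as the default sizes, are assumptions for this example, and it assumes array's 'h' type is 2 bytes wide, which is typical but not guaranteed on every platform.

```python
import array
import os
import struct
import tempfile
import timeit

def read_with_struct(filename, M, N):
    row_fmt = '={}h'.format(N)
    row_len = struct.calcsize(row_fmt)
    result = []
    with open(filename, 'rb') as f:
        for row in range(M):
            reals = struct.unpack(row_fmt, f.read(row_len))
            imags = struct.unpack(row_fmt, f.read(row_len))
            result.append([complex(r, i) for r, i in zip(reals, imags)])
    return result

def read_with_array(filename, M, N):
    result = []
    with open(filename, 'rb') as f:
        for row in range(M):
            reals = array.array('h')
            reals.fromfile(f, N)
            imags = array.array('h')
            imags.fromfile(f, N)
            result.append([complex(r, i) for r, i in zip(reals, imags)])
    return result

def time_readers(M=200, N=200, repeats=5):
    """Time both readers on a temporary file; returns seconds per approach."""
    fd, path = tempfile.mkstemp(suffix='.raw')
    os.close(fd)
    try:
        with open(path, 'wb') as f:
            # 2 * M * N shorts: N reals plus N imaginaries for each of M rows
            array.array('h', [0] * (2 * M * N)).tofile(f)
        return {
            'struct': timeit.timeit(lambda: read_with_struct(path, M, N), number=repeats),
            'array': timeit.timeit(lambda: read_with_array(path, M, N), number=repeats),
        }
    finally:
        os.remove(path)

if __name__ == '__main__':
    print(time_readers())
```

The numbers depend on the machine, the Python version, and the raster size, so it is worth running this with M and N close to your real data.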

Take a look at the Hachoir library. It's designed for this purpose and does its job really well.
