Reading an array of structures from file

I have the following task: I need to read an array of structures from a file. Reading a single structure is no problem:

import struct

structFmt = "=64s 2L 3d"    # char[ 64 ] long[ 2 ] double [ 3 ]
structLen = struct.calcsize( structFmt )
with open( "path/to/file", "rb" ) as f:
    structBytes = f.read( structLen )
s = struct.unpack( structFmt, structBytes )

Reading an array of "simple" types is no problem either:

import array

with open( "path/to/file", "rb" ) as f:
    a = array.array( 'i' )
    a.fromfile( f, 1024 )

But reading 1024 structures of format structFmt from a file is a problem (for me, at least). I think that calling struct.unpack 1024 times and appending each result to a list is unnecessary overhead. I do not want to use external dependencies like numpy.

I would look at mmapping the file and then using the ctypes class method from_buffer(). This maps a ctypes-defined array of structs onto the buffer; see http://docs.python.org/library/ctypes#ctypes-arrays .

This lays the structs over the mmapped file without having to explicitly read, convert, and copy anything.

I don't know whether the end result will be appropriate, though.

Just for fun, here is a quick example using mmap. (I created the test file with dd: dd if=/dev/zero of=./test.dat bs=96 count=10240)

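from ctypes import Structure
from ctypes import c_char, c_long, c_double
import mmap


# Note: c_long is platform-sized (8 bytes on most 64-bit Unix systems), so
# this layout can differ from the 96-byte "=64s 2L 3d" format in the question.
class StructFMT(Structure):
    _fields_ = [('ch', c_char * 64), ('lo', c_long * 2), ('db', c_double * 3)]

d_array = StructFMT * 1024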

def doit():
    f = open('test.dat','r+b')
    m = mmap.mmap(f.fileno(),0)
    data = d_array.from_buffer(m)

    for i in data:
        i.ch, i.lo[0]*10 ,i.db[2]*1.0   # just access each row and bit of the struct and do something, with the data.

    m.close()
    f.close()

if __name__ == '__main__':
    from timeit import Timer
    t = Timer("doit()", "from __main__ import doit")
    print(t.timeit(number=10))
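
If memory-mapping is not wanted, a ctypes array can also be filled with a single readinto() call, since ctypes arrays expose a writable buffer. The following is only a sketch under some assumptions: it swaps c_long for the fixed-width c_uint32 so that the layout matches the 96-byte "=64s 2L 3d" format from the question, and the names Record and load_records, as well as the temporary demo file, are hypothetical and not part of the original answer.

```python
import os
import struct
import tempfile
from ctypes import Structure, c_char, c_double, c_uint32, sizeof


class Record(Structure):
    # Assumption: fixed-width c_uint32 instead of c_long, so each record is
    # 64 + 2*4 + 3*8 = 96 bytes, the same size as "=64s 2L 3d".
    _fields_ = [("ch", c_char * 64), ("lo", c_uint32 * 2), ("db", c_double * 3)]


def load_records(path, count):
    """Fill a ctypes array of `count` records with one read (hypothetical helper).

    Returns the array and the number of complete records actually read.
    """
    records = (Record * count)()
    with open(path, "rb") as f:
        nbytes = f.readinto(records)  # bytes go straight into the array's memory
    return records, nbytes // sizeof(Record)


# Demo: write four records with struct.pack, then load them back
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(4):
        tmp.write(struct.pack("=64s 2L 3d", b"row", i, i + 1, 0.0, 1.0, float(i)))
    demo_path = tmp.name

records, n_read = load_records(demo_path, 1024)
os.remove(demo_path)
```

Unlike the mmap version, this copies the data into memory once, so the file can be closed before the records are used. Entries past n_read stay zero-filled.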

Alas, there is no analog of array that holds complex structs.

The usual technique is to make many calls to struct.unpack and append the results to a list.

import struct

structFmt = "=64s 2L 3d"    # char[ 64 ] long[ 2 ] double [ 3 ]
structLen = struct.calcsize( structFmt )
results = []
with open( "path/to/file", "rb" ) as f:
    while True:
        structBytes = f.read( structLen )
        if len( structBytes ) < structLen:
            break    # end of file (or a truncated trailing record)
        s = struct.unpack( structFmt, structBytes )
        results.append(s)

If you're concerned about efficiency, know that the struct module caches the compiled format between successive calls to struct.unpack, so the format string is not re-parsed for every record.
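
A possible way to cut the per-record overhead further while staying in the standard library is to read the whole block once and let iter_unpack (available since Python 3.4) walk over it. This is a minimal sketch, not the original answer's method: read_records and the temporary demo file are hypothetical names introduced for illustration.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("=64s 2L 3d")  # compiled once; 96 bytes per record


def read_records(path, count):
    """Read up to `count` records from `path` in one read (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read(RECORD.size * count)
    # iter_unpack needs a buffer whose length is a multiple of the record
    # size, so drop any truncated trailing record.
    usable = len(data) - len(data) % RECORD.size
    return list(RECORD.iter_unpack(data[:usable]))


# Demo: write three records to a temporary file, then read them back
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(3):
        name = ("name%d" % i).encode("ascii")
        tmp.write(RECORD.pack(name, i, i * 2, 0.5, 1.5, 2.5))
    demo_path = tmp.name

records = read_records(demo_path, 1024)
os.remove(demo_path)
```

Each element of records is a plain tuple, as with struct.unpack; the 64-byte string field comes back padded with NUL bytes, which can be removed with rstrip(b"\0") if needed.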
