
Better way to store a set of files with arrays?

I've accumulated a set of 500 or so files, each of which has an array and header that stores metadata. Something like:

2,.25,.9,26   #<-- header, which is actually cryptic metadata
1.7331,0
1.7163,0
1.7042,0
1.6951,0
1.6881,0
1.6825,0
1.678,0
1.6743,0
1.6713,0

I'd like to read these arrays into memory selectively. We've built a GUI that lets users select one or more files from disk; each selected file is then read into the program. If a user wants to read in all 500 files, the program is slow because it opens and closes each file individually. My question is: would it speed things up to store all of this in a single structure, something like HDF5? Ideally that would give faster access than 500 individual files. What is the best way to go about this? I haven't dealt with these kinds of considerations before, so what's the best way to remove this bottleneck in Python? The total data is only a few megabytes, so I'd even be open to keeping it inside the program itself rather than on disk (but I don't know how to do that).
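For concreteness, something along these lines is roughly what I imagine for the HDF5 route (an untested sketch assuming h5py is available and that each file looks like the sample above, one header line followed by one value,flag pair per line; the output filename and dataset names are just placeholders):

import h5py
import numpy as np

def pack_files(filenames, out_path="all_data.h5"):
    """Pack every small file into one HDF5 file, one dataset per input file."""
    with h5py.File(out_path, "w") as h5:
        for fname in filenames:
            with open(fname) as f:
                header = f.readline().split("#")[0].strip()  # cryptic metadata line
                data = np.loadtxt(f, delimiter=",")          # remaining value,flag rows
            dset = h5.create_dataset(fname, data=data)
            dset.attrs["header"] = header                    # keep the metadata next to its array

def load_selected(names, path="all_data.h5"):
    """Read back only the datasets the user picked in the GUI."""
    with h5py.File(path, "r") as h5:
        return {name: (h5[name].attrs["header"], h5[name][...]) for name in names}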

Reading 500 files in Python should not take much time, since the overall data size is only a few MB. The data structure inside each file is plain and simple, so parsing it should not take much time either.
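As a rough illustration, a single pass like this (a sketch, assuming the one-header-line-then-value,flag-pairs layout from the question and a placeholder glob pattern) reads everything into a plain dict in memory, which for a few MB should be quick:

import glob
import numpy as np

def load_all(pattern="data/*.txt"):                # placeholder pattern for your 500 files
    cache = {}
    for path in glob.glob(pattern):
        with open(path) as f:
            header = f.readline().strip()          # cryptic metadata line
            values = np.loadtxt(f, delimiter=",")  # remaining value,flag pairs
        cache[path] = (header, values)
    return cache                                   # keep this dict around; GUI selections become lookups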

If the actual slowness is due to opening and closing the files, then it may be an OS-related issue (the machine may simply have very poor I/O).

Have you timed how long it actually takes to read all the files?
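For instance, a quick measurement like this (a sketch with a placeholder glob pattern) isolates the raw open/read cost from any parsing your program does:

import glob
import time

paths = glob.glob("data/*.txt")        # placeholder pattern for your 500 files
start = time.perf_counter()
for path in paths:
    with open(path) as f:
        f.read()                       # raw read only, to measure I/O without parsing
elapsed = time.perf_counter() - start
print(f"Opened and read {len(paths)} files in {elapsed:.3f} s")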

You can also try a small database such as SQLite, where you can store the file data and fetch the required records on the fly.
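A minimal sketch of that idea with the standard-library sqlite3 module (the database name, table, and column names are made up, and the file body is stored as raw text here just to keep it short):

import sqlite3

conn = sqlite3.connect("arrays.db")
conn.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, header TEXT, body TEXT)")

def store(name, header, body):
    """Insert or overwrite one file's header and body."""
    conn.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", (name, header, body))
    conn.commit()

def fetch(names):
    """Fetch only the records the user selected."""
    placeholders = ",".join("?" * len(names))
    rows = conn.execute(f"SELECT name, header, body FROM files WHERE name IN ({placeholders})", names)
    return {name: (header, body) for name, header, body in rows}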
