
How to write and append to h5 file multiple times in Python?

I am trying to write datasets to an h5 file in the following way:

fpath = 'path-to-/data.h5'
with h5py.File(fpath,'w') as hf:
    hf.create_dataset('a', data=a)

Then I append more data to the file in the same script:

with h5py.File(fpath,'a') as hf:
    dset = hf.create_dataset('b',(nrow,1),maxshape=(nrow,None),chunks=(nrow,1))
    for i in range(ncol):
        dset[:,-1:] = b
        if i+1 < ncol:
            dset.resize(dset.shape[1]+1,axis=1)

I get the following error on the second operation (the append); the French error message means "No such file or directory":

OSError: Unable to create file (unable to open file: name = 'path-to-/data.h5', 
    errno = 2, error message = 'Aucun fichier ou dossier de ce type',
    flags = 13, o_flags = 242)

When I check the directory, the file path-to-/data.h5 exists, but without the appended datasets (checked with list(hf.keys())).

To make this work, I currently write everything in one step without using the with statement (as suggested in the question EDIT here).

hf = h5py.File(fpath,'w')
hf.create_dataset('a', data=a)
dset = hf.create_dataset('b',(nrow,1),maxshape=(nrow,None),chunks=(nrow,1))
for i in range(ncol):
    dset[:,-1:] = b
    if i+1 < ncol:
        dset.resize(dset.shape[1]+1,axis=1)
hf.close()

Here too, if I delete the written file and run the code again, it gives the same error as above, and it only runs when I change the file name (e.g. 'data_1.h5'). I don't understand this part, as I expected h5py.File(fpath,'w') to work regardless of whether the file already exists.

To summarise, the only way I found to make the code work is to use the second approach (write without append) and not alter (rename or move) the generated file.

I could not find it here, but is there a way to force writing and appending to an h5 file irrespective of its existence or of previous calls?

@nish-ant, I created a simple MCVE to demonstrate the 'w' and 'a' options with two simple datasets. It replicates your process (as I understand it) in one program: first I open the file with the 'w' option and close it, then I reopen it with the 'a' option. It works as expected. Review it and compare it to your code; it may help you identify the file access error.
I also successfully tested these combinations of file modes:
1. 'w' for the first open, then 'r+' for the second
2. 'a' for the first open, then 'a' for the second

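import h5py
import numpy as np

# Create the arrays to be saved
arr1 = np.arange(18.).reshape(3,6)
arr2 = 2.0*arr1

fpath = 'SO_55936567_data.h5'
# 'w' creates the file, truncating it if it already exists
with h5py.File(fpath,'w') as h5f:
    h5f.create_dataset('a', data=arr1)
# The with block closes the file on exit, so no explicit close() is needed

# 'a' reopens the file read/write, creating it if it does not exist
with h5py.File(fpath,'a') as h5f:
    h5f.create_dataset('b', data=arr2)

print('done')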
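
The same write-then-append flow can be sketched with the resizable dataset 'b' from the question. This is only a sketch under stated assumptions: the helper name write_then_append and its columns argument are hypothetical, and the os.makedirs call guards against a missing parent directory, which is one possible cause of errno 2 but not one confirmed for the question's setup.

```python
import os
import tempfile

import h5py
import numpy as np


def write_then_append(fpath, a, columns):
    """Write dataset 'a', then reopen the file and append columns to 'b'.

    Hypothetical helper for illustration; `columns` is a list of 1-D arrays
    that all have the same length.
    """
    # Assumption: errno 2 may come from a parent directory that does not exist.
    os.makedirs(os.path.dirname(os.path.abspath(fpath)), exist_ok=True)

    # 'w' creates the file, truncating any existing file at that path.
    with h5py.File(fpath, 'w') as hf:
        hf.create_dataset('a', data=a)

    nrow = columns[0].shape[0]
    # 'a' opens the file read/write and creates it if it does not exist.
    with h5py.File(fpath, 'a') as hf:
        dset = hf.create_dataset('b', (nrow, 1), maxshape=(nrow, None),
                                 chunks=(nrow, 1), dtype='f8')
        for i, col in enumerate(columns):
            if i > 0:
                # Grow the dataset by one column along axis 1.
                dset.resize(dset.shape[1] + 1, axis=1)
            # Fill the newest (last) column.
            dset[:, dset.shape[1] - 1] = col


# Usage: write into a fresh temporary directory so reruns start clean.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, 'subdir', 'data.h5')
    demo_cols = [np.full(3, float(k)) for k in range(4)]
    write_then_append(demo_path, np.arange(18.).reshape(3, 6), demo_cols)
    with h5py.File(demo_path, 'r') as hf:
        print(sorted(hf.keys()), hf['b'].shape)
```

Because the first open uses 'w', rerunning the helper replaces the whole file rather than failing on an existing dataset name; if the file must be kept between runs, opening with 'a' both times and checking `'b' in hf` before creating the dataset would be the alternative.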
