How to insert/edit a column in an existing HDF5 dataset

I have an HDF5 file as seen below. I would like to edit the index column and create a new timestamp index. Is there any way to do this?

(image: no description provided)

This isn't possible unless you have the schema / specification used to create the HDF5 file in the first place.

Many things can go wrong if you attempt to use HDF5 files like a spreadsheet (even via h5py). For example:

  • Inconsistent chunk shape, compression, data types.
  • Homogeneous data becoming non-homogeneous.
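Before changing anything, it can help to check which storage settings the existing dataset uses, since any rewritten data would have to match them. The following is a minimal sketch, assuming h5py and NumPy are installed; `describe_dataset`, the file `demo.h5`, and the dataset name `table` are hypothetical names used only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def describe_dataset(hdf_file, path):
    """Return the storage settings that any replacement data would need to match."""
    with h5py.File(hdf_file, "r") as hdf:
        dset = hdf[path]
        return {
            "shape": dset.shape,
            "dtype": str(dset.dtype),
            "chunks": dset.chunks,            # None if the dataset is not chunked
            "compression": dset.compression,  # None if the dataset is not compressed
        }


# Build a small throwaway file so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as hdf:
    hdf.create_dataset(
        "table",  # hypothetical dataset name
        data=np.zeros((100, 4), dtype="float64"),
        chunks=(10, 4),
        compression="gzip",
    )

info = describe_dataset(demo_path, "table")
print(info)
```

If the values you want to write do not fit the reported dtype or shape, that is a sign the dataset should not be edited in place.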

What you could do is add a list as an attribute to the dataset. In fact, this is probably the right thing to do. The sample code below takes the attributes as a dictionary. When you read the data back in, you link the attributes to the homogeneous data (by row, column, or some other identifier).

import os

import h5py


def add_attributes(hdf_file, attributes, path='/'):
    """Add or change attributes in path provided.
    Default path is root group.
    """
    # Validate the inputs before opening the file.
    assert os.path.isfile(hdf_file), "File Not Found Exception '{0}'.".format(hdf_file)
    assert isinstance(attributes, dict), "attributes argument must be a key: value dictionary: {0}".format(type(attributes))

    # 'r+' opens an existing file for reading and writing.
    with h5py.File(hdf_file, 'r+') as hdf:
        for k, v in attributes.items():
            # Assigning to an existing key overwrites that attribute.
            hdf[path].attrs[k] = v

    return "The following attributes have been added or updated: {0}".format(list(attributes.keys()))
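To illustrate the read side, here is a minimal sketch of storing a timestamp index as an attribute and linking it back to the rows when the data is read. It assumes h5py and NumPy are installed and that the timestamps are kept as epoch seconds (an assumption made here because integer arrays store cleanly as attributes); the file `demo.h5`, the dataset name `table`, and the attribute name `timestamps` are all hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone

import h5py
import numpy as np

demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
values = np.arange(6, dtype="float64").reshape(3, 2)
# Hypothetical timestamp index: one epoch-seconds value per row.
stamps = np.array([1577836800, 1577840400, 1577844000], dtype="int64")

# Write the homogeneous data and attach the index as an attribute.
with h5py.File(demo_path, "w") as hdf:
    dset = hdf.create_dataset("table", data=values)
    dset.attrs["timestamps"] = stamps

# Read both back; the dataset itself is left unchanged.
with h5py.File(demo_path, "r") as hdf:
    dset = hdf["table"]
    data = dset[...]
    index = dset.attrs["timestamps"]

# Link each row to its timestamp by position.
rows_by_time = {
    datetime.fromtimestamp(int(ts), tz=timezone.utc): row
    for ts, row in zip(index, data)
}
first = min(rows_by_time)
print(first.isoformat(), rows_by_time[first])
```

Attributes are meant for small metadata, so this approach suits a short index; a very long index is probably better stored as its own dataset alongside the data.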
