Delete subgroup from HDF5 file in Python
I am trying to delete a subgroup that I've written to an HDF5 file using h5py in Python. For example, according to the documentation, the subgroup called "MyDataset" can be deleted with:
del subgroup["MyDataset"]
I did that, and the subgroup is indeed no longer accessible. However, the file does not shrink. My question: is it possible to recover the space from deleted subgroups using h5py without having to rewrite the remaining subgroups into a completely new file? Below is a small example that illustrates what I mean:
import numpy as np
import h5py
myfile = h5py.File('file1.hdf5', 'a')  # mode given explicitly: h5py 3.0+ defaults to read-only 'r'
data = np.random.rand(int(1e6))
myfile.create_dataset("MyDataSet", data=data)
myfile.close()
Then I open the file and remove the previous entry:
myfile = h5py.File('file1.hdf5', 'a')
del myfile["MyDataSet"]
and if you try to get the data using:
myfile["MyDataSet"][()]  # originally .value, which was removed in h5py 3.0
you will see that the data is no longer accessible. However, if you check the size of the file, it is the same before and after calling del.
del myfile["MyDataSet"] modifies the File object, but does not modify the underlying file1.hdf5 file. The file1.hdf5 file is not modified until myfile.close() is called.
If you use a with-statement, myfile.close() will be called automatically for you when Python leaves the with-statement:
import numpy as np
import h5py
import os
path = 'file1.hdf5'
with h5py.File(path, "w") as myfile:
    data = np.random.rand(int(1e6))
    myfile.create_dataset("MyDataSet", data=data)
# the file is closed at this point
print(os.path.getsize(path))
with h5py.File(path, "a") as myfile:
    del myfile["MyDataSet"]
    try:
        myfile["MyDataSet"][()]  # originally .value, which was removed in h5py 3.0
    except KeyError as err:
        # print(err)
        pass
# closed again, so the deletion has been written to disk
print(os.path.getsize(path))
prints
8002144 <-- original file size
2144 <-- new file size
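The with-statement behaviour this relies on can be illustrated without h5py. The sketch below uses a hypothetical stand-in class, FakeFile, that only records whether close() ran. It is not h5py's implementation, just the general context-manager pattern, and it shows that close() runs even when an exception is raised inside the block.

```python
class FakeFile:
    """Hypothetical stand-in for a file object; only tracks whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called when Python leaves the with-block, with or without an error.
        self.close()
        return False  # do not swallow exceptions


def closed_after_with(raise_inside=False):
    """Return True if the stand-in file was closed after leaving the with-block."""
    f = FakeFile()
    try:
        with f:
            if raise_inside:
                raise KeyError("MyDataSet")
    except KeyError:
        pass
    return f.closed
```

Calling closed_after_with() and closed_after_with(raise_inside=True) both report that the object was closed, which is why the size check is done after the with-block rather than inside it.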
Notice that the first time, opening the File in write mode ("w") creates a new file; the second time, opening the File in append mode ("a", the default in h5py versions before 3.0) allows reading the existing file and modifying it.
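If the file still does not shrink in a particular setup, the fallback the question mentions, rewriting the remaining objects into a new file, might look like the sketch below. The helper name copy_remaining and the file paths are hypothetical. It assumes h5py's Group.copy method and copies only the top-level objects (with their contents) plus the root attributes.

```python
import h5py


def copy_remaining(src_path, dst_path):
    """Copy every top-level object of src_path into a new file at dst_path.

    Hypothetical helper: objects deleted from the source are simply not
    present any more, so they are not carried over into the new file.
    """
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        for name in src:
            # Group.copy copies datasets and groups, including their children.
            src.copy(name, dst, name=name)
        # Root-level attributes are not part of any object, so copy them too.
        for key, value in src.attrs.items():
            dst.attrs[key] = value
```

After copy_remaining('file1.hdf5', 'file1_repacked.hdf5'), os.path.getsize can be used on both paths to compare the sizes. The new file name here is only an example.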