Adding new data to an HDF5 file results in an empty array

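While experimenting with the HDF5 package for Python (h5py), I ran into some strange behavior. I want to append more data to an existing table, but I cannot get it to work. As the source code shows, I get the last row of the data under key 'X' with fromRow = hf["X"].shape[0] and then write tempArray2. The result is an empty table: the newly added rows are all zeros.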

import h5py

tempArray1 = [[0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443], [0.9293237924575806, -0.32789671421051025, 0.18110771477222443]]
tempArray2 = [[3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14], [3.1387749004352372e-06, 8.120089097236803e+27, -1.645612730013634e-14]]

with h5py.File('data.hdf5', 'w') as hf:
    # Add data to new file
    dset = hf.create_dataset("X", data=tempArray1, compression="gzip", chunks=True, maxshape=(None,3), dtype='f4') # Size is as the size of tempArray1

    # Append data existing file
    hf["X"].resize((hf["X"].shape[0] + 10, 3)) # Size is as the size of X+ 10
    fromRow = hf["X"].shape[0]
    hf["X"][fromRow:] = tempArray2


Key: X
 [[ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.9293238  -0.3278967   0.18110771]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]]
Length of data: 20

Strangely, when I replace the value of fromRow with the number 10, i.e. fromRow = 10, which is the end of the existing table, it works.


Key: X
 [[ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 9.2932379e-01 -3.2789671e-01  1.8110771e-01]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]
 [ 3.1387749e-06  8.1200891e+27 -1.6456127e-14]]
Length of data: 20


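You are reading fromRow after resizing the X dataset, but you need its value from before the resize. After the resize, hf["X"].shape[0] is already 20, so the slice hf["X"][fromRow:] selects no rows and the assignment writes nothing, which leaves the new rows at their fill value of 0. See the code below.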

with h5py.File('data.hdf5', 'w') as hf:
    # Add data to new file
    dset = hf.create_dataset("X", data=tempArray1, compression="gzip", chunks=True, maxshape=(None,3), dtype='f4') # Size is as the size of tempArray1
    # New location: read fromRow before resizing
    fromRow = hf["X"].shape[0]

    # Append data existing file
    hf["X"].resize((hf["X"].shape[0] + 10, 3)) # Size is as the size of X+ 10
    hf["X"][fromRow:] = tempArray2
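
If you append more than once, it may help to wrap the "read the size, resize, write" sequence in a small helper so the order cannot get mixed up again. The following is only a sketch: append_rows is a name made up for this example (it is not part of h5py), the file name and the constant placeholder data are assumptions, and compression is left out for brevity.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(dset, rows):
    """Append rows to a resizable 2-D dataset; return the index of the first new row.

    Hypothetical helper for illustration, not an h5py function.
    """
    rows = np.asarray(rows, dtype=dset.dtype)
    start = dset.shape[0]                        # row count BEFORE resizing
    dset.resize(start + rows.shape[0], axis=0)   # grow along the first axis only
    dset[start:] = rows                          # fills exactly the newly added rows
    return start


# Placeholder data standing in for tempArray1 and tempArray2
first = np.full((10, 3), 1.5, dtype="f4")
second = np.full((10, 3), 2.5, dtype="f4")

# Write to a temporary location so the example does not touch existing files
path = os.path.join(tempfile.mkdtemp(), "append_demo.hdf5")
with h5py.File(path, "w") as hf:
    dset = hf.create_dataset("X", data=first, chunks=True,
                             maxshape=(None, 3), dtype="f4")
    start = append_rows(dset, second)
    result = dset[:]                             # read everything back as a NumPy array
```

The helper returns the old row count, so the caller can see where the appended block starts. The dataset still has to be created with maxshape=(None, 3) (or similar), otherwise resize will fail.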


