Initializing or populating multiple numpy arrays from h5 file groups
I have an h5 file with 5 groups, each group containing a 3D dataset. I am looking to build a for loop that lets me extract each group into a numpy array and assign the numpy array to an object named after the group header. I can get a number of different methods to work with one group, but when I try to build a for loop that applies the code to all 5 groups, it breaks. For example:
import h5py as h5
import numpy as np
f = h5.File("FFM0012.h5", "r+") #read in h5 file
print(list(f.keys())) #['FFM', 'Image'] for my dataset
FFM = f['FFM'] #Generate object with all 5 groups
print(list(FFM.keys())) #['Amp', 'Drive', 'Phase', 'Raw', 'Zsnsr'] for my dataset
Amp = FFM['Amp'] #Generate object for 1 group
Amp = np.array(Amp) #Turn into numpy array, this works.
Now when I try to apply the same logic with a for loop:
h5_keys = []
FFM.visit(h5_keys.append) #Create list of group names ['Amp', 'Drive', 'Phase', 'Raw', 'Zsnsr']
for h5_key in h5_keys:
    tmp = FFM[h5_key]
    h5_key = np.array(tmp)
print(Amp[30,30,30]) #To check that array is populated
When I run this code I get "NameError: name 'Amp' is not defined". I've tried initializing the numpy array before the for loop with:
h5_keys = []
FFM.visit(h5_keys.append) #Create list of group names
Amp = np.array([])
for h5_key in h5_keys:
    tmp = FFM[h5_key]
    h5_key = np.array(tmp)
print(Amp[30,30,30]) #To check that array is populated
This produces the error message "IndexError: too many indices for array".
I've also tried generating a dictionary and creating numpy arrays from the dictionary. That is a similar story: I can get the code to work for one h5 group, but it falls apart when I build the for loop.
Any suggestions are appreciated!
You seem to have jumped to using h5py and numpy before learning much of Python. Consider what your loop does:
Amp = np.array([]) # creates a numpy array with 0 elements
for h5_key in h5_keys: # h5_key is set to a new value each iteration
    tmp = FFM[h5_key]
    h5_key = np.array(tmp) # now you reassign h5_key; Amp is never touched
print(Amp[30,30,30]) # Amp is still the original (0,) shape array
Try this basic Python loop, paying attention to the value of i:
alist = [1,2,3]
for i in alist:
    print(i)
    i = 10
    print(i)
print(alist) # no change to alist
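The same point applies to names. Here is a minimal sketch, using a plain dict as a stand-in for the h5 group (the names `source`, `results`, and `last_value` and the values are made up for illustration). Assigning to the loop variable only rebinds that variable; it never creates a variable called Amp. Storing each result under its key in a dictionary keeps all of them.

```python
# Hypothetical stand-in for the h5 group: names mapped to plain lists.
source = {"Amp": [1, 2, 3], "Drive": [4, 5, 6]}

# Rebinding the loop variable: no variable named Amp or Drive is created.
for key in source:
    key = list(source[key])  # key refers to a list only until the next iteration

last_value = key  # only the value from the final iteration survives

# Accumulating in a dictionary keeps every result, keyed by name.
results = {}
for key in source:
    results[key] = list(source[key])

print(last_value)
print(results["Amp"], results["Drive"])
```

A dictionary lookup such as `results["Amp"]` then takes the place of a separately named variable.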
f is the file.
FFM = f['FFM']
is a group.
Amp = FFM['Amp']
is a dataset. There are various ways of loading the dataset into a numpy array. I believe [...] slicing is the currently preferred one. .value used to be used but is now deprecated, and was removed in h5py 3.0 (see "loading dataset").
Amp = FFM['Amp'][...]
is an array.
alist = [FFM[key][...] for key in h5_keys]
should create a list of arrays from the FFM group.
If the shapes are compatible, you can concatenate the arrays into one array:
np.array(alist)
np.stack(alist)
np.concatenate(alist, axis=0) # or other axis
etc.
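To see how these options differ, here is a small sketch with made-up arrays of equal shape standing in for the loaded datasets (the shapes and fill values are assumptions chosen only for illustration). `np.stack` and `np.array` add a new leading axis, while `np.concatenate` joins along an axis that already exists.

```python
import numpy as np

# Three made-up arrays with the same shape, standing in for loaded datasets.
alist = [np.zeros((2, 3, 4)), np.ones((2, 3, 4)), np.full((2, 3, 4), 2.0)]

stacked = np.stack(alist)               # adds a new leading axis
as_array = np.array(alist)              # same result when all shapes match
joined = np.concatenate(alist, axis=0)  # joins along the existing first axis

print(stacked.shape)   # (3, 2, 3, 4)
print(as_array.shape)  # (3, 2, 3, 4)
print(joined.shape)    # (6, 3, 4)
```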
adict = {key: FFM[key][...] for key in h5_keys}
should create a dictionary of arrays keyed by dataset name.
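As a rough end-to-end sketch of the dictionary approach, the block below builds a small in-memory file with the same layout as in the question and loads every dataset under FFM into a dict. The file name `demo.h5`, the array shape, and the fill values are placeholders, not the real FFM0012.h5 contents; `driver="core"` with `backing_store=False` is used only so that nothing is written to disk.

```python
import h5py
import numpy as np

names = ["Amp", "Drive", "Phase", "Raw", "Zsnsr"]

# Build a small in-memory file laid out like the one in the question
# (file name and data are placeholders for illustration).
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    FFM = f.create_group("FFM")
    for i, name in enumerate(names):
        FFM.create_dataset(name, data=np.full((4, 4, 4), float(i)))

    # One numpy array per dataset, keyed by the dataset name.
    arrays = {key: FFM[key][...] for key in FFM.keys()}

# The values are ordinary numpy arrays, so they remain usable after the file closes.
print(sorted(arrays))
print(arrays["Amp"].shape, arrays["Zsnsr"][1, 1, 1])
```

For a flat group like this one, `FFM.keys()` gives the same names that `FFM.visit(h5_keys.append)` collects, and `arrays["Amp"][30, 30, 30]` would be the equivalent of the check in the question, given a large enough dataset.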
In Python, lists and dictionaries are the ways of accumulating objects. The h5py groups behave much like dictionaries. Datasets behave much like numpy arrays, though they remain on the disk until loaded with [...].
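A short sketch of that behavior, again with a throwaway in-memory file (the name `demo_lazy.h5` and the data are assumptions for illustration): the group supports dictionary-style access, the dataset exposes its shape and dtype without being read into numpy, and slicing returns a numpy array holding only the requested part.

```python
import h5py
import numpy as np

with h5py.File("demo_lazy.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("FFM")
    grp.create_dataset("Amp", data=np.arange(24).reshape(2, 3, 4))

    # Dictionary-style access on the group.
    names = list(grp.keys())
    has_amp = "Amp" in grp

    dset = grp["Amp"]                        # an h5py.Dataset, not yet a numpy array
    is_dataset = isinstance(dset, h5py.Dataset)
    shape, dtype = dset.shape, dset.dtype    # metadata, available without loading

    part = dset[0, :, :2]                    # reads only this slice into numpy
    full = dset[...]                         # reads the whole dataset into numpy

print(names, has_amp, is_dataset, shape)
print(type(part).__name__, part.shape, full.shape)
```

Slicing only what is needed can matter for large 3D datasets, since `[...]` pulls the whole array into memory.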