Stuck with a simple pandas DataFrame loop

I have an HDF5 file that contains 28 datasets. Each dataset has different dimensions; for example, the first dataset is [60, 8] and the last one is [60, 1].

I want to loop through the HDF5 file, read all the data in each dataset, and write it to a pandas DataFrame. In the end I should have a DataFrame of size [60, 218]. So far I've tried the following code, but it hangs.

Could someone spot the error in my code and suggest a better way to do this?

import h5py
import pandas as pd

# Open the HDF5 file read-only
q = h5py.File('AM_B0_D3.7_2016-04-13T215000.flac.h5', 'r')

# Collect the names of the objects in the file
dataset_names_list = []
q.visit(dataset_names_list.append)

ten_min_df = pd.DataFrame()
for i in dataset_names_list:
    x = q[i][:]
    if x.shape[1] > 1:
        # Multi-column dataset: suffix the name with the column index
        col1 = [i + str(num) for num in range(0, x.shape[1])]
        temp = pd.DataFrame(data=x, columns=col1)
        ten_min_df = ten_min_df.append(temp)
    else:
        col2 = [i]
        temp = pd.DataFrame(data=x, columns=col2)
        ten_min_df = ten_min_df.append(temp)

I think you need a list of arrays, and then use numpy.concatenate with the DataFrame constructor:

import numpy as np
import pandas as pd

np.random.seed(452)

first=np.random.rand(3,5)
print (first)
[[ 0.88642869  0.42677701  0.89968857  0.87976326  0.07758206]
 [ 0.43617027  0.03221375  0.46398119  0.14226246  0.14237448]
 [ 0.22679517  0.60271752  0.85003435  0.5676184   0.87565266]]

second=np.random.rand(3,2) 
print (second)
[[ 0.89830548  0.27066452]
 [ 0.23907483  0.73784657]
 [ 0.09083235  0.98984701]]

third=np.random.rand(3,3)

L = [first, second, third]

df = pd.DataFrame(np.concatenate(L, axis=1))
print (df)
          0         1         2         3         4         5         6  \
0  0.886429  0.426777  0.899689  0.879763  0.077582  0.898305  0.270665   
1  0.436170  0.032214  0.463981  0.142262  0.142374  0.239075  0.737847   
2  0.226795  0.602718  0.850034  0.567618  0.875653  0.090832  0.989847   

          7         8         9  
0  0.837404  0.090284  0.764517  
1  0.564904  0.489809  0.254518  
2  0.426737  0.364310  0.328396  
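
Applied to the question, the same idea might look like the sketch below. It is not tested against the original file: the dictionary `datasets` and the names `alpha`, `beta`, `gamma` are hypothetical stand-ins for the arrays that `q[name][:]` would return, and the column naming simply follows the scheme from the question. Note that `DataFrame.append` adds rows rather than columns (and was removed in pandas 2.0), so collecting the arrays and concatenating once along `axis=1` is what produces the wide [60, N] frame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the HDF5 file: dataset name -> array with 60 rows.
# With h5py, each array would come from q[name][:] instead.
rng = np.random.default_rng(0)
datasets = {
    "alpha": rng.random((60, 8)),
    "beta": rng.random((60, 3)),
    "gamma": rng.random((60, 1)),
}

arrays = []
columns = []
for name, x in datasets.items():
    x = np.asarray(x)
    if x.ndim == 1:
        # Treat a 1-D dataset as a single column
        x = x.reshape(-1, 1)
    arrays.append(x)
    if x.shape[1] > 1:
        # Multi-column dataset: name0, name1, ...
        columns.extend(f"{name}{k}" for k in range(x.shape[1]))
    else:
        columns.append(name)

# One concatenation along the column axis, then one DataFrame construction
ten_min_df = pd.DataFrame(np.concatenate(arrays, axis=1), columns=columns)
print(ten_min_df.shape)  # (60, 12)
```

Building the lists first and concatenating once also avoids repeatedly copying a growing DataFrame inside the loop.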
