
Reading HDF5 data from C++: How to read this specific format?

I have an HDF5 file that I need to read in C++, but I'm having trouble because the file's format seems a bit complicated.

The HDF5 file contains data saved from two devices. The data is a time series; it can be seen as two arrays, one for time and the second for the actual output from the device. The number of acquisitions is user-defined, but it is the same for both devices (as their data is acquired at the same time).

For example, one file will contain the data from, let's say, 10 acquisitions, organized in something similar to:

/Device1/Acquisition_000
/Device1/Acquisition_001
[...]
/Device2/Acquisition_000
/Device2/Acquisition_001
[...]

Each acquisition will contain a time array and a data array.

Here's a screenshot of what HDFView sees in the file: [screenshot: the file opened in HDFView]

I thought a "path" such as /Device2/Acquisition_000 was a dataset and tried to read it as such, but I'm having trouble. I then dumped the .h5 file using h5dump and got the following:

HDF5 "data.h5" {
GROUP "/" {
GROUP "Device1" {
    DATASET "Acquisition_000" {
        DATATYPE  H5T_COMPOUND {
            H5T_IEEE_F64BE "Time";
            H5T_IEEE_F64BE "Signal";
        }
        DATASPACE  SIMPLE { ( 270000 ) / ( 270000 ) }
        DATA {
        (0): {
            0,
            -0.0933597
            },
        (1): {
            2e-05,
            -0.0476648
            },
        (2): {
            4e-05,
            -0.0628964
            },
[...]

Now I don't know how I should read that structure. I saw the H5T_COMPOUND, so I tried the compound example from http://www.hdfgroup.org/HDF5/doc/cpplus_RM/compound_8cpp-example.html but dataset->read() does not seem to be able to read the data; valgrind reports access to uninitialized data when I print the data with std::cout in a loop.

Another source of confusion is the "H5T_IEEE_F64BE" in the dump; doesn't the BE part stand for big-endian? Both the machine generating the data and the one reading it are x86_64...

How can I read the "Time" and "Signal" arrays into C/C++ arrays?

For reference, here's my attempt at adapting the example:

const H5std_string FILE_NAME("data.h5");
const H5std_string DATASET_NAME("/Device1/Acquisition_000/");
H5File file(FILE_NAME, H5F_ACC_RDONLY);
DataSet dataset = file.openDataSet(DATASET_NAME);
const H5std_string MEMBER_TIME("time_name");
const H5std_string MEMBER_SIGN("signal_name");
// Try reading a single array:
CompType mtype3( sizeof(double) );
mtype3.insertMember(MEMBER_SIGN, 0, PredType::NATIVE_DOUBLE);
double *data_signal = new double[270000];
memset(data_signal, 0, 270000);
dataset.read(data_signal, mtype3);
// Print the data
for (int i = 0 ; i < 10 ; i++)
{
    std::cout << "data_signal[i=" << i << "] = " << data_signal[i] << std::endl;
}

and its output:

data_signal[i=0] = 0
data_signal[i=1] = 0
data_signal[i=2] = 0
data_signal[i=3] = 0
data_signal[i=4] = 0
data_signal[i=5] = 0
data_signal[i=6] = 0
data_signal[i=7] = 0
data_signal[i=8] = 0
data_signal[i=9] = 0

Additionally, Matlab can read the file using:

data = h5read('data.h5', '/Device1/Acquisition_000')
data = 

      Time: [270000x1 double]
    Signal: [270000x1 double]

Thanks a lot.

The member names are used to pull the correct data fields out of the file. "signal_name" doesn't match the name of the data in the file. Try using "Signal", the name shown by MATLAB and by the GUI viewer.

Eventually, you'll want to define a C++ structure that represents a time/signal pair, as in the compound example:

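struct dataPoint
{
    double timePoint;
    double signal;
};

// MEMBER_TIME and MEMBER_SIGN must hold the names stored in the file: "Time" and "Signal".
// HOFFSET gives each field's byte offset inside the struct.
CompType hdf5DataPointType( sizeof(dataPoint) );
hdf5DataPointType.insertMember(MEMBER_TIME, HOFFSET(dataPoint, timePoint), PredType::NATIVE_DOUBLE);
hdf5DataPointType.insertMember(MEMBER_SIGN, HOFFSET(dataPoint, signal), PredType::NATIVE_DOUBLE);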

Then read directly into an array of dataPoint.

Technical posts on this site are licensed under CC BY-SA 4.0; if you republish this post, cite this site's URL or the original address.

 