Parsing the LINEMOD 6D Pose Estimation Dataset

Question

I am trying to use the dataset from the widely cited LINEMOD paper used in 6D pose estimation. Their dataset is available at http://campar.in.tum.de/Main/StefanHinterstoisser

Their depth data appears to be in a one-off binary format that requires a special function to load. I would need to write a C++ program that wraps the provided function (which depends on OpenCV), then figure out the best way to extract the numbers from the resulting image object and export them. This is a lot of effort for someone who spends all day in Python and other high-level languages. Has anyone already converted the depth data into a more generic or Python-friendly format? I have looked around but found nothing.

Also, the C++ function is brief but cryptic to my untrained eyes. I suspect that someone skilled in both C++/OpenCV and Python could look at the source code and write an elegant program that does the analogous file reading in Python. I have pasted its contents below for convenience.

http://campar.in.tum.de/personal/hinterst/index/downloads!09384230443!/loadDepth.txt

IplImage * loadDepth( std::string a_name )
{
    std::ifstream l_file(a_name.c_str(),std::ofstream::in|std::ofstream::binary );

    if( l_file.fail() == true ) 
    {
        printf("cv_load_depth: could not open file for writing!\n");
        return NULL; 
    }
    int l_row;
    int l_col;

    l_file.read((char*)&l_row,sizeof(l_row));
    l_file.read((char*)&l_col,sizeof(l_col));

    IplImage * lp_image = cvCreateImage(cvSize(l_col,l_row),IPL_DEPTH_16U,1);

    for(int l_r=0;l_r<l_row;++l_r)
    {
        for(int l_c=0;l_c<l_col;++l_c)
        {
            l_file.read((char*)&CV_IMAGE_ELEM(lp_image,unsigned short,l_r,l_c),sizeof(unsigned short));
        }
    }
    l_file.close();

    return lp_image;
}

Thank you for your help on this!

Answer 1

After some trial and error, the snippet below seems to work. I hope it is useful to others with the same question.

import struct
import numpy as np

cpp_int_size = 4     # assumes a 4-byte C++ int (found by trial and error)
cpp_ushort_size = 2  # size of a C++ unsigned short
with open('ape/data/depth811.dpt', 'rb') as f:
    rows_b = f.read(cpp_int_size)
    cols_b = f.read(cpp_int_size)

    R = struct.unpack('<i', rows_b)[0]  # little-endian
    C = struct.unpack('<i', cols_b)[0]
    depth_image_str = f.read(R * C * cpp_ushort_size)
# np.fromstring is deprecated for binary data; np.frombuffer reads the same bytes
depth_img = np.frombuffer(depth_image_str, dtype='<u2').reshape([R, C])
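
For reuse across many files, the same steps can be wrapped in a function. This is only a minimal sketch: the name load_depth is my own (borrowed from the C++ loadDepth, not part of the dataset tools), and it assumes what the snippet above assumes, namely a 4-byte little-endian int for each of rows and cols, followed by rows * cols little-endian 16-bit unsigned values in row-major order. The size checks are there so that a wrong guess about the layout fails loudly instead of producing a garbled image.

```python
import struct

import numpy as np


def load_depth(path):
    """Read a LINEMOD-style .dpt depth file into a (rows, cols) uint16 array.

    Assumed layout (inferred from the C++ loader, not from a format spec):
    int32 rows, int32 cols, then rows * cols uint16 values, all little-endian.
    """
    header_size = 8  # two 4-byte ints: rows, then cols
    with open(path, "rb") as f:
        header = f.read(header_size)
        if len(header) != header_size:
            raise ValueError("file is too short to contain the 8-byte header")
        rows, cols = struct.unpack("<ii", header)
        if rows <= 0 or cols <= 0:
            raise ValueError(f"implausible image size: {rows} x {cols}")
        expected_bytes = rows * cols * 2  # 2 bytes per unsigned short
        payload = f.read(expected_bytes)
    if len(payload) != expected_bytes:
        raise ValueError(
            f"expected {expected_bytes} bytes of depth data, got {len(payload)}"
        )
    # frombuffer gives a read-only view; astype returns a writable,
    # native-endian copy that is safe to modify or save.
    return np.frombuffer(payload, dtype="<u2").reshape(rows, cols).astype(np.uint16)
```

Once loaded, the array can be written to a more generic format, for example with np.save('depth811.npy', load_depth('ape/data/depth811.dpt')).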
