Unary Features and Cost Volume (Computer Vision)

I was reading a paper about disparity, and came across the following phrase:

"We use the deep unary features to compute the stereo matching cost by forming a cost volume."

I looked in the literature for definitions of 'unary features' and 'cost volume', yet struggled to find anything. Could someone clarify what these terms mean in the context of computer vision?

Take a single 2D patch (w×w×1). If you are looking for its most similar sibling in another image, every pixel of that image is a candidate. If you write the similarity of each candidate at its pixel location, you get a 2D image of similarities. You can call it a similarity surface, or a cost surface if you store, say, distances instead.
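As a rough illustration of such a cost surface, here is a minimal NumPy sketch. It assumes a sum of squared differences (SSD) as the distance and a square grayscale patch; the function name `cost_surface` and the random test image are made up for the example and do not come from the paper.

```python
import numpy as np

def cost_surface(patch, image):
    """SSD between `patch` and every same-sized window of `image` (assumed distance)."""
    w = patch.shape[0]
    # windows has shape (H - w + 1, W - w + 1, w, w): one w x w window per position
    windows = np.lib.stride_tricks.sliding_window_view(image, (w, w))
    return ((windows - patch) ** 2).sum(axis=(2, 3))

rng = np.random.default_rng(0)
image = rng.random((20, 30))
patch = image[5:10, 12:17].copy()   # a 5x5 patch cut out at row 5, column 12
surface = cost_surface(patch, image)

# The lowest point of the surface is the best match for the patch.
best = np.unravel_index(np.argmin(surface), surface.shape)
print(surface.shape, best)
```

The surface is slightly smaller than the image because only windows that fit entirely inside it are scored.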

In the paper, which I can't seem to access properly (I did see the archived HTML version of it), for W×H images they store the cost, or distance, between a feature in one image and all the pixels in a window around it. Since we have W×H pixels and the window is DX×DY, the full array holds W×H×DX×DY costs. So it is 4D, but they call it a "cost volume" by analogy.
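A sketch of that 4D array, under several assumptions: the features are per-pixel vectors stored as an (H, W, C) array, the window is square with radius `r` (so DX = DY = 2r + 1), the cost is a squared Euclidean distance, and positions outside the image get an infinite cost. The name `window_cost_volume` is hypothetical, and the axes are ordered (H, W, DY, DX) to follow NumPy's row-major habit.

```python
import numpy as np

def window_cost_volume(feat_a, feat_b, r):
    """cost[y, x, i, j] = ||feat_a[y, x] - feat_b[y + i - r, x + j - r]||^2."""
    H, W, _ = feat_a.shape
    k = 2 * r + 1
    cost = np.full((H, W, k, k), np.inf)   # inf marks offsets that leave the image
    for i, dy in enumerate(range(-r, r + 1)):
        for j, dx in enumerate(range(-r, r + 1)):
            # part of feat_a whose shifted position stays inside feat_b
            y0, y1 = max(0, -dy), min(H, H - dy)
            x0, x1 = max(0, -dx), min(W, W - dx)
            diff = feat_a[y0:y1, x0:x1] - feat_b[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
            cost[y0:y1, x0:x1, i, j] = (diff ** 2).sum(axis=-1)
    return cost

rng = np.random.default_rng(0)
feat_a = rng.random((10, 12, 4))
feat_b = feat_a.copy()              # identical features, so the zero offset costs 0
volume = window_cost_volume(feat_a, feat_b, r=2)
print(volume.shape)
```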

You also find cost volumes in stereo: for W×H images and D possible depths or disparities, we can build a W×H×D cost volume. If you only wanted the smallest cost for each pixel independently, you wouldn't need the full volume. But if you also consider the pixels together (two neighbours probably have the same depth), then you look at the full cost volume instead of just small slices of it.
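For the stereo case, a minimal sketch could look like the following. It assumes rectified images, raw intensities standing in for the learned features, an absolute difference as the cost, and the convention that pixel x in the left image matches pixel x - d in the right image. The name `stereo_cost_volume` is made up for the example.

```python
import numpy as np

def stereo_cost_volume(left, right, max_disp):
    """cost[y, x, d] = |left[y, x] - right[y, x - d]|; inf where x - d is outside the image."""
    H, W = left.shape
    cost = np.full((H, W, max_disp), np.inf)
    for d in range(max_disp):
        cost[:, d:, d] = np.abs(left[:, d:] - right[:, :W - d])
    return cost

rng = np.random.default_rng(0)
right = rng.random((8, 40))
true_disp = 3
left = np.roll(right, true_disp, axis=1)   # left[y, x] = right[y, x - 3], wrapping at the border

cost = stereo_cost_volume(left, right, max_disp=6)

# Winner-take-all: each pixel picks its cheapest disparity on its own.
disparity = cost.argmin(axis=2)
print(cost.shape, disparity[0, 10:15])
```

The per-pixel `argmin` only ever reads one slice of D values at a time. Methods that encourage neighbouring pixels to agree have to work on the whole `cost` array, which is why it is kept as a volume.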
