
Extraction of contour positions and orientations of an image

I'm basically following a paper, "Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system".

Here the author has extracted a vector of 9 features from each sliding window. Quoting the paper:

The first three features are the weight of the window, its centre of gravity and the second order moment of the window.

Features four and five define the position of the upper and lower contour in the window, features six and seven give the orientation of the upper and lower contour by the gradient of the contour at the windows position, feature eight gives the number of black to white transitions in vertical direction, while feature nine gives the number of black pixels between the upper and lower contour.

I managed to calculate the first three features the paper is talking about, but I am having trouble understanding features 4, 5, 6, 7 and 8.

I can calculate the contour of an image. Suppose this is a window of one of the text lines (the window is 14 pixels wide, as suggested by the paper):

[image: the sliding window]

And this is the extracted contour of the image:

[image: the extracted contour]

So what exactly are the upper and lower contours here? Where should I take the limits from? If it refers to the top and bottom pixels, then I could have extracted those without any contour extraction. The orientation of these contours is equally confusing.

I would really appreciate some guidance here.

I had a look at the paper, and I am pretty sure that "upper" and "lower" should be read as "uppermost" and "lowest". This makes sense because the authors pay special attention to preprocessing their data, which they normalize in both the horizontal and vertical directions. They take care to be robust to scale, writing angle, ...

I guess that features 4 and 5 can be the extremal ordinates of the contours, which, combined with features 6 and 7 (the gradients, i.e. the orientations), give a good idea of the shape of these parts of the contour.
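
To make this reading concrete, here is a minimal sketch in plain Python. It is only my interpretation, not the paper's implementation. The function names, the 0/1 list-of-rows image format (1 = ink, row 0 at the top) and the use of a least-squares line as the "gradient" are all assumptions of mine.

```python
def column_contours(window):
    """Return (upper, lower): per-column row index of the first and last ink pixel.

    `window` is a list of rows (row 0 at the top), each a list of 0/1 values
    where 1 means ink. Columns without ink get None in both lists.
    """
    height, width = len(window), len(window[0])
    upper, lower = [], []
    for x in range(width):
        ink_rows = [y for y in range(height) if window[y][x]]
        upper.append(ink_rows[0] if ink_rows else None)
        lower.append(ink_rows[-1] if ink_rows else None)
    return upper, lower


def slope(ys):
    """Least-squares slope of row index against column index, skipping None.

    Assumption: the paper's "gradient" is approximated by a fitted line.
    Rows grow downwards, so a negative slope means the contour rises
    towards the right.
    """
    pts = [(x, y) for x, y in enumerate(ys) if y is not None]
    if len(pts) < 2:
        return 0.0
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    var_x = sum((x - mean_x) ** 2 for x, _ in pts)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return cov_xy / var_x


def contour_features(window):
    """Features 4-7 under the 'extremal ordinate + gradient' reading."""
    upper, lower = column_contours(window)
    tops = [y for y in upper if y is not None]
    bottoms = [y for y in lower if y is not None]
    if not tops:  # empty window: there is no contour at all
        return None
    return {
        "upper_pos": min(tops),       # feature 4: uppermost ink row
        "lower_pos": max(bottoms),    # feature 5: lowest ink row
        "upper_slope": slope(upper),  # feature 6
        "lower_slope": slope(lower),  # feature 7
    }


# Tiny made-up window: a stroke rising from bottom-left to top-right.
window = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
print(contour_features(window))
```

Note that nothing here needs a contour-tracing step: the first and last ink pixel per column already give the upper and lower contour under this reading, which matches the suspicion in the question.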

Feature 9 will, I guess, mostly be useful for telling apart letters that can have similar vertical shapes, such as "i", "l", "j".
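
Features 8 and 9 from the quoted description can be sketched per pixel column in the same spirit. Again, this is a guess at the intent rather than the authors' code. The helper names are hypothetical, 1 = ink with index 0 at the top is my convention, and counting only the pixels strictly between the two contour pixels is an assumption. How to aggregate the per-column values over a 14-pixel window (sum, mean, ...) is not stated in the quote, so I leave that open.

```python
def vertical_transitions(column):
    """Feature 8 (as I read it): ink-to-background transitions going down.

    `column` is a list of 0/1 values, index 0 at the top, 1 = ink.
    Ink that touches the bottom edge has no pixel below it, so it does
    not add a transition here.
    """
    return sum(1 for a, b in zip(column, column[1:]) if a and not b)


def ink_between_contours(column):
    """Feature 9 (as I read it): ink strictly between upper and lower contour.

    Assumption: the two contour pixels themselves are excluded.
    """
    ink_rows = [y for y, v in enumerate(column) if v]
    if len(ink_rows) < 2:
        return 0
    return len(ink_rows) - 2


def column_of(window, x):
    """Extract column x from a list-of-rows window."""
    return [row[x] for row in window]


# Made-up column: a two-pixel stroke, a single pixel, and ink on the bottom edge.
example = [0, 1, 1, 0, 1, 0, 0, 1]
print(vertical_transitions(example), ink_between_contours(example))
```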

This is my understanding. Hope this helps!
