
Putting a bounding box around text in an image

I want to compare two screenshots containing text. Both screenshots contain neatly formatted text. I want to check whether the same formatting is reflected in both pictures and whether the same text appears at the same location in both images.

Here is how I am doing it right now:

  1. Apply a bilateral filter to remove the underlines of the text.
  2. Apply a threshold with 180 as the minimum value to clear them out.
  3. Apply a Gaussian blur to the image to fill in the empty space between the characters.
  4. Apply a threshold again, with 250 as the minimum value.
  5. Compute the contours in the images.
  6. Draw a rectangular bounding box around each contour.
  7. Use an O(n^2) algorithm to find the rectangle with the maximum overlap and compare the text within it.
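
To make step 7 concrete, here is a minimal sketch of the pairwise overlap search. It is only an illustration under assumptions: rectangles are taken to be (x, y, width, height) tuples, which is the layout cv2.boundingRect returns, and the names overlap_area and best_matches are hypothetical helpers, not part of any library.

```python
from typing import List, Optional, Tuple

# Assumed layout: (x, y, width, height), as returned by cv2.boundingRect.
Rect = Tuple[int, int, int, int]


def overlap_area(a: Rect, b: Rect) -> int:
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def best_matches(rects_a: List[Rect], rects_b: List[Rect]) -> List[Optional[int]]:
    """For each rectangle in rects_a, return the index of the rectangle in
    rects_b with the largest overlap, or None if nothing overlaps it."""
    matches: List[Optional[int]] = []
    for a in rects_a:
        best_index, best_area = None, 0
        # Compare against every rectangle of the other image: O(n^2) overall.
        for i, b in enumerate(rects_b):
            area = overlap_area(a, b)
            if area > best_area:
                best_index, best_area = i, area
        matches.append(best_index)
    return matches
```

A rectangle that gets None has no counterpart in the other image, which is one way the 38 versus 53 mismatch would show up.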

However, the problem is that the contours found in the two images are different: one image yields 38 contours while the other yields 53. I want a generic solution that does not depend on the image content. The one thing I can rely on is that the image contains well-formatted text.

Thanks

I'm not sure I understand exactly what you want, but to get a bounding box around each word in an image, I would do this:

  1. Preprocess the image to get a good thresholding: text only, with the background in black and the text in white. This step depends on the type and quality of your image.
  2. Compute the sum of each row of pixels. The sum should be nonzero where there is text and zero for every row in the space between two text lines (you can set a threshold on this value if there is some noise). This gives you the top and bottom rows of each text line.
  3. For each text line found in step 2, compute the sum of each column. As in step 2, columns containing part of a word should be nonzero. This gives you all the gaps between words and between letters. Discard the gaps that are too small to be a space between two words.
  4. Congratulations, you now have the top/bottom rows and first/last columns of each word.
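
The steps above can be sketched roughly as follows. This is a minimal illustration under assumptions, not a tuned implementation: the image is assumed to be already thresholded into rows of 0/1 pixels (1 = text), the function names runs, merge_close and word_boxes are hypothetical, and min_gap and noise are placeholder parameters you would have to choose for your own images. It uses plain Python lists; with a NumPy array the two profiles would simply be img.sum(axis=1) and band.sum(axis=0).

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) indices, end exclusive


def runs(profile: Sequence[int], min_value: int = 0) -> List[Span]:
    """Return the spans where the profile stays above min_value."""
    found: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > min_value and start is None:
            start = i
        elif value <= min_value and start is not None:
            found.append((start, i))
            start = None
    if start is not None:
        found.append((start, len(profile)))
    return found


def merge_close(spans: List[Span], min_gap: int) -> List[Span]:
    """Merge neighbouring spans separated by fewer than min_gap empty columns,
    so that gaps between letters do not split a word."""
    merged: List[Span] = []
    for start, end in spans:
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def word_boxes(image: Sequence[Sequence[int]], min_gap: int = 3,
               noise: int = 0) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes, right and bottom exclusive."""
    boxes = []
    row_profile = [sum(row) for row in image]            # step 2: row sums
    for top, bottom in runs(row_profile, noise):
        band = image[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]   # step 3: column sums
        for left, right in merge_close(runs(col_profile, noise), min_gap):
            boxes.append((left, top, right, bottom))     # step 4
    return boxes
```

The same noise threshold is applied to rows and columns here only to keep the sketch short; in practice the two would probably need separate values.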
