Automatic Scanned Document Image Enhancement

Question

I'm developing an automatic image enhancement tool based on the Microsoft paper "Whiteboard scanning and image enhancement".

In the "White balancing and image enhancement" section, they list the steps for the enhancement.

First, they estimate the background color of the scanned document or the detected whiteboard:

1. "Divide the whiteboard region into rectangular cells. The cell size should be roughly the same as what we expect the size of a single character on the board (15 by 15 pixels in our implementation)."

then

2. "Sort the pixels in each cell by their luminance values. Since the ink absorbs the incident light, the luminance of the whiteboard pixels is higher than stroke pixels'. The whiteboard color within the cell is, therefore, the color with the highest luminance. In practice, we average the colors of the pixels in the top 25 percentile in order to reduce the error introduced by sensor noise"

then

3. "Filter the colors of the cells by locally fitting a plane in the RGB space. Occasionally there are cells that are entirely covered by pen strokes, the cell color computed in Step 2 is consequently incorrect. Those colors are rejected as outliers by the locally fitted plane and are replaced by the interpolated values from its neighbors."

My problem is with the second and third steps:

How do they get the luminance value? Should I convert the input image to the YUV color space and take the luminance from the Y channel, or just work in the RGB color space?

How do I fit a local plane in the RGB space?

Here is the Python code I wrote to split the input image into cells and get the luminance value from the YUV color space, along with a simple result that looks incorrect compared with the result in the paper.

Python Code:

import cv2
import numpy as np


## Return a list of CellSize x CellSize cells from a single-channel image
## (uses the globals rows, cols and CellSize set in the main block)
def SubImage(image):
    Cells = []
    CellRows = []
    # Cut the image into horizontal strips, then cut each strip into cells
    for i in range(0, rows // CellSize):
        subIm = image[i*CellSize:(i+1)*CellSize, :]
        CellRows.append(subIm)
    for img in CellRows:
        for i in range(0, cols // CellSize):
            subIm = img[:, i*CellSize:(i+1)*CellSize]
            Cells.append(subIm)
    return Cells


## Take the maximum luminance value of each cell
def GetLuminance(Cells):
    luminance = []
    for cel in Cells:
        luminance.append(cel.max())
    return luminance


## Estimate the background color of the whiteboard
def UniformBackground(CelImage, img, luminance):
    a = 0

    # Divide each pixel by its cell's luminance, clamped to 1
    for c in range(0, len(CelImage)):
        cel = CelImage[c]
        for i in range(0, cel.shape[0]):
            for j in range(0, cel.shape[1]):
                cel[i,j] = min(1, cel[i,j] / luminance[c])
    # Write the cells back into the channel, row by row
    for i in range(0, rows // CellSize):
        for j in range(0, cols // CellSize):
            img[i*CellSize:(i+1)*CellSize, j*CellSize:(j+1)*CellSize] = CelImage[a]
            a = a + 1


if __name__ == '__main__':
    img = cv2.imread('4.png')
    CellSize = 15
    rows, cols, depth = img.shape

    # Ignore the partial cells at the bottom and right edges
    if (rows % CellSize != 0):
        rows = rows - rows % CellSize

    if (cols % CellSize != 0):
        cols = cols - cols % CellSize

    yuvImg = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    # Get cells from the Y channel and from each color channel
    CellsY = SubImage(yuvImg[:,:,0])
    CellsB = SubImage(img[:,:,0])
    CellsG = SubImage(img[:,:,1])
    CellsR = SubImage(img[:,:,2])

    # Get luminance from the Y cells
    LuminanceY = GetLuminance(CellsY)

    # Uniform background
    UniformBackground(CellsB, img[:,:,0], LuminanceY)
    UniformBackground(CellsG, img[:,:,1], LuminanceY)
    UniformBackground(CellsR, img[:,:,2], LuminanceY)

    cv2.imwrite('uniform.jpg', img)

Input whiteboard image:

Output image:

Expected output:

Answer 1

temp = cel[i,j]/luminance[c]
if temp > thresh : ##Let thresh be 0.7
   cel[i,j] = 255

Pixels whose ratio to the cell's luminance exceeds the threshold are set to white; the other pixels are left as they are. The output is the image with a uniform background.
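A vectorized sketch of the same idea for a single cell, assuming `cel` is a 2-D uint8 array and `cell_luminance` is that cell's maximum luminance. The function name `whiten_cell` is made up for illustration, and 0.7 is just the threshold suggested above. The division is done in floating point so the ratio is not truncated before the comparison.

```python
import numpy as np

def whiten_cell(cel, cell_luminance, thresh=0.7):
    """Set pixels whose ratio to the cell luminance exceeds thresh to 255."""
    out = cel.copy()
    if cell_luminance == 0:
        # Completely black cell: nothing to normalize against
        return out
    ratio = cel.astype(np.float64) / float(cell_luminance)
    out[ratio > thresh] = 255
    return out

# Tiny 2x2 "cell": two bright background pixels, two darker stroke pixels
cel = np.array([[200, 50], [180, 100]], dtype=np.uint8)
result = whiten_cell(cel, cell_luminance=200)
```

The threshold probably needs tuning per image; a value that is too low will also wipe out light strokes.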

Answer 2

Let's work through it step by step:

"Sort the pixels in each cell by their luminance values"

Yes, you must convert the image to another color space that has a luminance component, such as the Lab color space.
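For the sorting, any channel that tracks brightness should be good enough. As a minimal sketch that needs only NumPy, the luma can be computed directly with the Rec. 601 weights, which are the ones behind the Y channel of YUV. The helper name `luma_bgr` is hypothetical, and the input is assumed to be a BGR image as loaded by `cv2.imread`. With OpenCV, `cv2.cvtColor(img, cv2.COLOR_BGR2LAB)` or `cv2.COLOR_BGR2YUV` followed by taking channel 0 would serve the same purpose.

```python
import numpy as np

def luma_bgr(img_bgr):
    """Per-pixel luma Y = 0.299 R + 0.587 G + 0.114 B for a BGR image."""
    img = img_bgr.astype(np.float64)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

# Three test pixels: white, pure red (BGR order), black
img = np.zeros((1, 3, 3), dtype=np.uint8)
img[0, 0] = (255, 255, 255)
img[0, 1] = (0, 0, 255)
img[0, 2] = (0, 0, 0)
y = luma_bgr(img)
```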

"... In practice, we average the colors of the pixels in the top 25 percentile in order to reduce the error introduced by sensor noise"

This means that once you have, say, a Lab image, you split it into channels, take the L channel, compute its histogram with, say, 100 bins (I'm exaggerating), and keep only the pixels that fall in the brightest bins (say, 75 to 100). Once you've found the white pixels in each cell, remember them! For example, you can create a mask image that is 0 everywhere except at the pixels that were chosen as "white".
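As a rough sketch of this per-cell step, assuming a BGR uint8 image as loaded by OpenCV: instead of building a histogram, this version sorts each cell's pixels by luma and averages the colors of the brightest 25 percent, which is what the quoted step describes. The function name `cell_background_colors` and its parameters are made up for illustration, and the luma uses the Rec. 601 weights rather than the Lab L channel.

```python
import numpy as np

def cell_background_colors(img_bgr, cell_size=15, top_fraction=0.25):
    """Return an (n_rows, n_cols, 3) array of estimated background colors."""
    img = img_bgr.astype(np.float64)
    # Luma from BGR channels (Rec. 601 weights)
    luma = 0.114 * img[..., 0] + 0.587 * img[..., 1] + 0.299 * img[..., 2]
    n_rows = img.shape[0] // cell_size
    n_cols = img.shape[1] // cell_size
    colors = np.zeros((n_rows, n_cols, 3))
    for i in range(n_rows):
        for j in range(n_cols):
            ys = slice(i * cell_size, (i + 1) * cell_size)
            xs = slice(j * cell_size, (j + 1) * cell_size)
            cell_pixels = img[ys, xs].reshape(-1, 3)
            cell_luma = luma[ys, xs].ravel()
            # Indices of the brightest top_fraction of the cell's pixels
            k = max(1, int(round(top_fraction * cell_luma.size)))
            brightest = np.argsort(cell_luma)[-k:]
            colors[i, j] = cell_pixels[brightest].mean(axis=0)
    return colors

# Two 2x2 cells: the left one is bright with one dark "stroke" pixel
demo = np.full((2, 4, 3), 100, dtype=np.uint8)
demo[:, :2] = 200
demo[0, 0] = 10
bg = cell_background_colors(demo, cell_size=2)
```

The `brightest` indices play the role of the "white" mask mentioned above: they identify which pixels of each cell were treated as whiteboard.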

"Filter the colors of the cells by locally fitting a plane in the RGB space"

Now go back to RGB space. As you can see, the whiteboard gets darker as it recedes. If you plot the whiteboard pixels' RGB colors as 3D points in a space whose axes are R, G, and B, you'll get a scatter that is approximately a plane (since all of those whiteboard colors have a grayish hue). Now take the points you marked as "whiteboard" in the previous step and fit a plane to them. How do you fit a plane? You can use least squares, but from the way the article is written, I think they had RANSAC in mind.
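Both options can be sketched with NumPy alone. This follows the reading given in this answer (a plane through the cloud of RGB points) and is not the paper's code. The names `fit_plane`, `plane_distances` and `ransac_plane`, the distance threshold, and the iteration count are all assumptions for illustration. The least-squares fit takes the plane through the centroid whose normal is the direction of least variance; the RANSAC wrapper fits planes to random triples, keeps the one with the most inliers, and refits on those inliers.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3-D points: (centroid, unit normal)."""
    pts = np.asarray(points, dtype=np.float64)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    return centroid, vt[-1]

def plane_distances(points, point_on_plane, normal):
    """Absolute distance of each point to the plane."""
    pts = np.asarray(points, dtype=np.float64)
    return np.abs((pts - point_on_plane) @ normal)

def ransac_plane(points, max_dist=2.0, n_iter=200, seed=0):
    """Fit a plane while ignoring points farther than max_dist from it."""
    pts = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        length = np.linalg.norm(normal)
        if length < 1e-9:
            continue  # collinear sample, no unique plane
        inliers = plane_distances(pts, sample[0], normal / length) <= max_dist
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("no plane found")
    # Final least-squares refit on the inliers only
    centroid, normal = fit_plane(pts[best])
    return centroid, normal, best

# Synthetic stand-in for RGB points: 25 points on a plane plus one outlier
xs, ys = np.meshgrid(np.arange(5) * 50.0, np.arange(5) * 50.0)
plane_pts = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(25)])
pts = np.vstack([plane_pts, [[100.0, 100.0, 60.0]]])
centroid, normal, inliers = ransac_plane(pts, max_dist=1.0)
```

Points flagged as outliers (`~inliers`) would be the cell colors to replace with values interpolated from their neighbors.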
