
How can I threshold this blurry image to make the digits as clear as possible?

In a previous post, I tried adaptively thresholding a blurry image (left), which resulted in distorted and disconnected digits (right):

[image: blurry original (left) and adaptive-threshold result with distorted, disconnected digits (right)]

Since then, I've tried using a morphological closing operation as described in this post to make the brightness of the image uniform:

[image: result of the morphological closing, with approximately uniform brightness]

If I adaptively threshold this image, I don't get significantly better results. However, because the brightness is approximately uniform, I can now use an ordinary threshold:

[image: result of ordinary (global) thresholding]

This is a lot better than before, but I have two problems:

  1. I had to manually choose the threshold value. Although the closing operation results in uniform brightness, the level of brightness might be different for other images.
  2. Different parts of the image would do better with slight variations in the threshold level. For instance, the 9 and 7 in the top left come out partially faded and should have a lower threshold, while some of the 6s have fused into 8s and should have a higher threshold.

I thought that going back to an adaptive threshold, but with a very large block size (1/9th of the image) would solve both problems. Instead, I end up with a weird "halo effect" where the centre of the image is a lot brighter, but the edges are about the same as the normally-thresholded image:

[image: adaptive threshold with a very large block size, showing the halo effect]

Edit: remi suggested morphologically opening the thresholded image at the top right of this post. This doesn't work too well. Using elliptical kernels, only a 3x3 is small enough to avoid obliterating the image entirely, and even then there are significant breakages in the digits:

[image: result of morphological opening with a 3x3 elliptical kernel, showing breakages in the digits]

Edit2: mmgp suggested using a Wiener filter to remove blur. I adapted this code for Wiener filtering in OpenCV to OpenCV4Android, but it makes the image even blurrier! Here's the image before (left) and after filtering with my code and a 5x5 kernel:

[image: before (left) and after (right) Wiener filtering with a 5x5 kernel]

Here is my adapted code, which filters in-place:

private void wiener(Mat input, int nRows, int nCols) { // I tried nRows=5 and nCols=5

    Mat localMean = new Mat(input.rows(), input.cols(), input.type());
    Mat temp = new Mat(input.rows(), input.cols(), input.type());
    Mat temp2 = new Mat(input.rows(), input.cols(), input.type());

    // Create the kernel for convolution: a constant matrix with nRows rows 
    // and nCols cols, normalized so that the sum of the pixels is 1.
    Mat kernel = new Mat(nRows, nCols, CvType.CV_32F, new Scalar(1.0 / (double) (nRows * nCols)));

    // Get the local mean of the input.  localMean = convolution(input, kernel)
    Imgproc.filter2D(input, localMean, -1, kernel, new Point(nCols/2, nRows/2), 0); 

    // Get the local variance of the input.  localVariance = convolution(input^2, kernel) - localMean^2 
    Core.multiply(input, input, temp);  // temp = input^2
    Imgproc.filter2D(temp, temp, -1, kernel, new Point(nCols/2, nRows/2), 0); // temp = convolution(input^2, kernel)
    Core.multiply(localMean, localMean, temp2); //temp2 = localMean^2
    Core.subtract(temp, temp2, temp); // temp = localVariance = convolution(input^2, kernel) - localMean^2  

    // Estimate the noise as mean(localVariance)
    Scalar noise = Core.mean(temp);

    // Compute the result.  result = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)

    Core.max(temp, noise, temp2); // temp2 = max(localVariance, noise)

    Core.subtract(temp, noise, temp); // temp = localVariance - noise
    Core.max(temp, new Scalar(0), temp); // temp = max(0, localVariance - noise)

    Core.divide(temp, temp2, temp);  // temp = max(0, localVar-noise) / max(localVariance, noise)

    Core.subtract(input, localMean, input);  // input = input - localMean
    Core.multiply(temp, input, input); // input = max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
    Core.add(input, localMean, input); // input = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
}
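
For cross-checking, here is the same local-statistics computation sketched in NumPy (the brute-force box filter and reflect padding below are stand-ins for `filter2D` and its default border handling). Worth noting: this is the Matlab `wiener2`-style filter, an adaptive noise *smoother* rather than a deblurring deconvolution, so it will not sharpen even when implemented correctly.

```python
import numpy as np

def wiener_local(img, n=5):
    """Local-statistics Wiener smoother (wiener2-style), mirroring the Java above."""
    img = img.astype(np.float64)
    pad = n // 2
    # reflect padding stands in for OpenCV's default border handling
    p = np.pad(img, pad, mode='reflect')
    H, W = img.shape
    mean = np.empty((H, W))
    meansq = np.empty((H, W))
    for i in range(H):          # brute-force n x n box filter
        for j in range(W):
            win = p[i:i + n, j:j + n]
            mean[i, j] = win.mean()
            meansq[i, j] = (win ** 2).mean()
    var = meansq - mean ** 2    # local variance
    noise = var.mean()          # noise estimate = mean of local variances
    num = np.maximum(var - noise, 0.0)
    den = np.maximum(var, noise)
    return mean + num / den * (img - mean)
```
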
  • Here is a different take (if not weird) on your problem: if you are in control of the font used, change it to something better, where "better" means making it harder for either a 6 or a 9 to turn into an 8. Maybe make it bolder too. I guess at some point you will attempt to recognize these digits, and that is the reason for your question.
    – mmgp
    Dec 3, 2012 at 2:59
  • Unfortunately, I'm going to be recognizing these images from Android users' cameras, "in the wild", so there's no control over font. Though that would be a useful solution otherwise.
    – 1''
    Dec 3, 2012 at 3:09
  • Taking the problem another way: why are you trying to make the digits clearer? Is it to OCR them afterwards? You can probably get quite good results by training an OCR on this kind of digit and using the OCR to detect them in the image.
    – remi
    Dec 5, 2012 at 9:09
  • If you have a good sample of every possible digit the users are going to take a photo of, it might be sensible to train directly from the digits. The way I worded the previous phrase makes it unlikely to be the case. Maybe it could be achievable using one-class classifiers, such as One-Class SVM, since you also lack a good representation of what you don't expect to be a digit. Now, it would be much easier to train a classifier if you didn't have broken digits or mis-connected ones. After thinning them, the task is much easier and much more prone to give correct results.
    – mmgp
    Dec 5, 2012 at 13:01
  • Check the Niblack algorithm. See for example stackoverflow.com/questions/9871084/niblack-thresholding Dec 7, 2012 at 22:56

4 Answers


Some hints that you might try out:

  • Apply the morphological opening in your original thresholded image (the one which is noisy at the right of the first picture). You should get rid of most of the background noise and be able to reconnect the digits.

  • Use a different preprocessing of your original image instead of morphological closing, such as a median filter (which tends to blur the edges) or bilateral filtering, which preserves edges better but is slower to compute.

  • As far as thresholding is concerned, you can use the CV_OTSU flag in cv::threshold to determine an optimal value for a global threshold. Local thresholding might still be better, but should work better combined with the bilateral or median filter.
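
For reference, Otsu's method simply picks the threshold that maximizes the between-class variance of the grayscale histogram. A minimal NumPy sketch of what the CV_OTSU flag computes internally:

```python
import numpy as np

def otsu_threshold(img):
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2      # between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

In OpenCV4Android the equivalent is `Imgproc.threshold(src, dst, 0, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU)`, which ignores the passed threshold and computes its own.
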

  • Great idea with the CV_OTSU flag, it works great! Unfortunately, morphological opening doesn't work well on my thresholded image (see my edited post). Also, Astor has tried median/bilateral filters before thresholding, and it doesn't work as well as closing + normal threshold.
    – 1''
    Dec 1, 2012 at 0:55
  • probably opening followed by closing will improve the results compared to opening only.
    – remi
    Dec 2, 2012 at 9:38

I've tried thresholding each 3x3 box separately, using Otsu's algorithm (CV_OTSU - thanks remi!) to determine an optimal threshold value for each box. This works a bit better than thresholding the entire image, and is probably a bit more robust.

[image: result of per-box Otsu thresholding]

Better solutions are welcome, though.
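
For concreteness, the per-box scheme can be sketched in NumPy as below; a compact histogram-based Otsu stands in for CV_OTSU, and the tile grid is assumed to align with the sudoku's nine 3x3 boxes:

```python
import numpy as np

def otsu(img):
    # compact Otsu: maximize between-class variance over the 256-bin histogram
    p = np.bincount(img.ravel(), minlength=256) / img.size
    i = np.arange(256)
    w0 = np.cumsum(p)                  # cumulative class-0 weight
    mu = np.cumsum(i * p)              # cumulative class-0 mass
    mu_t = mu[-1]                      # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        between = (mu_t * w0 - mu) ** 2 / (w0 * (1 - w0))
    return int(np.nanargmax(between))

def threshold_by_tiles(img, rows=3, cols=3):
    """Binarize each of rows x cols tiles with its own Otsu threshold."""
    out = np.zeros_like(img)
    H, W = img.shape
    for r in range(rows):
        for c in range(cols):
            ys = slice(r * H // rows, (r + 1) * H // rows)
            xs = slice(c * W // cols, (c + 1) * W // cols)
            tile = img[ys, xs]
            out[ys, xs] = np.where(tile > otsu(tile), 255, 0)
    return out
```
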

  • an efficient binarization for this kind of document is the Sauvola technique (google sauvola + binarization). It is not implemented in OpenCV, but it is quite easy to do, and you can use integral images to compute the mean and standard deviation of image patches extremely fast.
    – remi
    Dec 2, 2012 at 9:53
  • I tried Sauvola on your image, and I managed to get quite decent results, but indeed, as mmgp said, with fine tuning of the parameters. And probably the set of parameters will work only for this image, and not be optimal for an image with different conditions.
    – remi
    Dec 5, 2012 at 9:07
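
A sliding-window sketch of the Sauvola technique remi describes, per pixel T = m * (1 + k * (s / R - 1)) with local mean m and standard deviation s. The integral-image speedup he mentions is omitted for clarity, and the k and R values below are commonly used defaults that, as noted, likely need per-image tuning:

```python
import numpy as np

def sauvola(img, window=15, k=0.2, R=128.0):
    """Sauvola binarization: threshold T = m * (1 + k * (s / R - 1)) per pixel."""
    img = img.astype(np.float64)
    pad = window // 2
    p = np.pad(img, pad, mode='reflect')
    H, W = img.shape
    out = np.zeros((H, W), np.uint8)
    for i in range(H):
        for j in range(W):
            win = p[i:i + window, j:j + window]
            m, s = win.mean(), win.std()
            T = m * (1 + k * (s / R - 1))
            out[i, j] = 255 if img[i, j] > T else 0
    return out
```
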

If you're willing to spend some cycles on it, there are deblurring techniques that could be used to sharpen up the picture prior to processing. Nothing is in OpenCV yet, but if this is a make-or-break kind of thing you could add it.

There's a bunch of literature on the subject: http://www.cse.cuhk.edu.hk/~leojia/projects/motion_deblurring/index.html http://www.google.com/search?q=motion+deblurring

And some chatter on the OpenCV mailing list: http://tech.groups.yahoo.com/group/OpenCV/message/20938

The weird "halo effect" that you're seeing is likely due to OpenCV assuming black when the adaptive threshold window is at or near the edge of the image and "hangs over" into non-image territory. There are ways to correct for this. Most likely you would make a temporary image that's at least two full block sizes taller and wider than the image from the camera, copy the camera image into the middle of it, and then set the surrounding "blank" portion of the temp image to the average color of the camera image. Now when you perform the adaptive threshold, the data at or near the edges will be much closer to accurate. It won't be perfect since it's not a real picture, but it will yield better results than the black that OpenCV is assuming is there.
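
That border fix can be sketched as follows (a NumPy stand-in; on Android you would use Core.copyMakeBorder with the mean as the fill value, adaptively threshold the padded image, then crop):

```python
import numpy as np

def pad_for_adaptive_threshold(img, block_size):
    """Surround the image with its mean value so the adaptive-threshold
    window never 'hangs over' into assumed-black territory."""
    pad = 2 * block_size                     # at least two full block sizes
    fill = int(round(img.mean()))
    return np.pad(img, pad, mode='constant', constant_values=fill)

def crop_center(padded, original_shape, block_size):
    """Crop the original image region back out after thresholding."""
    pad = 2 * block_size
    H, W = original_shape
    return padded[pad:pad + H, pad:pad + W]
```
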


My proposal assumes you can identify the sudoku cells, which, I think, is not asking too much. Trying to apply morphological operators (although I really like them) and/or binarization methods as a first step is the wrong way here, in my opinion of course. Your image is at least partially blurry, for whatever reason (original camera angle and/or movement, among other reasons). So what you need is to revert that by performing a deconvolution. Of course, asking for a perfect deconvolution is too much, but we can try some things.

One of these "things" is the Wiener filter, and in Matlab, for instance, the function is named deconvwnr. I noticed the blur to be in the vertical direction, so we can perform a deconvolution with a vertical kernel of a certain length (10 in the following example) and also assume the input is not noise-free (an assumption of 5% noise) -- I'm just trying to give a very superficial view here, take it easy. In Matlab, your problem is at least partially solved by doing:

f = imread('some_sudoku_cell.png');
g = deconvwnr(f, fspecial('motion', 10, 90), 0.05);
h = im2bw(g, graythresh(g)); % graythresh is the Otsu method
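
For anyone without Matlab: deconvwnr with a known PSF is essentially the frequency-domain Wiener filter G = conj(H) / (|H|^2 + NSR), where H is the spectrum of the PSF and NSR the noise-to-signal ratio. A NumPy sketch, with a vertical line PSF approximating fspecial('motion', 10, 90):

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=0.05):
    """Frequency-domain Wiener deconvolution (deconvwnr-style)."""
    H = np.fft.fft2(psf, s=blurred.shape)    # PSF spectrum, zero-padded
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)  # Wiener restoration filter
    F = G * np.fft.fft2(blurred)
    return np.real(np.fft.ifft2(F))

def vertical_motion_psf(length=10):
    """Normalized vertical line PSF, roughly fspecial('motion', length, 90)."""
    psf = np.zeros((length, 1))
    psf[:, 0] = 1.0 / length
    return psf
```

Note this assumes circular (FFT) boundary conditions, unlike Matlab's edge tapering, so expect some ringing at the borders.
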

Here are the results from some of your cells (original, Otsu, Otsu of region growing, morphologically enhanced image, Otsu of morphologically enhanced image with region growing, Otsu of deconvolution):

[images: 6x6 grid of cell comparisons; each row shows one cell as original, Otsu, Otsu of region growing, morphologically enhanced image, Otsu of enhanced image with region growing, and Otsu of deconvolution]

The enhanced image was produced by performing original + tophat(original) - bottomhat(original) with a flat disk of radius 3. I manually picked the seed point for region growing and manually picked the best threshold.
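
That enhancement step can be sketched in NumPy as below; a 3x3 square structuring element stands in here for the flat disk of radius 3 used above:

```python
import numpy as np

def erode(img, size=3):
    # grayscale erosion: minimum over a size x size square neighborhood
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    stack = [p[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(size) for j in range(size)]
    return np.min(stack, axis=0)

def dilate(img, size=3):
    # grayscale dilation: maximum over a size x size square neighborhood
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    stack = [p[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(size) for j in range(size)]
    return np.max(stack, axis=0)

def enhance(img, size=3):
    """original + tophat(original) - bottomhat(original)."""
    img = img.astype(np.int32)
    opening = dilate(erode(img, size), size)
    closing = erode(dilate(img, size), size)
    tophat = img - opening          # bright details smaller than the element
    bottomhat = closing - img       # dark details smaller than the element
    return np.clip(img + tophat - bottomhat, 0, 255).astype(np.uint8)
```
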

For empty cells you get weird results (original and Otsu of deconvolution):

[images: empty cell as original (left) and Otsu of deconvolution (right)]

But I don't think you would have trouble detecting whether a cell is empty or not (the global threshold already solves that).

EDIT:

Added the best results I could get with a different approach: region growing. I also attempted some other approaches, but this was the second best one.

  • This is a promising suggestion. By the way, for context, I'm trying to get this to work on Android with mobile phone pictures. Will this method improve image quality with blurring in an arbitrary orientation? Will it improve other types of image defects that you'd be likely to see in mobile phone pictures?
    – 1''
    Dec 3, 2012 at 0:49
  • I will restrict my answer to the blurring part, since other image defects are too broad a topic. What I showed here is a deconvolution with a known estimate of the point spread function (PSF), which is a vertical one. There are methods called blind deconvolution which will try to guess a proper PSF, likely better than the one I guessed. You should look into these if your interest is in a general approach to deblurring.
    – mmgp
    Dec 3, 2012 at 1:07
  • I'm mainly looking for something that will consistently give good-quality digits. This may not be worth implementing (OpenCV does not have a built-in implementation) if it will just fix blur, if I can compensate for blur, lighting and other defects with a proper binarization algorithm. What do you think?
    – 1''
    Dec 3, 2012 at 1:11
  • I don't think you can compensate for blur with binarization algorithms; it is a different kind of problem. I don't see lighting as a problem here if you have already identified the sudoku board. Also, the deconvolution has a good chance of producing stronger edges, making it easier for a simpler binarizer. Keep in mind that in image processing there is no such thing as "consistently give good quality for X", where X is any problem being investigated.
    – mmgp
    Dec 3, 2012 at 1:15
  • I googled for deconvolution in opencv and found something that might help you: github.com/Itseez/opencv/blob/master/samples/python2/… and ibm.com/developerworks/mydeveloperworks/blogs/theTechTrek/entry/… (didn't read them, but sound helpful).
    – mmgp
    Dec 3, 2012 at 1:19
