The objective is to detect and recognize three different kinds of labels pasted on the side of a box using a computer vision application, while remaining robust to lighting changes as well as partial occlusions.
General Working Methodology
I approached the problem statement in two ways, and I detail both of them here.
Method 1: Skewness Correction & Image Matching
In this method, I first converted the camera input from BGR to HSV colour space, which reduces the error introduced by changes in lighting. I also exploited the fact that the labels are always square in shape. Two cases arise: either the quadrilateral formed by the skewed square is complete, or it is incomplete.
I first dealt with the simpler problem, i.e. when the quadrilateral is complete. To detect the quadrilateral, my initial method was simply detecting red regions (Hue values 165-180 and 0-15) in the image and then applying Canny edge detection. Although this gave acceptable results for empty scenes, the quadrilateral became undetectable whenever my skin appeared in the camera feed. The strategy I employed next was to find all red regions and all white regions, apply dilate operations to the two binary masks, and take the bitwise AND of the dilated images, which yields exactly the lines that have white on one side and red on the other. The label was then detected reliably, eliminating the problem with skin colours. After applying contour detection and checking for convex quadrilaterals, I found the label and applied a warp perspective transform to correct the skew. Once the image was almost like the original (rotated by 0, 90, 180, or 270 degrees), I compared it against the sample images, calculated the root-mean-square difference in the norms of the samples and the test images, and thresholded the output to identify the label.
For the case with incomplete squares, I found the longest convex contour in the image whose consecutive angles sum to values close to 180 degrees (with an error threshold of ±10° to account for perspective changes). The ends of this contour were then treated as a warped square: the warp angle was calculated, the image was skewed back to its original position, same-sized chunks were cropped from the original image, and the same RMS norm matching was applied to threshold the results. The algorithm was correct, but although the values did differ considerably when a warped and partially covered test image came into view, they also varied over a large range. Careful thresholding improved the results, but still left a bit to be desired.
Method 2: Contour Shape Comparison
This method struck me suddenly late one night and solved the problems posed by the original algorithm. I first scanned the image for white regions; many appeared, with both our labels and the tube lights showing up as white. I then used blob detection to find all black islands in the thresholded binary image and used floodfill to fill them with white. This produced an image in which white denoted all white regions together with any regions of other colours completely enclosed by white. I then thresholded the image for red (in HSV) to obtain a second mask. A bitwise AND separated out the red blobs that were completely enclosed by white, i.e. the figures in the centre of the labels. Finally, I found the largest contour in each such image and compared their Hu moments. The 7th Hu moment is invariant to skew, so I gave it the maximum weight. In the end the comparisons worked remarkably well and the labels were recognized with ease; partial occlusions (<30%) had almost no impact on the results.
The code is not very readable as of now. The link to the GitHub Repo. The working code, on Ubuntu 14.04 with OpenCV 2.4.10, is in step3.cpp and step5.cpp for Method 1 and Method 2 respectively. I will soon update the code to make it more readable.