A small group project in the Image Processing class at the University of Applied Sciences of Zurich was to build a virtual keyboard.
The task was to freely choose some kind of virtual keyboard (with at least six buttons) and, using a webcam, determine from a static perspective which keys were pressed. We were recommended to use MATLAB for the analysis, but my group took the chance to get to know OpenCV and Python a little (we used PyCharm). We also wanted to try detecting the keys under a changing perspective.
Important: We are anything but pros at Python. I’m sure there are very obvious violations of good practice in the following code, and I apologize for that. Remember, however, that this was our very first project in Python and the aim was to get to know the language.
We agreed on six square white buttons and two circles, one green and one blue, one at each end of the row of buttons.
The first problem is to detect the lowest point belonging to the finger. The method we chose is shaky at best, but for our purposes it seemed sufficient. Next, we need to detect the two circles, which turned out to be relatively simple thanks to OpenCV’s findContours() function. These two steps are the prerequisites for all the approaches we tried next.
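For illustration, here is a minimal sketch of how such a lowest-point detection could look. It assumes the hand shows up dark against a bright background and simply takes the largest contour as the hand; the threshold value and the name find_fingertip are illustrative, not exactly what we shipped.

import cv2

def find_fingertip(gray_image):
    # Separate the dark hand from the bright background; the threshold
    # value of 100 is an assumption and would need tuning to the lighting.
    ret, thresh = cv2.threshold(gray_image, 100, 255, cv2.THRESH_BINARY_INV)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Take the largest contour as the hand.
    hand = max(contours, key=cv2.contourArea)
    # The fingertip is taken to be the lowest point of that contour
    # (largest y, since image coordinates grow downwards).
    idx = hand[:, 0, 1].argmax()
    x, y = hand[idx, 0]
    return int(x), int(y)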
Detecting the buttons
For the first approach we simply spanned six circles between the two center points. This, of course, was doomed to fail once the perspective changed.
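In code, that first idea amounts to little more than a linear interpolation between the two centers. A sketch, with the function name and the plain (x, y) tuples being illustrative:

def span_button_centers(green, blue, count=6):
    # Place `count` button centers evenly on the straight line between
    # the two circle centers (green and blue are (x, y) tuples).
    centers = []
    for i in range(1, count + 1):
        t = i / float(count + 1)
        x = green[0] + t * (blue[0] - green[0])
        y = green[1] + t * (blue[1] - green[1])
        centers.append((int(x), int(y)))
    return centers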
For the second approach we tried to stabilize the image in two axes before further analyzing the image data.
import math

import cv2

# OrientationPoint and get_distance are small helpers from the project:
# OrientationPoint stores a detected circle's center, size and contour;
# get_distance returns the distance between two such points.

def get_center_points(image, old_green, old_blue):
    green = None
    blue = None
    blue_channel, green_channel, red_channel = cv2.split(image)
    ret, thresh_red = cv2.threshold(red_channel, 127, 255, cv2.THRESH_BINARY)
    contours, hierarchy = cv2.findContours(thresh_red, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        # Only consider blobs of plausible size with a roughly square bounding box.
        if 3000 < cv2.contourArea(cnt) < 20000:
            x, y, w, h = cv2.boundingRect(cnt)
            aspect_ratio = float(w) / float(h)
            if 0.5 < aspect_ratio < 1.5:
                o_point = OrientationPoint(int(x + w / 2), int(y + h / 2), w, h, aspect_ratio, cnt)
                # Match the blob to a circle via the last known positions.
                if not old_green or get_distance(old_green, o_point) < 40:
                    green = o_point
                elif not old_blue or get_distance(old_blue, o_point) < 40:
                    blue = o_point
    # Fall back to the last known position if a circle was not found.
    if not green:
        green = old_green
    if not blue:
        blue = old_blue
    return green, blue, image


def detect_angle(blue, green):
    # Angle of the line between the two circles; assumes they are never
    # exactly vertically aligned (x would be zero).
    x = math.fabs(blue.x - green.x)
    y = math.fabs(blue.y - green.y)
    angle = -math.degrees(math.atan(y / x))
    if blue.y > green.y:
        angle = 0 - angle
    return angle


def rotate_source(image, angle):
    # Rotate the frame around its center so the circles end up level.
    height, width, depth = image.shape
    M = cv2.getRotationMatrix2D((width / 2, height / 2), angle, 1)
    image = cv2.warpAffine(image, M, (width, height))
    return image
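Chained together, one stabilization pass per frame then looks roughly like this (a sketch; frame, old_green and old_blue would come from the surrounding capture loop, which is not shown here):

green, blue, frame = get_center_points(frame, old_green, old_blue)
if green and blue:
    # Level the image so the two circles lie on a horizontal line.
    frame = rotate_source(frame, detect_angle(blue, green))
    old_green, old_blue = green, blue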
By detecting shapes in the thresholded image we were then able to detect the buttons properly. Every now and then some buttons are not detected, so we use their last known positions. As you can see in the video, the buttons sometimes lag behind (especially when shifting perspective).
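A sketch of what this shape detection could look like, assuming bright, roughly square buttons and using the last known positions as the fallback; the size and aspect-ratio bounds here are illustrative:

def detect_buttons(gray_image, last_buttons):
    # Find bright, roughly square blobs in the stabilized frame.
    ret, thresh = cv2.threshold(gray_image, 127, 255, cv2.THRESH_BINARY)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    buttons = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        if 1000 < cv2.contourArea(cnt) < 20000 and 0.8 < float(w) / h < 1.25:
            buttons.append((x, y, w, h))
    if len(buttons) < 6:
        # Some buttons were missed in this frame: reuse the last known
        # positions, which is what causes the visible lag.
        return last_buttons
    return sorted(buttons)  # sorted by x, i.e. left to right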
We are aware that the results are poor, but the time for this little project was very limited and we could not improve it further.
If you are interested in the entire source, you can find it here.
Credits to Philipp Schürch for co-developing.