This is a video of me playing a game that uses our technology. The game
tells you what action to perform (shown in yellow at the top), and then if you
perform the action within 10 seconds, a checkmark will appear, and you get points
depending on how quickly you complete it. If you do not perform the action in time,
an X appears and you lose 100 points.
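For concreteness, the scoring might be implemented along these lines. The
10-second limit and 100-point penalty come from the description above; the
linear falloff and the 100-point maximum are assumptions, not the game's
actual formula:

    def score_attempt(elapsed_seconds, time_limit=10.0, max_points=100):
        # Miss: no correct action within the limit -> the X case above.
        if elapsed_seconds > time_limit:
            return -100
        # Faster completion earns more points. The linear falloff and
        # max_points value are assumptions, not the game's actual formula.
        return round(max_points * (1.0 - elapsed_seconds / time_limit))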
About halfway through the video, I perform the actions incorrectly to
demonstrate that our program will not accept the wrong action. For example, at 1:45 I do
right-wax-out instead of right-wax-in, which involves motion in the same area of
the frame, but our software cannot be tricked so easily. As soon as I switch
directions and perform the correct action, our algorithm picks it up immediately.
The game randomly chooses one of the following 13 actions for you to perform
(a small sketch of this selection follows the list):
idle (stand still)
right-wax-in (like wiping the screen in a circular motion)
right-wax-out
left-wax-in
left-wax-out
punch-right (an upwards punch)
punch-left
wave-left
wave-right
sway (like you are at a mellow concert and are holding a lighter in each hand)
watch (tap your left wrist like you're asking for the time)
waves (roll your arms like water waves)
junk (jump around like a maniac)
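In code, the prompt selection amounts to a uniform random draw from this label
set. This is a minimal sketch; the actual game internals may differ:

    import random

    ACTIONS = [
        "idle", "right-wax-in", "right-wax-out", "left-wax-in", "left-wax-out",
        "punch-right", "punch-left", "wave-left", "wave-right",
        "sway", "watch", "waves", "junk",
    ]

    def next_prompt():
        # Uniform random draw; whether the real game avoids repeats is unknown.
        return random.choice(ACTIONS)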
Introduction
We developed a method to recognize gestures in real-time, making use of the
GPU. To demonstrate our technology, we made a simple computer game that asks the
player to perform a certain action and awards points based on how quickly they
perform it. Other applications might include human-computer interaction and
surveillance.
Method
Our algorithm works by examining a localized region of optical flow centred
on the user's face.
Motion features are computed from optical flow that has been blurred over
several frames.
This temporal blur captures the entire set of motion required to complete a
single phase of the gesture.
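A minimal sketch of this feature pipeline in Python with OpenCV: Farneback
flow stands in for whichever flow method we use, and the crop proportions,
grid size, and decay constant are assumptions (the original implementation
runs on the GPU):

    import cv2

    def motion_features(prev_gray, gray, face_box, blurred=None, decay=0.8):
        # Dense optical flow between consecutive grayscale frames.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # Crop a region centred on the face, wide enough to cover the arms.
        # The crop proportions are assumptions.
        x, y, w, h = face_box
        x0, x1 = max(0, x - 2 * w), min(flow.shape[1], x + 3 * w)
        y0, y1 = max(0, y - h), min(flow.shape[0], y + 3 * h)
        region = cv2.resize(flow[y0:y1, x0:x1], (64, 64))
        # Temporal blur: an exponentially decayed running average standing in
        # for the multi-frame blur described above.
        if blurred is None:
            return region
        return decay * blurred + (1 - decay) * region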
A multi-class variant of AdaBoost is used to learn key features of flow
intensities that best discriminate each action.
Once a sufficient number of weak-classifiers are learned, they can be used to
classify gestures from new input in real-time.
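The learning step might look like the following sketch: a one-vs-all
reduction with decision stumps standing in for the true multi-class variant,
with an assumed threshold search and round count:

    import numpy as np

    def train_adaboost(X, y, n_rounds=50):
        # X: (n_samples, n_features) flattened flow intensities.
        # y: +/-1 labels for one gesture (one-vs-all reduction; the real
        #    multi-class variant is not reproduced here).
        n, d = X.shape
        w = np.full(n, 1.0 / n)
        stumps = []
        for _ in range(n_rounds):
            best = None
            for j in range(d):
                # Coarse threshold search over feature percentiles
                # (an assumed simplification).
                for thresh in np.percentile(X[:, j], (25, 50, 75)):
                    for parity in (1, -1):
                        pred = np.where(
                            parity * (X[:, j] - thresh) > 0, 1, -1)
                        err = w[pred != y].sum()
                        if best is None or err < best[0]:
                            best = (err, j, thresh, parity, pred)
            err, j, thresh, parity, pred = best
            alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
            # Reweight: misclassified examples gain weight.
            w *= np.exp(-alpha * y * pred)
            w /= w.sum()
            stumps.append((j, thresh, parity, alpha))
        return stumps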
Each frame, we compute a classifier score in real-time for each possible
gesture, then classify the frame as the gesture that received the highest
score.
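Reusing the stump format from the training sketch above, the per-frame
classification reduces to an argmax over the boosted scores:

    def classify(features, classifiers_by_gesture):
        # features: flattened motion-feature vector for the current frame.
        # classifiers_by_gesture: {gesture: [(j, thresh, parity, alpha), ...]}
        scores = {
            gesture: sum(
                alpha * (1 if parity * (features[j] - thresh) > 0 else -1)
                for j, thresh, parity, alpha in stumps)
            for gesture, stumps in classifiers_by_gesture.items()
        }
        return max(scores, key=scores.get), scores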
The arrows in the following image correspond to the top 50 classifiers that
contribute towards the punch-right gesture score.
Each classifier compares a type of flow (indicated by the arrow direction) at a
certain pixel to some threshold value.
Only classifiers whose motion features are above the threshold (or below, for
negative parity) are displayed on the image.
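One plausible reading of that display rule, again in the stump format above
(ranking the "top 50" by the boosting weight |alpha| is an assumption):

    def firing_classifiers(features, stumps, top_k=50):
        # Rank stumps by boosting weight |alpha| and keep those whose
        # threshold test passes on the current frame -- i.e. the arrows
        # that would be drawn on the image.
        top = sorted(stumps, key=lambda s: abs(s[3]), reverse=True)[:top_k]
        return [s for s in top if s[2] * (features[s[0]] - s[1]) > 0]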