The Vision and Media Lab
Human Body Pose Estimation
We are researching automatic methods for estimating the pose of human figures in still images and video sequences.

For still images of human figures, we have been exploring two complementary approaches to localizing 2d joint positions. The first performs a top-down search that matches the input figure against a set of stored "exemplar" human figures (Mori and Malik, 2006). Matching is based on shape, using "shape contexts" as the descriptor.
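
As a rough illustration of shape-based matching, the sketch below computes a log-polar "shape context" histogram for one sampled point and compares two histograms with a chi-squared cost. It is a minimal sketch under stated assumptions, not the implementation from the cited paper: the function names, the bin counts (5 radial, 12 angular), and the radius limits are illustrative choices, and the sampling of edge points and the exemplar search itself are omitted.

```python
import math


def mean_pairwise_distance(points):
    """Mean distance over all point pairs, used to normalize for scale."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
            count += 1
    return total / count


def shape_context(points, index, n_radial=5, n_angular=12,
                  r_min=0.125, r_max=2.0):
    """Log-polar histogram of where the other points lie relative to points[index].

    Bin counts and radius limits are assumed values chosen for illustration.
    """
    scale = mean_pairwise_distance(points)
    px, py = points[index]
    hist = [0] * (n_radial * n_angular)
    log_span = math.log(r_max / r_min)
    for i, (x, y) in enumerate(points):
        if i == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / scale
        # Log-spaced radial bin; radii outside [r_min, r_max] are clamped.
        r_bin = int(math.log(max(r, r_min) / r_min) / log_span * n_radial)
        r_bin = min(r_bin, n_radial - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = int(theta / (2 * math.pi) * n_angular) % n_angular
        hist[r_bin * n_angular + a_bin] += 1
    return hist


def chi2_cost(g, h):
    """Chi-squared distance between two histograms after normalizing each to sum 1."""
    g_sum, h_sum = sum(g), sum(h)
    cost = 0.0
    for a, b in zip(g, h):
        a, b = a / g_sum, b / h_sum
        if a + b > 0:
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost


# Toy point sets standing in for edge points sampled from a figure.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(x + 5, y + 3) for x, y in square]   # same shape, translated
line = [(0, 0), (1, 0), (2, 0), (3, 0)]         # a different shape

same_cost = chi2_cost(shape_context(square, 0), shape_context(shifted, 0))
diff_cost = chi2_cost(shape_context(square, 0), shape_context(line, 0))
print(same_cost, diff_cost)
```

Because the descriptor uses only relative positions normalized by the mean pairwise distance, a translated copy of the same point set yields an identical histogram, while a differently shaped set yields a positive cost.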

The second approach searches for a 2d model of a human in still images. The novel aspect of this approach is that the image is first segmented into "superpixels" as a pre-processing step, which reduces the complexity of the search (Mori, 2005).
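
To illustrate why an oversegmentation shrinks the search, the sketch below groups pixels into regions and counts how many units a search would then have to consider. It is only a stand-in under stated assumptions: a simple flood fill over similar neighboring intensities replaces the actual superpixel algorithm, which is not reproduced here, and the function name `oversegment`, the tolerance `tol`, and the toy image are hypothetical.

```python
from collections import deque


def oversegment(image, tol):
    """Label 4-connected regions whose adjacent pixels differ by at most tol.

    A naive stand-in for superpixel computation, used only to show the
    reduction from pixels to regions. Returns (labels, region_count).
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # already assigned to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and abs(image[ny][nx] - image[y][x]) <= tol):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


# Toy grayscale image: a dark left half and a bright right half.
image = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
labels, n_regions = oversegment(image, tol=10)
n_pixels = len(image) * len(image[0])
print(n_pixels, "pixels grouped into", n_regions, "regions")
```

A search that assembles candidate body parts from regions rather than from individual pixels has far fewer elements to combine, which is the sense in which the pre-processing step reduces complexity.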

MATLAB source code for computing "superpixels" is available here.

We have also applied the exemplar-based approach to video sequences (McIntosh et al., 2007). The exemplar matches initialize a kinematic tracker, which automatically discovers part appearance models from the video sequence. The appearance model for each half-limb is determined by performing a local segmentation of each image, and the best models are then used for tracking.

Results of applying this technique to the CMU Mobo dataset are shown below.

Chris McIntosh, Ghassan Hamarneh and Greg Mori. Human Limb Delineation and Joint Position Recovery Using Localized Boundary Models. IEEE Workshop on Motion and Video Computing, 2007. [pdf]
Greg Mori and Jitendra Malik. Recovering 3d Human Body Configurations Using Shape Contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006. [pdf]
Greg Mori. Guiding Model Search Using Segmentation. IEEE International Conference on Computer Vision, 2005. [pdf]