We are researching automatic methods for
estimating the pose of human figures in
still images and video sequences.
For still images of human figures, we have been
exploring two complementary approaches to the
problem of localizing 2D joint positions. The
first performs a top-down search based on shape
matching to a set of stored "exemplar" human
figures (Mori and Malik, 2006). Exemplars are
matched by shape, using "shape contexts" as the
descriptor.
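
As a rough illustration of the shape-context idea, the sketch below builds a log-polar histogram of where the other sampled points lie relative to one reference point, and compares two such histograms with a chi-squared cost. It is not the matching code used in the cited work. The function names, the bin counts, the radius limits, and the clamping of out-of-range distances are all assumptions made for this example.

```python
import math


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all distinct pairs of points."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += math.hypot(points[i][0] - points[j][0],
                                points[i][1] - points[j][1])
            count += 1
    if count == 0 or total == 0.0:
        raise ValueError("need at least two distinct points")
    return total / count


def shape_context(points, index, n_radial=5, n_angular=12,
                  r_min=0.125, r_max=2.0):
    """Log-polar histogram of all other points, seen from points[index].

    Bin counts and radius limits are illustrative assumptions.
    """
    px, py = points[index]
    # Dividing by the mean pairwise distance makes the descriptor scale-invariant.
    scale = mean_pairwise_distance(points)
    log_min, log_max = math.log(r_min), math.log(r_max)
    hist = [[0] * n_angular for _ in range(n_radial)]
    for i, (x, y) in enumerate(points):
        if i == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / scale
        # Assumption: distances outside [r_min, r_max] are clamped to the edge bins.
        r = min(max(r, r_min), r_max)
        r_bin = int((math.log(r) - log_min) / (log_max - log_min) * n_radial)
        r_bin = min(r_bin, n_radial - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin][t_bin] += 1
    return hist


def chi2_cost(h1, h2):
    """Chi-squared distance between two histograms, each normalised to sum to 1."""
    a = [v for row in h1 for v in row]
    b = [v for row in h2 for v in row]
    sa, sb = sum(a), sum(b)
    cost = 0.0
    for x, y in zip(a, b):
        x, y = x / sa, y / sb
        if x + y > 0:
            cost += (x - y) ** 2 / (x + y)
    return 0.5 * cost
```

Because the histogram is built from relative positions, translating every point leaves the descriptor unchanged. A low chi-squared cost between a point on the query shape and a point on an exemplar suggests the two points sit in similar positions on their shapes.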
The second approach searches for a 2D model of a
human in still images. The novel aspect of this
approach is the use of segmentation of the image
into "superpixels" as a pre-processing step to
reduce the complexity of the search (Mori, 2005).
MATLAB source code for computing "superpixels"
is available here.
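
To give a sense of why a superpixel pre-processing step shrinks the search, the sketch below takes an already computed label map (one superpixel id per pixel) and summarizes each superpixel by its centroid and size. A search over candidate part locations can then consider one entry per superpixel instead of one per pixel. This is neither the MATLAB code mentioned above nor the segmentation method itself. The label map is assumed to be given, and the function name is hypothetical.

```python
from collections import defaultdict


def superpixel_regions(labels):
    """Summarise a label map as {superpixel id: (row centroid, col centroid, size)}.

    `labels` is a 2D list of integer ids, assumed to come from some
    earlier segmentation step that is not shown here.
    """
    sums = defaultdict(lambda: [0, 0, 0])  # row sum, column sum, pixel count
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            acc = sums[label]
            acc[0] += r
            acc[1] += c
            acc[2] += 1
    return {label: (sr / n, sc / n, n) for label, (sr, sc, n) in sums.items()}
```

For an image with hundreds of thousands of pixels and a few hundred superpixels, the set of candidate locations shrinks by about three orders of magnitude.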
We have also applied the exemplar-based approach
to video sequences (McIntosh et al., 2007). The
exemplar matches are used to initialize a
kinematic tracker, which automatically discovers
part appearance models from the video sequence.
A part appearance model for each half-limb is
obtained by performing a local segmentation of
each image; the best models are then used for
tracking.
Results of applying this technique to the CMU
Mobo dataset are shown below (click image to see
video sequence).