Image Analysis and Pattern Recognition

My research interests lie at the intersection of pattern recognition and image processing. In particular, I am interested in extracting information from images, and in identifying and extracting the features needed to do so. I have begun initial work on two projects in this area: identifying and extracting clothing structure information from product photos, and tracking user states and actions via a webcam.

Past projects from grad school in this research area more often focused on content-based image retrieval (CBIR).

  • Induction method for constructing decision trees from multi-label data. This project involved developing a learning algorithm that assigns labels or tags to new items based on the labels end users assigned to prior items (i.e., tag cloud data).
  • Method of approximating the K-median in a distributed environment, for K-median clustering. Used for clustering images; intended for applications where the mean vector of vector-represented items does not make sense, but a centroid or median would be reasonable.
  • Active learning for region-of-interest image segmentation. Used for user-assisted whole-object segmentation in photographs, tumor segmentation in MRI images, and determining the percentage of cell staining in tissue micro-array images.
  • Indexing and retrieval of offline handwritten notes, based on an automatically generated codebook of words and objects.
  • N-gram hashing of local shape outline segments. Used for retrieval of cloud-like objects in satellite imagery.
  • Hausdorff metric for vector quantization of line images. Used to differentiate types of images (natural versus urban, for example).
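
To illustrate the last item, the symmetric Hausdorff distance between two point sets can be sketched as below. This is only a minimal illustration, assuming a line image has already been reduced to a set of 2D points. The function names and sample data are hypothetical and are not taken from the original project, which may have used a different variant of the metric.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical point sets standing in for two line-image fragments.
line_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
line_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]

print(hausdorff(line_a, line_b))
```

The brute-force nested loop is quadratic in the number of points, which is fine for a sketch but would likely need a spatial index or sampling for full images.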

In addition to image analysis, I have some interest in more general pattern classification problems, especially those that involve adapting algorithms to unique and unusual data types or applications. For example, while working at the DOE's JGI Production Genomics Facility, I developed methods to extract useful features from four-channel signal data, then developed a learning approach to compensate for the difference and imbalance between training set distributions and the actual production distribution of positive and negative instances.