A new approach to automatically sorting, classifying and retrieving digital images -- based on the way people look at and understand pictures -- promises faster, more accurate image database searches, including better web searches.
Dr. James Z. Wang, Penn State assistant professor of information sciences and technology who holds the PNC Technologies Career Development Professorship, developed the approach when he was a graduate student at Stanford University.
He says the new approach has potential for application in biomedicine, crime prevention, the military, commerce, education, entertainment and web image classification.
The new approach considers no information other than the image itself. Just as a person shown a picture of a horse can extract the features characteristic of horses and then identify other pictures that contain horses, so can the new computer-based approach.
The new system retrieves relevant images from an image database or the web on the basis of automatically derived image features or content.
Image retrieval techniques currently in commercial use tend to rely on keywords or descriptions. While this text-based approach can be accurate and efficient for limited databases of high value -- for example, museum pictures -- it can be prohibitively expensive to enter descriptions manually for large-scale image databases such as astronomical observations.
The new approach not only reduces the need for textual information but also can handle, quickly and efficiently, the approximately one billion images that can be found on the Internet.
Wang and colleagues have built an experimental image retrieval system, called SIMPLIcity, to validate and demonstrate their methods. It has been tested on a database of about 200,000 general-purpose images and an archive of more than 70,000 medical pathology images.
The SIMPLIcity approach performs better and faster than existing methods and can also be applied to the classification of on-line images and websites.
(Editor's Note: To view a demonstration of SIMPLIcity in action, go to this website.)
Using the same approach, the Penn State scientists have also developed an image filtering system, called WIPE, that parents can use to protect their children from pornography on the web. WIPE identifies and blocks objectionable images. It takes only one second per picture, while other filters require minutes, and has an accuracy of more than 90 percent.
(A demonstration of WIPE is also available at the same URL.)
Wang has detailed his content-based image retrieval (CBIR) approach in a new book, Integrated Region-Based Image Retrieval, published this month by Kluwer Academic Publishers. The book details the design and implementation of the new content-based retrieval system and its application to a general picture library and a biomedical image database.
Wang notes that the capability of existing CBIR systems is essentially limited by the fact that they rely on only primitive features of the image. In his new approach, Wang matches the features used to classify an image to the type of picture.
For example, a color layout indexing method may be best for outdoor pictures, while a region-based indexing approach may be better for indoor pictures. The biomedical image database can be categorized into X-ray, MRI, pathology, graphs, micro-arrays and other features specific to the types of images in the collection.
For general-purpose image libraries and the web, Wang has classified images into textured vs. non-textured, graph vs. photograph. His approach represents the first time that categories such as textured vs. non-textured have been used as a distinguishing feature in image retrieval.
In addition to using new image features as classification tools, SIMPLIcity uses a similarity measure based on information about the entire image rather than representative segments.
In traditional approaches, computer programs may segment one image of a dog, for example, into two regions: the dog and the background. The same program may segment another image of a dog into six regions: the dog's body, the dog's front legs, the dog's rear legs, the dog's eyes, the background and the sky. The inconsistent segmentation makes it harder to make a match.
In SIMPLIcity, an overall "soft similarity" approach reduces the influence of inaccurate segmentation. The most similar region pairs are matched first and then the matching process is "softened" by allowing one region of an image to be matched to several regions of another image. In this way, all of the regions of the images are taken into consideration.
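The matching idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SIMPLIcity implementation. The region representation (a feature vector plus an area weight, with weights summing to 1 for each image), the Euclidean distance between features, and the function name soft_region_distance are all hypothetical choices made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def soft_region_distance(regions_a, regions_b):
    """Greedy 'soft' matching between two segmented images.

    Each image is a list of (feature_vector, weight) pairs. The weights
    are assumed to be area fractions that sum to 1 per image.
    Returns a weighted distance; 0.0 means the images match perfectly.
    """
    # Weight of each region that has not yet been matched.
    remaining_a = [weight for _, weight in regions_a]
    remaining_b = [weight for _, weight in regions_b]

    # All region pairs, most similar (smallest distance) first.
    pairs = sorted(
        (dist(feat_a, feat_b), i, j)
        for i, (feat_a, _) in enumerate(regions_a)
        for j, (feat_b, _) in enumerate(regions_b)
    )

    total = 0.0
    for distance, i, j in pairs:
        # Match as much weight as both regions still have available.
        share = min(remaining_a[i], remaining_b[j])
        if share <= 0:
            continue
        total += share * distance
        remaining_a[i] -= share
        remaining_b[j] -= share
    return total


# One image segmented into a single region, another into two regions.
image_a = [((0.0, 0.0), 1.0)]
image_b = [((0.0, 0.0), 0.5), ((3.0, 4.0), 0.5)]
print(soft_region_distance(image_a, image_b))  # prints 2.5
```

Because a region's leftover weight can be spread across several regions of the other image, the single region of image_a is matched against both regions of image_b. An image split into two regions and the same image split into six can therefore still be compared region by region.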
"SIMPLIcity is robust to intensity variation, sharpness variations, color distortions, other distortions, cropping, scaling, shifting and rotation," Wang says. "The system is also easier to use than other region-based retrieval systems."
The work was supported primarily by a research grant from the National Science Foundation's Digital Libraries Initiative and a research fund from the Stanford University Libraries. Additional support came from IBM Almaden Research Center, NEC Research lab, SRI International, Stanford Computer Science Department, Stanford Mathematics Department, Stanford Biomedical Informatics, The Pennsylvania State University and PNC Foundation.
The work is ongoing at Penn State.
[Contact: Dr. James Z. Wang, A'ndrea Elyse Messer]