Report on Content Based Image Retrieval (CBIR)

Monica Haladyna
Professor Bijoy Misra
Olivier Carnohan

Sample of retrieval by color: figure 1 below

These jade figurines, for example, could be retrieved in an image retrieval system by means of color. Since some of my pictures have dark colors and tones, this would be a good way of retrieving pictures whose names you don't remember, as long as you remember that they were dark or how the colors are distributed in a color histogram. Another way to retrieve such an image would be to search by color layout, either choosing something similar from the thumbnails or typing in the percentages of shades that could give this or similar results. This image is named OlVentaCoeFigurines.jpg.

Sample of retrieval by color: figure 2 below

The name of this picture is Mother and Child figurine from Tlatilco, one of the oldest figurines from the New World. As one can see, this image strongly resembles the gathering of warriors above: the colors are dark and the background is the same. If a person needed to find a picture such as this one in a relatively small database, they could search by a histogram that has a specific amount of color distributed throughout the picture, or by typing the amounts of shades, such as 30% red and 50% black. Either way the picture could be searched and retrieved successfully, unless there were thousands of pictures, which would make the search more complex and require further refinement.

Sample of retrieval by color: figure 3 below

 

This is the third example, a clay mask from Teotihuacan that represented a god. As can be noted, the colors and the shades of red and black are similar to those of the first two images above. A search using a histogram that pre-establishes the luminosity and/or the distribution of color throughout the pictures could come up with these results.

Sample of retrieval by color: figure 4 below

This is the Classic Stone Head from one of the books by Michael D. Coe. It is a Post-Olmec sculpted stone head of a warrior. If we did a search by color on this one, it would most likely show up along with the ones shown above, provided the search specified particular amounts of dark tones in the color histograms. It is hard to say just how many of the pictures would actually appear if there is no link to a file, as there is in QBIC. That means a great deal of work would have to be performed on each individual image, such as storing different types of histograms in a database that would then look for patterns established beforehand for each image. Color distribution histograms, luminosity histograms, and the layout of color throughout the picture would also have to be stored.

 

RETRIEVAL BY TEXTURE SAMPLE IMAGES

These samples show how texture-based content retrieval could be done. The four samples chosen have very different textures; this follows the concept that QBIC supports, that a great variety of images should be used so that there are plenty of textures to choose from. Below are the examples that have been chosen for this demo.

Sample of retrieval by texture: image 1 below

This is one side of a temple in the city of Mitla, in Oaxaca. A sample of this could be almost ideal for a texture-based query, since there are patterns on the facade and the query could retrieve the most similar images from all types of places and objects. This type of texture-based search would be very effective with this particular sample on a large database with hundreds of architectural facades. If that were possible, it would also help me very much to complete my search for images from Mesoamerica. However, it is important to keep in mind that all these samples are "real-world" images, and that a texture search would be searching all the multiple textures that could apply to a particular sample.

Sample of retrieval by texture: image 2 below

This image could be cropped to include a small fraction of a pyramid step, which could represent the many pyramids, mainly in Mexico, that could be searched by texture alone. The steps all have a certain directionality, which would enable the query-by-content search.

Sample of retrieval by texture: image 3 below

This is the pyramid of Tikal. If a sample piece were cropped from one of the sides and used for the texture query, the system (once the texture features were pre-established) would recognize the granular structure and perhaps some of the steps that have the same direction.

Sample of retrieval by texture: image 4 below 

This is just a small, cropped piece of pyramid steps that can be used as an example of the diverse real-world textures, which show a variety of the factors that make up a texture query. The directionality of the steps could be one key concept, or, as I mentioned above, the granular texture. Naturally, the image would have to be focused and enhanced so that the texture is recognizable. But these are just some samples to be considered among endless types.

 

Introduction:   CBIR, or Content Based Image Retrieval, is a relatively "new" way to retrieve images that could otherwise be hard to access. The images can be searched by color, texture, and shape, which are considered the primitive features. They can also be searched by more advanced retrieval methods, such as logical features (the objects shown) or abstract features (such as the important scenes depicted); however, these two are still under development. For the purposes of this document, only the primitive features will be analyzed and shown with a few examples that are annexed to the left. The main aim of CBIR is to retrieve images automatically by extracting the most important raw data. This is an innovative aspect, since no one would really like to go through thousands of pictures looking for a particular picture or one similar to the picture they had in mind. Most CBIR software uses thumbnails as examples, so the viewer can either start by choosing one that conforms to the image he/she had in mind or start searching by color histograms, color distribution, or any other form that is available. When the query is sent, approximately 20 images are output to the screen. These images are the result of the search and can be further refined, but it is important to keep in mind that the images are retrieved automatically and hold no links to a certain category.

1) Retrieval by color: The selected images on the left are four examples of how the color search could be accomplished. For example, if an individual wanted to find a specific image but could not remember the subject name, theme, or anything about the picture except the colors or shades used, then just about any CBIR search engine could be used to search for it. Naturally, there could be hundreds or even thousands of choices in one database, which is one of the reasons there is more than one way to search images based on content. Each image would be stored with a color histogram showing the proportion of its pixels that fall into each color. The user would then choose a sample (as in QBIC) that resembled the image he/she had in mind, and the database would automatically retrieve approximately twenty images that have just about the same characteristics. This histogram matching technique was introduced by Swain & Ballard in 1991.
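The histogram comparison described above can be sketched in a few lines of Python. This is only an illustration, not how QBIC or any other product is actually implemented: the function names, the choice of 4 bins per channel, and the toy pixel lists are all assumptions made for the example. The similarity measure is histogram intersection, the measure associated with Swain & Ballard's color indexing work.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram over quantized RGB colors.

    pixels is an iterable of (r, g, b) tuples with channel values 0..255.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the bin-wise minima of two normalized histograms."""
    return sum(min(v, h2.get(bin_, 0.0)) for bin_, v in h1.items())

def rank_by_color(query_pixels, database, top_k=20):
    """Return the names of the top_k stored images most similar to the query."""
    q = color_histogram(query_pixels)
    scored = [(histogram_intersection(q, hist), name) for name, hist in database.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical toy "images": flat lists of RGB pixels.
dark_figurine = [(20, 10, 10)] * 70 + [(150, 30, 30)] * 30
dark_mask = [(25, 15, 5)] * 60 + [(160, 40, 20)] * 40
bright_sky = [(120, 180, 250)] * 100

# The database stores one precomputed histogram per image.
database = {
    "dark_mask": color_histogram(dark_mask),
    "bright_sky": color_histogram(bright_sky),
}
print(rank_by_color(dark_figurine, database))
```

In this toy data the dark mask ranks ahead of the bright sky, because it shares most of its dark and red bins with the query image.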

One of the databases that would be excellent for the retrieval of images by color and texture, but not shape, is QBIC (Query By Image Content) from IBM. In QBIC the way to search is basically visual. There are no words or keywords to be typed; you search only by layout, color histograms, textures, and a special query search (called Qb Hybrid, another color feature class). Once you have chosen the way you want to search for similar images, you will get quite a few results. When you have the picture you want, all you have to do is click on it, and the system will isolate it from the rest, enlarge it, and give you all the information pertaining to it, such as the year in which the image was created and other details that could be valuable to the researcher, such as the name of the photographer or painter. There is another custom query search where the user can decide what percentage of each color is to be searched. This is very useful when the user knows which tones, colors, or distribution of color he/she wants in the image that is being sought.

Another useful site to consider is the UC Davis QBIC Project. It employs basically the same methods as the IBM QBIC database, except that it has a section in which you can search not only by color percentage, color layout, and texture but also by choosing the colors that you want and drawing a sketch. The results are excellent. The sketch drawn was a tree, and consequently the result was many trees of roughly the same shape. The thumbnails don't look high quality when they are on display, but once they are clicked and isolated you get some of the most remarkable images from the Digital Art Library at UC Davis. One sample image was looked up to see how effective the search was, and the result was a painting named The Conversation.
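The custom query by color percentage mentioned above could look roughly like the following sketch. It is not QBIC's actual method: the five-color palette, the nearest-color rule, and the 10-point tolerance are assumptions chosen only to make the idea concrete.

```python
# Hypothetical reference palette; a real system would use many more colors.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (200, 30, 30),
    "green": (30, 160, 30),
    "blue": (30, 30, 200),
}

def nearest_color(pixel):
    """Name of the palette color closest to pixel in squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )

def color_percentages(pixels):
    """Percentage of pixels assigned to each palette color (pixels is a list)."""
    counts = dict.fromkeys(PALETTE, 0)
    for p in pixels:
        counts[nearest_color(p)] += 1
    return {name: 100.0 * n / len(pixels) for name, n in counts.items()}

def matches_query(percentages, query, tolerance=10.0):
    """True if every requested color is within `tolerance` percentage points."""
    return all(abs(percentages[name] - wanted) <= tolerance
               for name, wanted in query.items())

# Hypothetical image: mostly black with a large red area and a little white.
clay_mask = [(180, 40, 35)] * 35 + [(15, 10, 10)] * 55 + [(240, 240, 235)] * 10

# A query such as "30% red, 50% black".
query = {"red": 30, "black": 50}
print(color_percentages(clay_mask))
print(matches_query(color_percentages(clay_mask), query))
```

A looser tolerance returns more candidates, and a tighter one returns fewer, which is one simple way the refinement step could work.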

So far, most of the techniques covered in this paper on CBIR have made direct reference to the QBIC database, because it is widely used in many universities and companies. There is other software that can be used for similar content-based retrieval, such as the Excalibur software.

For the images that I will be using for the final project, a great majority could easily be searched by color. The ones chosen for the color examples have, in fact, very similar luminosity histograms, which I checked with Photoshop 5.5. There are other images that are grayscale, which would make the search much easier. First, there aren't too many grayscale images in the final project. Second, small thumbnail samples showing different shades of gray could be displayed, and depending on which image a person wanted to find in grayscale mode, most of them would be instantly retrieved. As for the other images that could be searched by color histogram (the vast majority), it would be more difficult to incorporate a color query. The main reason is that some of the pictures are scanned from books that are in black and white, or from faded postcards that do not fully represent the true colors of the objects photographed. Suppose a person is looking for the true colors of an original object, let's say a white marble sculpture that they have seen in a museum, and they try to find it in a database and to their surprise cannot. It is possible that the statue's image was there but in a different shade, which the person did not think of. The white marble statue could have been scanned from a faded book, so the scan did not keep the true colors of the real object but instead produced a variation in its color scheme. Maybe the user should have checked the shades of yellow! For this and other reasons, I believe that searching by color in my own collection would be flawed. There are only a few images that I can personally verify as "true" to the color of the object itself.
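As a rough illustration of the luminosity histograms and the grayscale check discussed above, here is a small Python sketch. The brightness weights are the common Rec. 601 values, which is an assumption (Photoshop's own calculation may differ), and the function names, bin count, tolerance, and sample pixels are all hypothetical.

```python
def luminance(pixel):
    """Approximate brightness of an (r, g, b) pixel using Rec. 601 weights (an assumption)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def luminosity_histogram(pixels, bins=8):
    """Fraction of pixels falling in each of `bins` equal-width brightness ranges."""
    hist = [0] * bins
    for p in pixels:
        index = min(int(luminance(p) * bins / 256), bins - 1)
        hist[index] += 1
    return [n / len(pixels) for n in hist]

def is_grayscale(pixels, tolerance=8):
    """True if every pixel has nearly equal red, green, and blue values."""
    return all(max(p) - min(p) <= tolerance for p in pixels)

def histogram_distance(h1, h2):
    """L1 distance between two histograms: 0 means identical, 2 means no overlap."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Hypothetical scans: a black-and-white book page and a yellowed postcard.
scanned_bw_page = [(90, 92, 91)] * 50 + [(200, 198, 199)] * 50
faded_postcard = [(210, 190, 140)] * 100

print(is_grayscale(scanned_bw_page), is_grayscale(faded_postcard))
print(histogram_distance(luminosity_histogram(scanned_bw_page),
                         luminosity_histogram(faded_postcard)))
```

A check like `is_grayscale` could route black-and-white scans to a gray-shade search, while the yellowed postcard shows how a faded scan ends up with a color cast that a color query for "white marble" would miss.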

There are a few places to check out that have used QBIC, such as the French Ministry of Culture, which is also a very interesting and informative site.

 

2) Retrieval by Texture: This could be directly applied to the image collection that I have assembled. Again, there are some difficulties, since I believe that the quality of some of the digitized material is perhaps not the best or most reliable. However, QBIC could be nicely structured for a texture search, since most of the Mesoamerican themes in the collection have many traits in common. For example, the sample images of many pyramids throughout the Mexican republic are very similar in the materials that were used and have a particular type of color, but most important is the distinct granular texture. The samples that I chose have slight variances but can nevertheless be made to fit a certain pattern. According to QBIC, texture retrieval would be done through second-order statistics, which are calculated from the query image and the stored images. These are based on the relative brightness of selected pairs of pixels from each image (Eakins & Graham). From them, measures such as contrast, coarseness, directionality, and randomness are derived and compared between images. For my samples, coarseness would be the anchor for looking up images in the collection: the thumbnail samples would include a few images that have the same or similar texture (coarseness). The only problem with texture is that you get images that may have the same or similar texture but very different colors or shades from the ones looked for. The benefit of using QBIC would be that it's visual, and you don't have to remember details such as the name of the image or the artist. There is not such a large variety of CBIR software as of now, so let's look at what Excalibur has to offer. The images stored and retrieved in Excalibur are not so different from those in the QBIC database; however, Excalibur stores its images by binary pattern recognition, in other words by how the pixel values are distributed throughout the image. It is also notable that in the software you can point to a certain picture and have it link to the internet and look for similar ones! One must keep in mind that if you search for a texture similar to that of a dog, you will get many different and varied results. This is because the pictures are retrieved according to their pixel value distribution or the directionality of the image, not the object itself. For a closer look at how texture can be applied to searches, there is an example from the MIT Photobook, and there is another sample that really fits the concept of closely related texture-based images. The images for my project are all "real-world" based, so a texture search would have to search through multiple textures to get the best results.
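To make the idea of second-order statistics on pairs of pixels more concrete, here is a minimal Python sketch using a gray-level co-occurrence table and a contrast measure. This is a swapped-in textbook technique used for illustration, not a description of QBIC's or Excalibur's internals; the function names, the 4 gray levels, and the tiny test patch are assumptions.

```python
from collections import Counter

def cooccurrence(image, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence counts for pixel pairs offset by (dx, dy).

    image is a list of rows of gray values 0..255, quantized to `levels` levels.
    """
    step = 256 // levels
    height, width = len(image), len(image[0])
    pairs = Counter()
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                pairs[(image[y][x] // step, image[y2][x2] // step)] += 1
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()}

def contrast(matrix):
    """Average squared difference between the gray levels of paired pixels."""
    return sum(p * (i - j) ** 2 for (i, j), p in matrix.items())

# Hypothetical 4x4 patch of horizontal "pyramid steps": rows alternate dark and light.
steps = [[0] * 4, [255] * 4, [0] * 4, [255] * 4]

horizontal = contrast(cooccurrence(steps, dx=1, dy=0))  # pairs along the steps
vertical = contrast(cooccurrence(steps, dx=0, dy=1))    # pairs across the steps
print(horizontal, vertical)
```

For this patch the contrast is zero along the steps and large across them, which is one simple way the directionality of pyramid steps could show up in a texture query.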

 

 

3) Retrieval by Shape: This would be a CBIR search in which the objects have been determined beforehand. Shape is also a primitive feature, by which the objects are recognized and retrieved. Queries are answered by computing a set of features for the query image and retrieving those stored images whose features most closely match those of the query. These include global features, such as aspect ratio and moment invariants, and local features, such as sets of consecutive boundary segments. There are also "shocks," which are skeletal representations of an object's shape that can be compared to another's using a special graph. But this I would not use, since there are too many shapes and it would complicate the query for the user instead of facilitating it.
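The feature matching described above (compute features for the query, then retrieve the stored image with the closest features) can be sketched as follows. Only two simple global features are used, aspect ratio and the fraction of the bounding box that the object fills; moment invariants and boundary segments are left out. The function names and the small binary masks are hypothetical.

```python
def shape_features(mask):
    """Return (aspect_ratio, extent) for the object in a binary mask.

    mask is a list of rows of 0/1 values; 1 marks object pixels.
    aspect_ratio is bounding-box width / height; extent is the fraction
    of the bounding box that the object fills.
    """
    cells = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return width / height, len(cells) / (width * height)

def closest_shape(query_mask, stored):
    """Name of the stored shape whose features are nearest to the query's."""
    q = shape_features(query_mask)

    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(q, stored[name]))

    return min(stored, key=distance)

# Hypothetical masks: a stepped pyramid outline and a tall column.
pyramid = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
]
column = [[1], [1], [1], [1]]
small_pyramid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]

# Features are computed once and stored, as with the histograms for color.
stored = {"pyramid": shape_features(pyramid), "column": shape_features(column)}
print(closest_shape(small_pyramid, stored))
```

Because aspect ratio and extent do not depend on the size of the object, the small pyramid matches the larger one rather than the column.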

Other Search Options for CBIR: It was very useful and educational to read about Iconclass. It is a way of retrieving images by choosing the icon you want. Each icon represents a certain theme, so once you have an idea of what you are looking for you can keep refining your search. This would be ideal on a database that has many images with numerous themes, where the icons can specify which area you want to look into. For example, one icon could represent art and a later icon could represent certain periods in art history, so the refinement would not be hard: many pictures would be retrieved, and then if there are specifics, something else could be queried.

There are other CBIR sites, such as Excalibur, that should be noted, since they offer just about the same software as QBIC. Excalibur's quality and offerings seem a bit more promising, since it offers not only the same primitive types of search but also support for Java applets and other more sophisticated things. Also, in Excalibur the images are not stored as histograms but in a binary way, and they are retrieved according to the binary pattern that each image has.

Another place that perhaps doesn't relate entirely to CBIR but is worthy of mention is ICONCLASS, which was invented in the Netherlands. It is based on being able to search anything on the web with three search modes and a hierarchy of categories. It can search very well documented bibliographic material and, when specified, can search in a way similar to Excalibur for specific pictures or for more abstract concepts such as a specific document.

One other internet site, which I actually found the most useful with regard to user interface, speed, and precision for retrieving images, is called AMORE. It has two distinct ways to search, by either "very similar" or "semantically similar." The latter I found the most advanced and precise, with speedy retrieval of about 50 art images after I specified which one I chose for the query. The image chosen for the query is set aside, so that you always have a reference and can really compare whether the results are similar in either the primitive features or the more advanced theme or semantic search. It is the one database that I would seriously consider if I could implement one for my final project.

There are many other uses for CBIR, such as in the medical community, for detailed geographical information, in architectural design, and in almost any area where individuals need to check against some sort of visual reference.

Purdue University is currently working on a CBIR project, and the University of Maryland also has a CBIR project.

Art & Text Media, the Arthur humanities search by query. Photo Database Manage Systems is also interesting to look at.


LLC mhbsolutions © 2007 All Rights Reserved