Under the Desktop: Searching for the Right Image
Some have declared that Internet searching is the next “killer app” for computing. The evidence for that claim can be seen in the continued growth of search engine companies and in the ubiquitous search box found on Web sites.
Search engine companies keep finding new content areas to explore, such as news, blogs, and images. Professional content creators, above all other Web users, are familiar with image searches, whether from a general search engine such as Google or within a stock image library. However, the results of such searches can vary widely.
In a previous column, Ogling Google and Other Creative Search Strategies, I offered some suggestions for finding information specific to content creation workflows. However, these were all text-based searches.
Searching for images is much more mysterious. In particular, when looking for images with a general search engine, it’s often difficult to understand the results and how the images relate to the search terms.
Quantity over Quality
The first question to answer is: How do search engines find the images they present to us? Even the experts I contacted found it a difficult and seldom-explored topic.
“Each engine’s algorithm [for image searching] is incredibly different,” observed Avi Rappoport, the principal consultant of Search Tools Consulting. “Basically, they all guess on the image based on context.”
Contrary to the belief of some content creators, search engines don’t actually look at an image and then generate some information about its content. That technology is called image analysis, and most companies in the field have switched their business models from general image content to analyzing photos for homeland security applications.
Instead, as Rappoport points out, the engine determines relevance by looking at the whole page, noting the placement of the images and the text immediately before and after the image, as well as filenames and alt tags.
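To make that idea concrete, here is a minimal Python sketch of the kind of context a crawler could gather for each image: the page title, the image’s filename, its alt text, and the text just before and after it. This is only an illustration built on Python’s standard html.parser module. The class name, function name, and sample page are hypothetical, and no real search engine is claimed to work this way.

```python
from html.parser import HTMLParser
from os.path import basename
from urllib.parse import urlparse


class ImageContextParser(HTMLParser):
    """Collects, for each <img>, the textual hints described above (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []        # one dict of signals per <img>
        self._in_title = False
        self._last_text = ""    # most recent body text seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            attr = dict(attrs)
            self.images.append({
                "filename": basename(urlparse(attr.get("src") or "").path),
                "alt": attr.get("alt") or "",
                "text_before": self._last_text,
                "text_after": "",
            })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = " ".join(data.split())   # collapse whitespace
        if not text:
            return
        if self._in_title:
            self.title += text
            return
        # The first text after the most recent image becomes its "text_after".
        if self.images and not self.images[-1]["text_after"]:
            self.images[-1]["text_after"] = text
        self._last_text = text


def image_signals(html):
    """Return (page title, list of per-image context dicts)."""
    parser = ImageContextParser()
    parser.feed(html)
    parser.close()
    return parser.title, parser.images


# Made-up sample page for illustration only.
SAMPLE_PAGE = """
<html><head><title>Playful cat portraits</title></head>
<body>
<p>A kitten chasing yarn.</p>
<img src="/photos/playful-cat.jpg" alt="Kitten playing with yarn">
<p>Shot in natural light.</p>
</body></html>
"""

title, images = image_signals(SAMPLE_PAGE)
print(title)
print(images[0])
```

An engine would then have to weigh these hints against the search terms. That weighting is where, as Rappoport notes, each engine differs, and the sketch makes no attempt to reproduce it.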
Rappoport said there’s no widely accepted format for adding subject-related information such as keywords and ownership. She pointed to the ID3 tags that provide similar data for MP3 audio files. “Keywords for images would be a very good thing.”
For example, when I performed a Google image search for “David Morgenstern,” I was presented with several pages of photos and even letterhead (see Figure 1). I thought that photos of me from my personal Web site would come up first. No way! There were snapshots of other David Morgensterns, mostly kids; some shots from a Macworld tradeshow party (but not of me); and even one of William H. Macy portraying a character on the television show ER. In this first set of images there were several from my Perry Mason Picture Pages, but pictures of me fell below the fold.
Figure 1: The Google image search presents several screens of thumbnail images. Each photo includes its title, dimensions, size, and URL. In this search for “David Morgenstern,” there are many results, some with an emphasis on “Morgenstern,” such as a soccer team photo with listings for players Nils Morgenstern and David Kunze.
In addition to problems with relevance, some image searches can overwhelm the viewer with results. A search for the word “cat” brings 625,000 entries. Narrowing the search by adding the modifier “playful” provides a more-manageable 479 entries. However, many of the results were objects that included a decoration of a playful cat, such as clocks, stained glass windows, and music boxes.
Regardless, most of these images are small, of indeterminate quality, and questionable to use, given that most are posted without any rights or usage information.
Where the Images Are
As I mentioned in my previous column, the best way to search for subject-specific information is the street light approach: search where the light is brightest or, in this case, where the answers can be found.
Is there a place where images have been cataloged with useful tags and keywords, and where there is an established policy for rights? Yes, of course: stock photography services.
You can see the difference between an image search using a general engine and a directed search with Creativepro’s own Image Grabber search tool (see Figure 2).
Figure 2: Who doesn’t love puppies and kittens? This view of the Image Grabber search tool shows 16 images from the Alamy Images service. I could have set it to display 24 or 32 photos per page.
In this tool, a search for “playful cat” brings up dozens of useful images of cats and kittens playing, some with children or with other felines. There’s one of a child making a cat’s cradle with yarn, but that’s okay. Adding to the convenience, this engine looks at a number of stock photography services with one search. Image Grabber covers only a subset of stock services; many more can be found in Creativepro’s company directory and on the Internet.
Do It Yourself
Many readers may have a Web site showcasing an online portfolio. For those content creators putting up images on the Web, Rappoport offered some suggestions to help search engines discover your images.
First, she said that it’s best to have just one image per page. “With search engines you need the right granularity,” she declared, meaning that the search engine can better judge relevance on a single, shorter page holding just a single image than it can for a page with 100 images. It’s much the same with finding a single dictionary entry, rather than finding a couple of terms sprinkled in a long FAQ document.
In addition, authors should take care to write the page’s title and the text on the page itself with search engines in mind. The words should attempt to match the terms people might enter in a search engine. That may be the subject of the piece, rather than your name or the branding of your company. Some search engines also look at the alt text for images, so be sure to fill that out as well. Finally, be sure your rights and usage requirements are clearly stated.
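These suggestions can be turned into a rough self-check for a portfolio page. The Python sketch below, using only the standard library, counts the images on a page, flags those with no alt text, tests whether the words of a likely search phrase appear in the title and in any alt text, and looks for a rights statement. Everything in it is an assumption made for illustration: the function name, the sample page, the list of rights-related keywords, and the crude substring matching. It is not a measure any search engine is known to use.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Gathers the title, body text, and image alt text of one page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.body_text = []
        self.alts = []          # one entry per <img>; "" when alt is missing
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            self.alts.append((dict(attrs).get("alt") or "").strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self.body_text.append(data)


# Assumed keywords that hint at a rights or usage statement.
RIGHTS_WORDS = ("copyright", "rights", "license", "usage")


def audit_page(html, search_phrase):
    """Return a small report comparing a page against a likely search phrase."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    words = search_phrase.lower().split()
    title = parser.title.lower()
    body = " ".join(parser.body_text).lower()
    return {
        "image_count": len(parser.alts),
        "images_missing_alt": sum(1 for alt in parser.alts if not alt),
        "phrase_words_in_title": all(w in title for w in words),
        "phrase_words_in_alt": any(
            all(w in alt.lower() for w in words) for alt in parser.alts
        ),
        "mentions_rights": any(k in body for k in RIGHTS_WORDS),
    }


# Made-up sample page for illustration only.
PAGE = """
<html><head><title>Studio Portfolio</title></head>
<body>
<img src="img001.jpg">
<img src="cat.jpg" alt="Playful cat chasing yarn">
<p>All images copyright the photographer. Contact for usage terms.</p>
</body></html>
"""

report = audit_page(PAGE, "playful cat")
for key, value in report.items():
    print(key, value)
```

On this sample the report shows two images on one page, one of them with no alt text, and a title built around branding rather than the subject, which are exactly the points raised above.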
As a side note, I’d add that the content in Flash-based Web sites is a problem for search engines and thus for content creators. Since these pages are delivered to the viewer as compiled movie files rather than as HTML text, they are not indexed by search engines. This makes the content essentially invisible to potential customers on the Internet.
I pointed this out to a designer the other day and he kept asking about fine print that would get around this unhappy situation. His site is almost completely generated by Flash and Java. Except for the title tag, there’s little or no text on the page for the search engine agents, or spiders, to record.
It’s something for content creators to keep in mind, especially when it comes to their own branding. As the saying of Joseph Solomon Delmedigo goes: “If you judge by beards and girth, goats are the wisest creatures on earth.” On the Internet, designers need to pay attention both to their images and to the way those images can be discovered.
This article was last modified on January 3, 2023
This article was first published on June 5, 2003
