Search Engines Toss All the Books on the Floor
Barry Hollander, University of Georgia
barry@arches.uga.edu
http://www.grady.uga.edu/faculty/~bhollander
The Internet is the world's biggest library, someone once wrote, but with all the books on the floor. It was true a decade ago when I first taught working journalists how to use Archie and Veronica, gopher and ftp. It is just as true today when the ease and availability of search tools are offset by the sheer volume of information available on the Internet.
Today, everyone is an expert. We all have a favorite search engine and will defend it with the kind of pseudo-religious zeal seen when people debate the superiority of Netscape versus Internet Explorer. I come not to evangelize for a specific search engine (though Google is clearly the best), but rather to preach about why it is best to avoid them and use a more directed search strategy.
Simply put, search engines toss all the books on the floor: thousands and hundreds of thousands of them. Then they list them, usually 20 at a time, with little attention paid to whether the site is the Smithsonian or Bubba's Web Page. A quick example: I'm sitting with my daughter and she wants a picture of a horse. I do a quick search for horse pictures on a popular search engine. Boom, a long list appears. I click on a likely looking site and, well, this has probably happened to you too.
Horses. And girls. Friendly horses with very friendly girls.
Search engines are much smarter than they were. Google, for example, organizes its list based on a special algorithm that examines which links are the most popular. Those you get first. Even so, it is easy to get overwhelmed by the information presented. If you absolutely positively must use a search engine, there are ways to make your search faster and more efficient.
First, for basic searches I use Google, but AltaVista has some features that make it a better choice if you take advantage of its advanced search. Here, Boolean terms can be used.
For example, if I search for Barry Hollander without quotation marks, AltaVista pumps back 4,333 hits. I'm not that popular, but AltaVista is looking for web pages with either Barry or Hollander in them. Use quotation marks around my name and it searches only for those pages where the words are next to each other (a more manageable 47 hits). Want only information about me at the University of Georgia? On the advanced page, type: "Barry Hollander" and "University of Georgia" (quotation marks included). The result? Thirty-five hits.
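The difference between those three searches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four page texts and the function names are invented here, and matching is done by simple lowercase comparison. Real engines such as AltaVista index and rank pages in far more elaborate ways.

```python
# Hypothetical page texts, invented for illustration only.
PAGES = [
    "Barry Hollander teaches journalism at the University of Georgia",
    "Barry Bonds hit another home run last night",
    "Hollander is another word for someone from Holland",
    "Barry Hollander reviews search engines",
]

def any_word(pages, words):
    """Unquoted search: keep pages containing ANY of the words."""
    wanted = [w.lower() for w in words]
    return [p for p in pages if any(w in p.lower().split() for w in wanted)]

def phrase(pages, text):
    """Quoted search: keep pages containing the exact phrase."""
    return [p for p in pages if text.lower() in p.lower()]

def all_phrases(pages, *phrases):
    """Boolean AND: keep pages containing EVERY quoted phrase."""
    return [p for p in pages if all(ph.lower() in p.lower() for ph in phrases)]

loose = any_word(PAGES, ["Barry", "Hollander"])
exact = phrase(PAGES, "Barry Hollander")
narrow = all_phrases(PAGES, "Barry Hollander", "University of Georgia")

# Each step shrinks the list: every page, then two, then one.
print(len(loose), len(exact), len(narrow))
```

Each refinement trims the list, which is the whole point of reaching for quotation marks and Boolean terms before wading through the hits.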
There are other tricks available here. For example, if I want to know who is linking to my web page, I can type: link:www.grady.uga.edu/faculty/~bhollander. This shows me who is linking to my page (or to any other page out there, which is useful from a reporting standpoint if you want to know who is linking to some organization or page in the news). Want just a photo? Some search engines give you that option. In AltaVista you can also type: "image:name" where name is whatever you want a picture of.
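What a link: search does can be pictured as a reverse lookup: given a record of which pages link where, find the pages that point at yours. The Python sketch below assumes a tiny, made-up link table (the example.com, example.org and example.net addresses are placeholders), and it is not how AltaVista itself stores or searches links.

```python
# Hypothetical link table: each page maps to the addresses it links to.
LINKS = {
    "news.example.com/story": ["www.grady.uga.edu/faculty/~bhollander"],
    "blog.example.org/tools": [
        "www.grady.uga.edu/faculty/~bhollander",
        "www.google.com",
    ],
    "shop.example.net": ["www.google.com"],
}

def linking_to(links, target):
    """Return, in alphabetical order, the pages that link to the target."""
    return sorted(page for page, outgoing in links.items() if target in outgoing)

inbound = linking_to(LINKS, "www.grady.uga.edu/faculty/~bhollander")
print(inbound)
```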
There are meta search engines as well, vehicles that search multiple search engines and reward you with even more hits than you'd ever want. I do not recommend them.
One aspect of Google that is neat and underutilized is its search of USENET groups. If you don't know what that means, then I'd suggest you look it up, because USENET predates the Web and provides access to a wide range of information, some of it legitimate, some of it bizarre. Like everything else on the Internet, take what you get with a grain of salt.
Some people do all of this work for you. Take advantage of it. Steal shamelessly. My own site, listed above, has a variety of digital journalism links, and I freely admit that others do a much better job of packaging what is out there. My favorite from a journalism standpoint is perhaps Powerreporting.com. Develop your own list, not merely by bookmarking or using favorites on your browser, but by creating your own page there on your hard drive. It's not hard, the HTML, and it keeps things cleaner. Create your own page and update it constantly with the sites that have worked for you and that others point to, especially those with domain names you can trust, such as gov or edu.
Want to find people? There are tons of sites out there, such as Whitepages.com, that even offer a way to do a reverse search by phone number. Neat. Or for the really paranoid, there is the Stalker's Home Page. Keep in mind, some of what is offered here actually costs you to use.
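For anyone who would rather not type the HTML by hand, a short Python script can write such a page. This is a minimal sketch, not a recommended tool: the file name mylinks.html, the page title, the function name and the exact web addresses are assumptions made for illustration, and the starter list simply reuses sites named in this article.

```python
from html import escape
from pathlib import Path

# Starter list (name, address); the addresses are assumed for illustration.
LINKS = [
    ("Power Reporting", "http://powerreporting.com/"),
    ("Barry Hollander's links", "http://www.grady.uga.edu/faculty/~bhollander"),
    ("Whitepages", "http://whitepages.com/"),
]

def build_page(title, links):
    """Return a bare-bones HTML page listing each link as a bullet."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(name)}</a></li>'
        for name, url in links
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        "<ul>\n"
        f"{items}\n"
        "</ul>\n"
        "</body>\n"
        "</html>\n"
    )

page = build_page("My Reporting Links", LINKS)

if __name__ == "__main__":
    # Save the page to the current folder; open it in any browser.
    Path("mylinks.html").write_text(page, encoding="utf-8")
```

Add a line to the list whenever a site proves its worth, run the script again, and the page stays current.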
A successful search is fast and uncluttered with random, questionable hits. Obviously when looking for documents or data, it is best to go to the source, usually a university, corporation, or government agency. In AltaVista, you can even narrow a search by typing domain: followed by a suffix such as gov along with your search term, which limits your search to just those sites.
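The same kind of filtering can be applied to any list of addresses you have already collected. The Python sketch below keeps only those ending in a trusted suffix; the three sample addresses and the function names are assumptions for illustration, and the script only mimics the idea of a domain filter rather than AltaVista's own mechanism.

```python
from urllib.parse import urlparse

def top_level_domain(url):
    """Return the last part of the host name, such as 'gov' or 'edu'."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()

def only_domains(urls, allowed=("gov", "edu")):
    """Keep only the addresses whose suffix is in the allowed list."""
    return [u for u in urls if top_level_domain(u) in allowed]

# Sample hits, made up for illustration.
HITS = [
    "http://www.census.gov/",
    "http://www.grady.uga.edu/faculty/~bhollander",
    "http://www.example.com/bubba",
]

trusted = only_domains(HITS)
print(trusted)
```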
A third use of the Internet, one that receives less attention, is the building and maintenance of communities. Internet sites closely resemble magazines in their attention to deep, vertical interests, and new sites spring up daily that focus on narrow niches, from hobbies to gaming. These communities are a threat to magazines, creating virtual places where people share the kind of information often found in magazines themselves.
On some of these sites the community-building happens in chat rooms, while on others it happens in discussion forums. Online auctions have now become as much communities as places for e-commerce, with people spending as much time talking to each other as buying and selling goods. And long before there was a World Wide Web, there existed a whole alphabet soup of Internet interactivity: MOOs, MUDs, IRC, and similar virtual worlds where people meet and interact. Virtual culture, the way we act and the way we study how others act on the Net, is something yet to be fully developed or understood. When you visit, learn the rules before you act.
For most of us, the Internet is a communication channel or a place to find something out. Sometimes, wandering the library can be enjoyable, a journey of discovery, but often you want to go right to the book you're looking for. Pre-searching can save a great deal of time. And like traditional libraries, the Internet is becoming more organized, but it never hurts to know your way around before you start your search.