Searching the Web: Tools
What are you really searching?
"When you do what is called 'searching the Web' you are NOT searching it directly. It is not possible to search the WWW directly. The Web is the totality of the many web pages which reside on computers (called 'servers') all over the world. Your computer cannot find or go to them all directly. What you are able to do through your computer is access one of several intermediate databases and/or web-pages which contains selections of other web pages organized to allow you to find other web pages and sometimes other databases. You search these intermediate 'search tools,' and they can provide you with hypertext links (URLs) to other pages. You click on these links, and retrieve documents, images, sound, and more from individual servers around the world." (quoted from University of California, Berkeley Internet tutorials)
If you do not have much experience using the Web, see the UC Berkeley Search the Internet Tutorial.
What are search engines?
Search engines are searchable databases of Web pages. They are compiled by programs called spiders, which "crawl the Web" and log the words on each page. (Because these programs are automated, they are also called "robots.") When you type keywords into a search engine's search box and hit Enter, the search engine scans its database, not the live Web, and returns a list of links to Web pages containing the keywords.
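As a rough illustration of the idea above, the sketch below logs the words on a few pages and then looks up keywords in that log. It is a minimal toy, not how any particular search engine works: the page addresses, the sample text, and the function names build_index and search are all made up for this example, and matching is simplified to "the page contains every keyword."

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for what a spider might fetch.
PAGES = {
    "http://example.org/cats": "Cats are small domestic animals",
    "http://example.org/dogs": "Dogs are loyal domestic animals",
    "http://example.org/fish": "Fish live in water",
}

def build_index(pages):
    """Log the words on each page: map each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs of pages that contain every keyword in the query."""
    keywords = re.findall(r"[a-z0-9]+", query.lower())
    if not keywords:
        return []
    matches = set(index.get(keywords[0], set()))
    for word in keywords[1:]:
        # Keep only the pages that also contain this keyword.
        matches &= index.get(word, set())
    return sorted(matches)

INDEX = build_index(PAGES)
print(search(INDEX, "domestic animals"))
```

Note that the search never touches the pages themselves. It only consults the index built earlier, which is why a search engine can return links to pages that have since changed or disappeared.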
When should you use a search engine?
- When you have a complex question with multiple keywords
- When you want to use multiple limits (e.g., domain, language, age of document)
- When you are looking for specific information
Are all search engines the same?
NO! Search engines differ in their use of Boolean operators (and, or, not), truncation symbols (*, ?, %), fields, limits, sorting, display and more. Check out the following Webpages for detailed comparisons of some search engines:
What are some examples of search engines?
What is a meta-search engine?
A meta-search engine sends your search to several different search engines simultaneously; you get results back from all the search engines contacted. Popular meta-search engines include Vivisimo and MetaCrawler.
This Detailed Features Table can help you decide which meta-search tool to use.
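The way a meta-search engine forwards one query to several search engines and combines what comes back can be sketched as follows. This is only an illustration under stated assumptions: engine_a and engine_b are hypothetical stand-ins that return fixed lists instead of contacting a real service, and the merge rule (drop duplicates, keep first-seen order) is one simple choice among many.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search engines: each takes a query
# and returns a ranked list of URLs. No network calls are made.
def engine_a(query):
    return ["http://example.org/1", "http://example.org/2"]

def engine_b(query):
    return ["http://example.org/2", "http://example.org/3"]

def meta_search(query, engines):
    """Send the query to every engine at once and merge the result lists,
    dropping duplicate URLs while keeping first-seen order."""
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map returns the result lists in the same order as `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(meta_search("library catalogs", [engine_a, engine_b]))
```

Because each engine keeps its own database, the same page often appears in more than one result list, which is why the merge step removes duplicates.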
What are subject directories?
Subject directories are collections of sites selected and organized by humans. They are often called subject "trees" because they are organized into hierarchical categories, starting with a few main categories and "branching out" into subcategories.
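The "tree" structure can be pictured as nested categories with lists of sites at the ends of the branches. The sketch below is a made-up miniature directory; the category names, the addresses, and the helper names browse and all_sites are assumptions for illustration only.

```python
# Hypothetical directory: categories are dicts, selected sites are lists.
DIRECTORY = {
    "Science": {
        "Biology": ["http://example.org/bio-guide"],
        "Physics": {
            "Astronomy": ["http://example.org/stars"],
        },
    },
    "Arts": {
        "Music": ["http://example.org/music-history"],
    },
}

def browse(directory, path):
    """Follow a list of category names down the tree and return what is there."""
    node = directory
    for category in path:
        node = node[category]
    return node

def all_sites(node):
    """Collect every site listed at or below a category."""
    if isinstance(node, list):
        return list(node)
    sites = []
    for child in node.values():
        sites.extend(all_sites(child))
    return sites

print(browse(DIRECTORY, ["Science", "Physics", "Astronomy"]))
print(all_sites(DIRECTORY["Science"]))
```

Browsing a directory means walking from a main category down through subcategories, rather than typing keywords.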
When should you use a subject directory?
- When you want general information on a popular or scholarly topic
- When you don't have a precise idea of what you need to know about a topic
- When you need a starting point on a topic with which you are unfamiliar
Are all subject directories the same?
NO! They differ in their size, who selects the sites (subject experts, users), sorting, reviews, etc. Check out Search Engine Showdown's "Comparing Internet Subject Directories" and the "Detailed Features Table of Recommended Subject Directories" for detailed comparisons. Note that the larger subject directories (e.g., Yahoo!) supplement their link collections with results from a search engine.
What are some examples of subject directories?
The Invisible Web
A huge portion of the Web is "invisible" to search engines; thousands of Web sites cannot be indexed by the "spider" programs that compile search engine indexes. Password-protected Web sites, documents behind firewalls, and specialized searchable databases make up this "invisible" or "deep" Web.
One analogy is:
When spider programs come across a specialized database or password-protected Web site, "...it's as if they've run smack into the entrance of a massive library with securely bolted doors. Spiders can record the library's address, but can tell you nothing about the books, magazines or other documents it contains." (LibrarySpot.com)
The information found through the library's public access catalog (GIL) is part of the Invisible Web. The information available through any paid service (such as most of the GALILEO databases) may also be regarded as part of the Invisible Web.
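The "bolted doors" idea can be sketched with a toy spider. Everything here is hypothetical: the SITE map, the "public" flag, and the crawl function are invented for illustration, and real spiders meet many other barriers. The point is only that a spider can note the address of a locked page but cannot log its contents or follow its links.

```python
# Hypothetical site map: each page has text, outgoing links, and an access flag.
SITE = {
    "http://example.org/": {
        "text": "Welcome",
        "links": ["http://example.org/about", "http://example.org/catalog"],
        "public": True,
    },
    "http://example.org/about": {"text": "About us", "links": [], "public": True},
    "http://example.org/catalog": {
        "text": "Book records",
        "links": ["http://example.org/catalog/item1"],
        "public": False,  # password-protected: the "bolted door"
    },
    "http://example.org/catalog/item1": {"text": "A rare book", "links": [], "public": True},
}

def crawl(site, start):
    """Return (indexed, address_only): the text of pages the spider could read,
    and the addresses of pages it could reach but not read."""
    indexed, address_only = {}, []
    queue, seen = [start], {start}
    while queue:
        url = queue.pop(0)
        page = site[url]
        if not page["public"]:
            # The spider records the address but learns nothing about the contents.
            address_only.append(url)
            continue
        indexed[url] = page["text"]
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed, address_only

indexed, address_only = crawl(SITE, "http://example.org/")
print(sorted(indexed))
print(address_only)
```

In this toy, the item page behind the catalog never enters the index, because the only link to it sits on a page the spider cannot read.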
More information about the Invisible Web is available at:
- Invisible Web: an About.com article
- The Invisible Web: Database contents rarely found in Search Engines: from the UC-Berkeley Tutorial