How Internet 'Search' Works
With billions of web pages online, you could spend a lifetime surfing the Web, following links from one page to another. Amusing perhaps, but not very efficient if you are after some specific information. One of the biggest complaints we hear concerns the difficulty of finding targeted information. Where do you start? Searching the Internet requires part skill, part luck and a little bit of art. Fortunately, a number of free online resources help with the hunt.
You've probably heard of search engines such as Yahoo!, Google, and Ask Jeeves. There are literally dozens of these tools to help you locate what you're looking for. The trick is understanding how they work, so you can use the right tool for the job.
Search engines break down into two categories--directories and indexes. Directories, such as Yahoo!, are good at identifying general information. Like a card catalog in a library, they classify websites into similar categories, such as accounting firms, English universities and natural history museums. The results of your search will be a list of websites related to your search term. For instance, if you are looking for the Louvre museum website, use a directory.
But what if you want specific information, such as biographical information about Leonardo da Vinci? Web indexes are the way to go, because they search all the contents of a website. Indexes use software programs called spiders and robots that scour the Internet, analyzing millions of web pages and newsgroup postings and indexing all of the words.
Indexes like MSN Search and Google find individual pages of a website that match your search criteria, even if the site itself has nothing to do with what you are looking for. You can often find unexpected gems of information this way, but be prepared to wade through a lot of irrelevant information too.
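To make the idea of "indexing all of the words" concrete, here is a minimal sketch in Python. It is not how any particular search engine is built; the page addresses and text are made up for illustration, and the function names `tokenize` and `build_index` are our own. It simply records, for every word, which pages contain it.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for what a spider might have fetched.
PAGES = {
    "louvre.example/home": "Welcome to the Louvre museum in Paris",
    "art.example/leonardo": "Leonardo da Vinci painted the Mona Lisa",
    "bio.example/da-vinci": "A short biography of Leonardo da Vinci",
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map every word to the set of page addresses whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

index = build_index(PAGES)
# Looking up a word is now a single dictionary access.
print(sorted(index["leonardo"]))
```

Because the lookup is by word rather than by site category, a page turns up whenever it mentions the word, whatever the rest of the site is about.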
Search results may be ranked by relevancy, measured by the number of times your search term appears in a document, or by how closely the document appears to match a concept you have entered. Because an index covers the full text of each page rather than a short description of the site, it is a much more thorough way to locate what you want than a directory.
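Ranking by the number of times a term appears can be sketched in a few lines of Python. This is only the simple counting idea described above, not the formula any real engine uses; the documents and the names `term_count` and `rank` are invented for the example.

```python
import re

# Hypothetical documents used only for this illustration.
DOCS = {
    "page_a": "Cable cars climb the hills. Cable cars run daily.",
    "page_b": "A cable connects the television to the wall.",
    "page_c": "San Francisco has many hills.",
}

def term_count(text, term):
    """Count how many times a single word appears in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return words.count(term.lower())

def rank(docs, term):
    """Return names of matching documents, most occurrences first."""
    scored = [(term_count(text, term), name) for name, text in docs.items()]
    matches = [(count, name) for count, name in scored if count > 0]
    # Sort by descending count, then by name so ties come out in a stable order.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for count, name in matches]

print(rank(DOCS, "cable"))  # ['page_a', 'page_b']
```

Notice that page_b ranks even though it has nothing to do with cable cars, which is exactly the kind of irrelevant match you should expect to wade through.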
Let's perform an online search using three popular search engines--Yahoo!, Google, and Ask Jeeves--so you can see how they work and how you can develop an efficient search strategy.
Here's the challenge: You are planning a trip to San Francisco and you've always wanted to ride a cable car. Do they operate in January? How can you find out?
First we'll try Yahoo! One trick when searching is to narrow your focus. Entering "San Francisco" in the search box results in over 38 million sites related to the City by the Bay! Entering "cable cars" results in 141,000 sites, many of which have nothing to do with San Francisco. By combining the terms ("San Francisco cable cars"), the search returns 9,000 sites, along with a brief description of each one. Now you have to visit each site to see if there's any information about cable car schedules. Fortunately, the first three sites listed all contain information pertinent to our search.
Now let's try using Google, an index-based search engine. Once again, enter "San Francisco cable cars" in the search box. The Google search results in over 1.4 million documents that match the search terms. Life is too short to comb through all these. The reason for this enormous list is that Google turns up every document that contains the words "San," "Francisco," "cable," and "cars." To search for documents that contain just this phrase, use quotation marks around the terms ("San Francisco cable cars"). Doing this results in about 9,000 documents. Fortunately, Google smartly ranks sites in order of relevancy and popularity, so the first few are more than likely to have information about schedules.
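The difference between matching every word and matching the exact phrase is easy to see in a small Python sketch. The three documents are invented, and `match_all_words` and `match_phrase` are our own illustrative names, not part of any search engine.

```python
import re

# Hypothetical documents used only for this illustration.
DOCS = {
    "doc1": "Ride the San Francisco cable cars in January.",
    "doc2": "Cable television in San Jose; used cars in South San Francisco.",
    "doc3": "San Francisco weather in January.",
}

def words(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_all_words(docs, query):
    """Documents containing every query word, in any order or position."""
    wanted = set(words(query))
    return [name for name, text in docs.items() if wanted <= set(words(text))]

def match_phrase(docs, query):
    """Documents containing the query words side by side, in order."""
    phrase = words(query)
    n = len(phrase)
    hits = []
    for name, text in docs.items():
        tokens = words(text)
        if any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)):
            hits.append(name)
    return hits

print(match_all_words(DOCS, "San Francisco cable cars"))  # ['doc1', 'doc2']
print(match_phrase(DOCS, "San Francisco cable cars"))     # ['doc1']
```

The second document contains all four words but is about something else entirely; only the phrase search screens it out.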
Ask Jeeves uses a technology called natural language query, a fancy way of saying that you can ask your question in plain English. By typing a question like "What is the San Francisco cable car schedule?" you get a list of related pages. Once again, you will have to go to the site, but in this case, it's a no-brainer as one of the choices is "San Francisco Muni Powell-Mason/Hyde Cable Car: Complete Schedule."
Bear in mind that websites tend to change often. These changes are not always reflected in the search engine database, particularly for directories. Typically, websites are registered with search engines when they first go online. After that, changes are generally not reported. To find the most recent information, your best bets are search engines that use Web-indexing robots, software that constantly searches the Internet, recording additions and changes.
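How might a robot notice additions and changes? One plausible approach, shown here as a rough Python sketch, is to keep a fingerprint of each page from the previous visit and compare it with the page as it looks now. This is an assumption made for illustration, not a description of how any specific engine does it; the addresses and the names `fingerprint` and `detect_changes` are hypothetical.

```python
import hashlib

def fingerprint(text):
    """Return a short, fixed-length summary of a page's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def detect_changes(previous, current):
    """Compare stored fingerprints with freshly fetched page text.

    previous: {address: fingerprint from the last visit}
    current:  {address: page text fetched on this visit}
    Returns (added, changed) lists of addresses.
    """
    added, changed = [], []
    for url, text in current.items():
        if url not in previous:
            added.append(url)
        elif previous[url] != fingerprint(text):
            changed.append(url)
    return added, changed

# Hypothetical data: what the robot saw last time versus today.
last_visit = {
    "muni.example/schedule": fingerprint("Cable cars run 6am to midnight."),
    "muni.example/fares": fingerprint("Fares are posted at each stop."),
}
today = {
    "muni.example/schedule": "Cable cars run 7am to 11pm in January.",
    "muni.example/fares": "Fares are posted at each stop.",
    "muni.example/alerts": "Powell line closed for maintenance.",
}

added, changed = detect_changes(last_visit, today)
print(added)    # ['muni.example/alerts']
print(changed)  # ['muni.example/schedule']
```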
Regardless of which search engine you use, it really pays to find out the particulars of how it works. Take the time to read the search tips on the respective sites. For instance, how does the engine handle searches that include more than one word? Most engines, but not all, return results that include any of the words. Because there is so much information online, you will usually want to limit the scope of your searches. How do you do this? This is a good point to digress a bit to talk about Boolean operators.
The English mathematician George Boole developed an algebra of logic that has become the basis for computer database searches. Boolean logic uses words called operators to combine conditions, each of which is either true or false. The most common operators are AND, OR and NOT. These three little words can be enormously helpful when doing online searches. A few examples show why.
|Search|Results|
|---|---|
|cable AND car|Documents with both words|
|cable OR car|The greatest number of matches; documents with either word|
|cable NOT car|Documents about cable, but not about cable cars; a good way to limit the search|
The exact syntax each engine uses varies, so familiarize yourself with each one's unique properties.
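The three operators map neatly onto set operations, which a short Python sketch can show. The documents are invented and `containing` is our own helper name; real engines use their own syntax and machinery.

```python
# Hypothetical documents used only for this illustration.
DOCS = {
    "doc1": "cable car schedule",
    "doc2": "cable television listings",
    "doc3": "used car prices",
}

def containing(word):
    """Return the set of document names whose text includes the word."""
    return {name for name, text in DOCS.items() if word in text.split()}

cable = containing("cable")
car = containing("car")

both = cable & car        # cable AND car: documents with both words
either = cable | car      # cable OR car: documents with either word
cable_only = cable - car  # cable NOT car: cable, but no mention of car

print(sorted(both))        # ['doc1']
print(sorted(either))      # ['doc1', 'doc2', 'doc3']
print(sorted(cable_only))  # ['doc2']
```

OR gives the longest list and AND and NOT both shorten it, which is why they are the ones to reach for when a search returns too much.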
Rather than search each directory or index individually, you can submit your query simultaneously to multiple search engines by doing a metasearch.
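As a rough picture of what a metasearch does, the Python sketch below sends one query to two stand-in "engines" and merges their lists, dropping duplicates. Everything here is assumed for illustration: `engine_a` and `engine_b` return canned results rather than calling real services, and the scoring rule (pages listed near the top by more engines come first) is just one plausible way to merge, not the method of any actual metasearch site.

```python
def engine_a(query):
    """Stand-in for one search engine; returns canned, ranked results."""
    return [
        "sfmuni.example/cable-cars",
        "sfguide.example/rides",
        "cablecarmuseum.example",
    ]

def engine_b(query):
    """Stand-in for a second search engine with its own ranking."""
    return [
        "sfguide.example/rides",
        "sfmuni.example/cable-cars",
        "travel.example/sf",
    ]

def metasearch(query, engines):
    """Send the query to every engine and merge the ranked lists."""
    scores = {}
    for engine in engines:
        for position, url in enumerate(engine(query), start=1):
            # A page earns more credit the higher an engine lists it.
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # Highest score first; ties are broken alphabetically by address.
    return sorted(scores, key=lambda url: (-scores[url], url))

results = metasearch("San Francisco cable cars", [engine_a, engine_b])
for url in results:
    print(url)
```

Each address appears once in the merged list, even when both engines returned it.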
Whether you want to search for information about cable cars, investments or any other subject, here are our five favorites:
Remember, not all search tools are alike. Each uses a slightly different methodology, so your results will vary. You may not always find what you're looking for on the first try.
A final word of advice. The Internet may not be the best place to find certain information. While it abounds with computer-related subjects, it is not as good for historical information. The telephone and a sharp reference librarian may still be your best bet.