CIS261-Internet


Chapter Assignment:
Searching the Web


I.  Overview

By the end of this assignment you should know how to

  1. Create search expressions
  2. Conduct basic and advanced searches
  3. Use web search engines
  4. Conduct searches on meta search engines
  5. Use a natural language query interface to conduct a search
  6. Find pages using directory links at a search engine's home page
  7. Create search expressions using boolean operators
  8. Create a search expression using wildcard characters

To get credit for this assignment, you must complete all hands-on activities within the assignment.  As you read through the material, you will find sections labeled "Hands On"; you must complete each of those activities.  Each Hands On section is worth points.


II.  Search Engines

Computers that conduct searches on the Internet are called search engines.  To use a search engine, you need to go to its website and fill in a form indicating what you want to look for.  Once you submit the search, the search engine will retrieve a list of pages that you can view.

Some of the more popular search engines include:

Each search engine is sponsored by a different company and consists of a database where Internet information is stored and indexed.  Each one stores slightly different information and indexes its pages differently.  For example, one search engine may store web pages only and another may store web pages, newsgroups, stock reports, etc.

Internet information gets into a database in a variety of ways.  Sometimes page authors submit their pages along with appropriate keywords that can be used to display them.  Other pages are included because the search service runs software called spiders, bots or crawlers that follows links from page to page and site to site, adding each page it finds to the database.  As a result, pages that were never submitted to the search engine can still display if a person enters the right keywords.
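
The link-following behavior of a spider can be sketched in a few lines of Python.  This is only an illustration, not how any particular search engine works:  the page names and the LINKS table are made up, and a real crawler would download pages from the web instead of reading a table.

```python
from collections import deque

# Hypothetical link table: each page maps to the pages it links to.
LINKS = {
    "home.html": ["news.html", "about.html"],
    "news.html": ["story.html", "home.html"],
    "about.html": [],
    "story.html": ["about.html"],
    "orphan.html": ["home.html"],  # no page links here, so a crawl never finds it
}

def crawl(start, links):
    """Follow links breadth-first from start; return pages in the order found."""
    found = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found

print(crawl("home.html", LINKS))
# ['home.html', 'news.html', 'about.html', 'story.html']
```

Notice that orphan.html is never found because no other page links to it.  A page like that would have to be submitted by its author.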

Pages cannot appear in search results until they are included in the search engine's index.  Each search engine has its own way of indexing web pages.  Engines like Excite index each page individually (basically ignoring home pages).  Other search engines index based upon the first few paragraphs contained within a page or keywords submitted by the author.  The search engines also vary in the way they display page summaries.  Some display the first few paragraphs of text as the summary.  Others let the author submit summary text.  Still others, like go.com, let authors place codes into their pages (which the reader cannot see) that are used for the summary.

Since different search engines contain different pages and they index pages differently,  you should use more than one search engine when conducting research on the Internet.   If you don't find many pages at one site, go to another and try the search again.


III.  Conducting a Search

A.  Go to the Search Engine's Home Page

In the address bar, enter the web address of the search engine you want to use.

Every search engine has a form on the main page.  To conduct a search you must type a keyword (or keywords) into the search engine's form and then click the search button.

Some helpful hints for entering keywords include:

1.  +keyword 

A plus sign in front of a keyword indicates that the word must be included in the search results (example:  Bill +Cosby specifies that the word Cosby must appear in all search results).

2.  -keyword 

A minus sign in front of a keyword indicates that the word should NOT appear in search results (example:  aerobics -step specifies that the word step should NOT appear in search results).

3.  "keyword1  keyword2..." 

Including a phrase in double quotes requires that search engines find the phrase as entered in all search results (example:   "World War II" would only retrieve pages containing the text exactly as it is keyed in).

4.  KEYWORD 

If you enter a keyword in all capitals, the search engine will only look for capitalized text.

5.  keyword 

If you enter a keyword in all lowercase letters, the search engine will look for both uppercase and lowercase text.

6. Keyword 

If you enter a capital letter within a keyword, the search engine will only match pages that have the capital letter in the same location within the word.
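
The plus, minus and double-quote hints above can be illustrated with a small Python sketch.  The matches function is hypothetical and makes simplifying assumptions:  it ignores the capitalization rules (hints 4-6), and it treats keywords without a plus or minus sign as optional, so they do not filter anything.  Real search engines each apply their own rules.

```python
import re

def matches(query, text):
    """Return True if text satisfies the +word, -word and "phrase" parts of query."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    # Each match is either a quoted phrase or a single term.
    for phrase, term in re.findall(r'"([^"]+)"|(\S+)', query.lower()):
        if phrase:
            if phrase not in lowered:        # phrase must appear exactly as keyed in
                return False
        elif term.startswith("+"):
            if term[1:] not in words:        # +word: the word must appear
                return False
        elif term.startswith("-"):
            if term[1:] in words:            # -word: the word must NOT appear
                return False
    return True

print(matches("aerobics -step", "Water aerobics classes"))          # True
print(matches("aerobics -step", "Step aerobics for beginners"))     # False
print(matches('"World War II"', "A history of World War II"))       # True
print(matches("Bill +Cosby", "Bill Gates biography"))               # False
```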


Hands On #1

1.  Conduct the Belize annual rainfall search as directed on page 4.08 (omit step #1 and go to: www.altavista.com ).   Proceed with step #2 in the instructions.  When you find a page that displays the annual rainfall, save the page to your hard drive by selecting the File menu, Save As command.  Make sure you know the name of the page and where it is being saved. 

Online students:  To see a video demonstration, download: altavista_search.exe 

2.  Conduct the same search using HotBot as directed on page 4.09 (omit step #1 and go to: www.hotbot.com ).  Proceed with step #2 in the instructions.    When you find a page showing the average annual rainfall, save the page to your hard drive (make sure the name of the page is different from the name used in #1).  

Online students:  To see a video demonstration, download: hotbot_search.exe 

3.  Conduct the same search using Google - see page 4.11 (omit step #1 and go to: www.google.com ).   Proceed with step #2 in the instructions.     Save the results page to your hard drive.  

Online students:  To see a video demonstration, download: google_search.exe 

4.  Go into your email program and create a mail message.  The subject should be Belize Rainfall.  Attach the files you saved in #1, #2 and #3 to the message and send it to me.  

OR

Upload each file to Blackboard's dropbox

5.  Send me an email message and tell me which of the 3 search engines returned the best results and why.


IV.  More on searches...

1.  Page Ranking:

Many search engines use a "page ranking" system to determine which links to display first in the results page.  They look at the number of other pages that link to a page:  the more pages that link to it, the higher its page ranking.
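
The link-counting idea can be sketched in Python as follows.  This is a simplified illustration only:  the page names are made up, and actual ranking systems weigh many more factors than a plain count of inbound links.

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Order pages so that the page with the most inbound links comes first."""
    inbound = Counter()
    for source, targets in links.items():
        for target in set(targets):      # count each linking page only once
            if target != source:         # a page linking to itself does not count
                inbound[target] += 1
    pages = set(links) | set(inbound)
    # Sort by link count (highest first), then by name to break ties.
    return sorted(pages, key=lambda page: (-inbound[page], page))

# Hypothetical link table: each page maps to the pages it links to.
links = {
    "a.html": ["c.html"],
    "b.html": ["c.html", "a.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html"],
}
print(rank_by_inbound_links(links))
# ['c.html', 'a.html', 'b.html', 'd.html']
```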

2.  Natural Language Queries

You can enter the query in the form of a sentence or question. 

Example:  "What is the average snowfall in Michigan?"

3.  Directory Links (web directories) 

Many search engines include directory links at their web sites.  The links categorize web pages included in their database.  To use the directory links, you must pick a topic that you are interested in.  Once you click on the link, you will see a listing of subtopics.  You will need to click on the subtopic you are interested in. This will display either more subtopics or actual web pages.  If you see more subtopics, you will need to continue selecting subtopics until you finally get to web pages.
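
One way to picture a web directory is as topics nested inside topics, with web pages at the bottom.  The Python sketch below uses a made-up DIRECTORY and a hypothetical browse function to show how each click either reveals more subtopics or finally reaches a list of pages.

```python
# Hypothetical directory: dictionaries are topics, lists are the web pages.
DIRECTORY = {
    "Business": {
        "Organizations": ["Example Trade Association", "Example Research Board"],
        "Finance": {"Banks": ["Example Bank"]},
    },
    "Recreation": {"Travel": ["Example Travel Guide"]},
}

def browse(directory, choices):
    """Follow the chosen topics; return subtopic names or, at the end, pages."""
    node = directory
    for choice in choices:
        node = node[choice]
    if isinstance(node, dict):
        return sorted(node)      # more subtopics to choose from
    return node                  # reached the actual web pages

print(browse(DIRECTORY, ["Business"]))
# ['Finance', 'Organizations']
print(browse(DIRECTORY, ["Business", "Finance", "Banks"]))
# ['Example Bank']
```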

Example:


Hands On #2

1.  Conduct a natural language query using the AskJeeves website - see page 4.12   (omit step #1 and go to:  www.askjeeves.com )   Proceed with step #2 in the instructions.  Save the results page to your hard drive.

Online Students:  To see a video demonstration, download: askjeeves_search.exe 

2.   Go into your email program and create a mail message.  The subject should be AskJeeves.  Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox).  

3.  Use Yahoo's directory links to find a page - see bottom of 4.15-4.17  (omit step #1 and go to:  www.yahoo.com)  Proceed with step #2 in the instructions.   When you find a page that provides background information on the Conference Board, save the page to your hard drive.

Online Students:  To see a video demonstration, download: yahoo_directories.exe 

4.  Go into your email program and create a mail message.  The subject should be Conference Board.  Attach the file you saved in #3 to the message and send it to me (or you can upload the file to Blackboard's dropbox).  


V.  Meta Search Engines

These search engines send your keywords to multiple web page search engines and retrieve all the results.  The results are sorted by web search engine.
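
The idea can be sketched in Python with two pretend search engines.  Everything here is an assumption made for illustration:  the engine names, the page titles and the make_engine helper are invented, and a real meta search engine would send the keywords over the Internet to actual search sites.

```python
def make_engine(pages):
    """Build a pretend search engine that searches its own list of page titles."""
    def search(query):
        word = query.lower()
        return [title for title in pages if word in title.lower()]
    return search

# Hypothetical engines, each with a different database of pages.
ENGINES = {
    "Engine A": make_engine(["Belize rainfall totals", "Belize travel guide"]),
    "Engine B": make_engine(["Annual rainfall in Belize", "Rainfall in Peru"]),
}

def meta_search(query, engines):
    """Send the same query to every engine; return results grouped by engine."""
    return {name: search(query) for name, search in engines.items()}

for name, hits in meta_search("rainfall", ENGINES).items():
    print(name, hits)
# Engine A ['Belize rainfall totals']
# Engine B ['Annual rainfall in Belize', 'Rainfall in Peru']
```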


Hands On #3

1.  Conduct a meta search using Dogpile - see pages 4.18-4.19   Omit step #1 and go to:  www.dogpile.com   Follow instructions beginning with step #2.  Save the results page to your hard drive.

Online Students:  To see a video demonstration, download: dogpile_search.exe 

2.  Go into your email program and create a mail message.  The subject should be DogPile.  Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox).  


VI.  Web Bibliographies/Resource lists, Clearinghouses and Virtual libraries

These web sites include reference material similar to a library.   (Many also include reviews of the reference material and annotated references).

Examples of clearinghouse web sites include:

Examples of library web sites include:

Examples of resource list sites include:


Hands On #4

1.  Conduct a search using the Argus Clearinghouse - see pages 4.20-4.22  (omit step #1 and go to: www.clearinghouse.net ).  Proceed with step #2 in the instructions. Save the National Biotechnology Information Facility Guide information page to your hard drive.   Visit the web page reviewed.  

Online Students:  To see a video demonstration, download: argus_search.exe 

2.  Go into your email program and create a mail message.  The subject should be Argus.  Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox).  In the body of the email message indicate whether you agreed with the rating Argus gave the site.  State why you agreed or disagreed with the rating.


VII.  Advanced Searches

If you get too many results (also called hits) from the main search window, you can conduct an advanced search.  Most search engines include this feature because it lets you customize your search so you get fewer hits.

The search forms vary from one search engine to another.  To get to an advanced search form, go to the search engine like you normally do, then find the "Advanced Search" link.  When you click on the link, you will see a different (more complicated) form.  Fill in the portions of the form that you want to use and submit the search.

Advanced searches often use the Boolean operators:  OR, AND, NOT  to customize searches.  

Examples:

1.  keyword1 AND keyword2 

The AND operator specifies that both words must appear in search results.   
(example:  step AND aerobics )

2.  keyword1 OR keyword2 

The OR operator specifies that the results should contain one word or the other (the page doesn't have to have both).

3.  keyword1 AND NOT keyword2 

The AND NOT operator specifies that a page should NOT be selected if the second keyword appears 
(example:  aerobics AND NOT step specifies that pages with step aerobics should not be displayed)

4.  keyword1 AND (keyword2 OR keyword3 OR keyword4) 

Using parentheses lets you build complex searches.  Anything within the parentheses is performed first. (example:   Hawaii AND (Maui OR Honolulu) will display only those pages that contain Hawaii & Maui or Hawaii & Honolulu)
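
The four Boolean examples above can be illustrated with Python sets.  The INDEX below is made up:  it maps each keyword to the numbers of the pages that contain it.  Under that assumption, AND is a set intersection (&), OR is a union (|), and AND NOT is a difference (-).

```python
# Hypothetical index: keyword -> numbers of the pages containing that keyword.
INDEX = {
    "hawaii": {1, 2, 3},
    "maui": {2, 4},
    "honolulu": {3},
    "aerobics": {5, 6},
    "step": {6},
}

def pages(word):
    """Return the set of page numbers that contain the keyword."""
    return INDEX.get(word.lower(), set())

# step AND aerobics: both words must appear.
print(pages("step") & pages("aerobics"))                          # {6}
# Maui OR Honolulu: either word may appear.
print(pages("Maui") | pages("Honolulu"))                          # {2, 3, 4}
# aerobics AND NOT step: drop pages containing the second word.
print(pages("aerobics") - pages("step"))                          # {5}
# Hawaii AND (Maui OR Honolulu): the parentheses are evaluated first.
print(pages("Hawaii") & (pages("Maui") | pages("Honolulu")))      # {2, 3}
```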

Search engines may also use wildcard characters ? or *  to replace text in the expression.  The * can be used to replace multiple characters.   The ? can be used to replace a single character.

Example:

1.  key* or keyw*rd 

An asterisk at the end of a keyword or in the middle of the word is a wildcard character.  It can be used when you aren't sure of the exact spelling or if you want to retrieve multiple spellings of a keyword (example:  exercis* would retrieve exercising, exercises, exercisers, etc.)
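
Python's standard fnmatch module happens to use the same two wildcard characters, so it can be used to try out wildcard expressions.  This is only a sketch:  the wildcard_matches function and the word lists are made up, and each search engine decides for itself which wildcards it supports.

```python
from fnmatch import fnmatchcase

def wildcard_matches(pattern, words):
    """Return the words that match a pattern containing * or ? wildcards."""
    return [word for word in words if fnmatchcase(word.lower(), pattern.lower())]

words = ["exercise", "exercising", "exercises", "exercisers", "exert"]
print(wildcard_matches("exercis*", words))
# ['exercise', 'exercising', 'exercises', 'exercisers']

# ? replaces exactly one character.
print(wildcard_matches("wom?n", ["woman", "women", "wombat"]))
# ['woman', 'women']

# * in the middle of a word covers multiple spellings.
print(wildcard_matches("col*r", ["color", "colour", "cold"]))
# ['color', 'colour']
```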


Hands On #5

1.  Conduct an advanced search at AltaVista - see pages 4.26-4.27 (omit step #1 and go to:  www.altavista.com). Proceed with step #2 in the instructions.   Save the results page to your hard drive.  Attach the page to an email message and send it to me.  The subject should say:  Advanced Altavista   (NOTE:  You can also upload the file to Blackboard's dropbox).  

Online Students:  To see a video demonstration, download: advanced_altavista.exe 

2.  Refine the AltaVista search by clicking the Change link at the top of the results page (right below the keyword search box).

You should see the advanced form display.  Enter the following boolean query and execute the search:

Attach the results page to an email message and send it to me.  The subject should say:  Refined Altavista   (NOTE:  You can also upload the file to Blackboard's dropbox).  

Online Students:  To see a video demonstration, download: refined_altavista.exe 

3.  Conduct an advanced search at HotBot - see pages 4.28-4.30 (omit step #1 and go to:  www.hotbot.com )  Proceed with step #2.   Save the results page to your hard drive.  Attach the page to an email message and send it to me.  The subject should say:  Advanced HotBot   (NOTE:  You can also upload the file to Blackboard's dropbox).  

Online Students:  To see a video demonstration, download: advanced_hotbot.exe 

4.  Conduct an advanced search at Excite - see page 4.31 (omit step #1 and go to www.excite.com ).  Proceed with step #2.   Save the results page to your hard drive.  Attach the page to an email message and send it to me.  The subject should say:  Advanced Excite   (NOTE:  You can also upload the file to Blackboard's dropbox).  

Online Students:  To see a video demonstration, download: advanced_excite.exe 

5.  Conduct an advanced search at Northern Light  - see page 4.32 (omit step #1 and go to:  www.northernlight.com )  Proceed with step #2.   Save the results page to your hard drive.  Attach the page to an email message and send it to me.  The subject should say:  Advanced Northern   (NOTE:  You can also upload the file to Blackboard's dropbox).  

Online Students:  To see a video demonstration, download: advanced_northernlight.exe 


VIII.  Summary of the Search Terms/Concepts

  1. Search Engine - computer that searches a database for key words/phrases you have entered
  2. Hit - web page contained within a search engine's database that contains text that matches your search expression.
  3. Results Page - page of links returned by the search engine that match your search criteria
  4. Web robot (also called a bot or spider) - program that adds pages to the search engine's database.  The program follows links from site to site, page to page.   If you place a page on the Internet and someone else links to it, this program will find and index your page without your knowledge. 
  5. Meta Tag - codes inserted into a web page by the author of the page in order to provide search engines with keywords and a page description.  NOTE:  Only some of the search engines look at meta tags when indexing their pages.
  6. Full text indexing - occurs when the computerized database indexes all main words in every page (small words like "it", "the" and "and" are omitted).
  7. Web Directory - categories created by search engines that will lead to additional subcategories and eventually web pages indexed by that computer.
  8. Meta Search Engines - use a single form to conduct searches on multiple search engines all at the same time 
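
Full text indexing (term 6 above) can be sketched in Python as follows.  The page names, the page text and the list of small words are all made up for illustration; each search engine chooses its own list of words to omit.

```python
import re

# Assumed list of small words to leave out of the index.
STOP_WORDS = {"it", "the", "and", "a", "of", "in"}

def build_index(pages):
    """Map every main word to the set of pages whose text contains it."""
    index = {}
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(name)
    return index

# Hypothetical pages and their text.
pages = {
    "p1.html": "The rainfall in Belize",
    "p2.html": "Belize and the reef",
}
index = build_index(pages)
print(sorted(index))              # ['belize', 'rainfall', 'reef']
print(sorted(index["belize"]))    # ['p1.html', 'p2.html']
```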

IX.  Online Students Only:  summary of what you should turn in

1.  Belize annual rainfall web page found using AltaVista.  
2.  Belize average rainfall web page found using HotBot.  
3.  The Belize annual rainfall results page retrieved from Google.
4.  The Belize annual rainfall results page retrieved from AskJeeves.
5.  Conference Board web page retrieved from Yahoo's directory links & search facility.
6.  The Belize annual rainfall results page retrieved from DogPile.
7.  The web page containing guide information on National Biotechnology Information Facility (the page is from Argus Clearinghouse).
8.  The results page from the Boolean search in AltaVista's advanced form
9.  The results page from the revised Boolean search in AltaVista's advanced form
10. The results page from the Boolean search in HotBot's advanced form
11. The results page from the Boolean search in Excite's main page.
12. The results page from the advanced search at Northern Light.

NOTE:  These pages can be uploaded to Blackboard OR they can be emailed.