CIS261-Internet
By the end of this assignment you should know how to
To get credit for this assignment, you must complete all hands-on activities within the assignment. As you read through the material, you will find sections labeled "Hands On"; you must complete the activity in each one. Each Hands On section is worth points.
Computers that conduct searches on the Internet are called search engines. To use a search engine, go to its website and fill in a form indicating what you want to look for. Once you submit the search, the search engine retrieves a list of pages that you can view.
Some of the more popular search engines include:
www.google.com - stores web pages. Uses a software robot called Googlebot to identify and evaluate more than a billion pages of content on the web. Includes a page ranking feature that gives weight to more popular and heavily used sites.
www.altavista.com - stores web pages and usenet newsgroups. Includes both foreign and domestic documents. Uses full-text indexing without excluding common words (it includes everything).
www.lycos.com - stores web documents, usenet newsgroups, news, stocks, weather. Indexes the title, headings, subheadings and the hyperlinks to other sites along with the first 20 lines of text and the 100 words that occur most often.
www.teoma.com - stores web pages (used to be directhit.com). Includes tips for refining your searches.
www.webcrawler.com - stores web pages and usenet newsgroups. Uses "natural language searching" and Boolean searches. In natural language searches, users type the query in normal English.
www.yahoo.com - known for its directory of links. Web authors must submit their pages. All pages are reviewed by editors - if the page is good, it is included at the site (they do not include all pages submitted). Specifically searches titles, comments, URLs and descriptions of the Web sites submitted.
www.hotbot.com - stores web pages. Uses a full-text indexing system in which all words are included in the searchable index except commonly occurring words such as and, the, a, an, is, etc.
www.go.com - stores web documents, usenet newsgroups, usenet FAQs (frequently asked questions), reviews, topics. Uses a full-text indexing system in which all words are included in the searchable index except commonly occurring words such as and, the, a, an, is, etc.
www.excite.com - stores web documents, usenet newsgroups, stock quotes, sports, weather reports, headline news, corporate reports. Does concept based searches (rather than keyword). It looks at the text you enter and applies meaning to it, then it looks for pages that might match based upon the meaning rather than the words you enter.
search.msn.com - stores web documents. Uses crawler software to include links from pages on the Internet. General search results include results from: popular sites, featured sites (Microsoft accepts payment in exchange for these listings), web sites within MSN's directory and web sites retrieved from DirectHit (another search engine).
Each search engine is sponsored by a different company and consists of a database where Internet information is stored and indexed. Each one stores slightly different information and each one indexes its pages differently. For example, one search engine may store web pages only and another may store web pages, newsgroups, stock reports etc.
Internet information gets into a database in a variety of ways. Sometimes page authors submit their pages along with appropriate keywords that can be used to display them. Other pages are included because the search service runs software called spiders, bots or crawlers that follow links from page to page and site to site, adding each page they find to the database (so pages that were never submitted to the search engine can still display if a person enters the right keywords).
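To make the idea of a spider or crawler more concrete, here is a minimal sketch in Python. It does not touch the real Internet: the LINKS table is a made-up miniature "web" and the page names are hypothetical. It only illustrates how following links from page to page lets a search service discover pages that nobody submitted.

```python
from collections import deque

# Hypothetical miniature "web": each page maps to the pages it links to.
LINKS = {
    "home.html": ["news.html", "about.html"],
    "news.html": ["weather.html", "home.html"],
    "about.html": [],
    "weather.html": ["news.html"],
    "orphan.html": ["home.html"],  # no page links here, so the crawler never reaches it
}


def crawl(start, links):
    """Follow links breadth-first from `start`; return pages in the order found."""
    found = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:  # skip pages that were already visited
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


print(crawl("home.html", LINKS))
# ['home.html', 'news.html', 'about.html', 'weather.html']
```

Notice that orphan.html is never found, because no other page links to it. A page like that would only get into the database if its author submitted it.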
Pages cannot appear in search results until they are included in the search engine's index. Each search engine has its own unique way of indexing web pages. Engines like Excite index each page individually (basically ignoring home pages). Other search engines index based upon the first few paragraphs contained within a page or keywords submitted by the author. The search engines also vary in the way they display page summaries. Some display the first few paragraphs of text as the summary. Others let the author submit summary text. Still others, like go.com, let authors place codes into their pages (which the reader cannot see) that are used for the summary.
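The difference between indexing every word (as described for AltaVista) and leaving out common words (as described for HotBot and go.com) can be sketched in a few lines of Python. This is an illustration only, not how any of these search engines is actually implemented; the page names and page text are made up.

```python
import re

# The common words named in the HotBot and go.com descriptions.
STOP_WORDS = {"and", "the", "a", "an", "is"}

# Hypothetical pages and their text.
PAGES = {
    "page1.html": "The annual rainfall in Belize is high",
    "page2.html": "Belize is a country and a travel destination",
}


def build_index(pages, stop_words=frozenset()):
    """Map each word to the set of pages that contain it, skipping stop words."""
    index = {}
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            if word not in stop_words:
                index.setdefault(word, set()).add(name)
    return index


full_index = build_index(PAGES)                   # every word is indexed
filtered_index = build_index(PAGES, STOP_WORDS)   # common words are left out

print(sorted(filtered_index["belize"]))  # ['page1.html', 'page2.html']
print("the" in full_index)               # True
print("the" in filtered_index)           # False
```

A search for a word simply looks it up in the index, which is why a page that has not been indexed cannot show up in the results.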
Since different search engines contain different pages and they index pages differently, you should use more than one search engine when conducting research on the Internet. If you don't find many pages at one site, go to another and try the search again.
A. Go to the Search Engine's Home Page
In the address bar, enter the web address of the Search engine you want to use.
Every search engine has a form on the main page. To conduct a search you must type a keyword (or keywords) into the search engine's form and then click the search button.
Some helpful hints for entering keywords include:
1. +keyword
A plus sign in front of a keyword indicates that the word must be included in the search results (example: Bill +Cosby specifies that the word Cosby must appear in all search results).
2. -keyword
A minus sign in front of a keyword indicates that the word should NOT appear in search results (example: aerobics -step specifies that the word step should NOT appear in search results).
3. "keyword1 keyword2..."
Including a phrase in double quotes requires that search engines find the phrase as entered in all search results (example: "World War II" would only retrieve pages containing the text exactly as it is keyed in).
4. KEYWORD
If you enter a keyword in all capitals, it will only look for capitalized text.
5. keyword
If you enter a keyword in all lowercase letters, it will look for upper and lowercase.
6. Keyword
If you enter a capital letter, it will only match pages that have the capital letter in the same location within the word.
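The plus sign, minus sign and double-quote hints can be sketched as a small Python filter. This is a simplified illustration, not the way any particular search engine works. It assumes that plain keywords (with no + or -) only affect ranking and never exclude a page, and it ignores the capitalization rules in hints 4-6 by matching everything in lowercase. The function names and sample text are made up.

```python
import re


def parse_query(query):
    """Split a query into required words, excluded words, phrases and plain words."""
    required, excluded, phrases, plain = [], [], [], []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:                                    # text inside double quotes
            phrases.append(phrase.lower())
        elif word.startswith("+") and len(word) > 1:  # +keyword
            required.append(word[1:].lower())
        elif word.startswith("-") and len(word) > 1:  # -keyword
            excluded.append(word[1:].lower())
        else:
            plain.append(word.lower())
    return required, excluded, phrases, plain


def matches(text, query):
    """Return True if the page text satisfies the +, - and "phrase" parts of the query."""
    required, excluded, phrases, _plain = parse_query(query)
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9']+", lowered))
    return (all(w in words for w in required)
            and not any(w in words for w in excluded)
            and all(p in lowered for p in phrases))


print(matches("Cosby Show episode guide", "Bill +Cosby"))     # True
print(matches("Bill of Rights", "Bill +Cosby"))               # False
print(matches("Step aerobics schedule", "aerobics -step"))    # False
print(matches("History of World War II", '"World War II"'))   # True
```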
Hands On #1
1. Conduct the Belize annual rainfall search as directed on page 4.08 (omit step #1 and go to: www.altavista.com ). Proceed with step #2 in the instructions. When you find a page that displays the annual rainfall, save the page to your hard drive by selecting the File menu, Save As command. Make sure you know what the name of the page is and the file it is being saved to.
Online students: To see a video demonstration, download: altavista_search.exe
2. Conduct the same search using HotBot as directed on page 4.09 (omit step #1 and go to: www.hotbot.com ). Proceed with step #2 in the instructions. When you find a page showing the average annual rainfall, save the page to your hard drive (make sure the name of the page is different than the name used in #1).
Online students: To see a video demonstration, download: hotbot_search.exe
3. Conduct the same search using Google - see page 4.11 (omit step #1 and go to: www.google.com ). Proceed with step #2 in the instructions. Save the results page to your hard drive.
Online students: To see a video demonstration, download: google_search.exe
4. Go into your email program and create a mail message. The subject should be Belize Rainfall. Attach the files you saved in #1, #2 and #3 to the message and send it to me.
OR
Upload each file to Blackboard's dropbox
5. Send me an email message and tell me which of the 3 search engines returned the best results and why.
1. Page Ranking:
Many search engines use a "page ranking" system to determine which links to display first in the results page. They look at the number of other pages that link to a given page: the more pages that link to it, the higher its page ranking.
2. Natural Language Queries
You can enter the query in the form of a sentence or question.
Example #1 the question asked was: "What is the average snowfall in Michigan?"
3. Directory Links (web directories)
Many search engines include directory links at their web sites. The links categorize web pages included in their database. To use the directory links, you must pick a topic that you are interested in. Once you click on the link, you will see a listing of subtopics. You will need to click on the subtopic you are interested in. This will display either more subtopics or actual web pages. If you see more subtopics, you will need to continue selecting subtopics until you finally get to web pages.
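The page ranking idea from item 1 (more incoming links means a higher ranking) can be sketched by simply counting links. This is a deliberately simplified illustration with made-up page names. Real search engines use more elaborate formulas, so treat it as a sketch of the basic idea only.

```python
from collections import Counter

# Hypothetical link table: each page maps to the pages it links to.
LINKS = {
    "a.html": ["c.html"],
    "b.html": ["c.html", "d.html"],
    "c.html": ["d.html"],
    "d.html": ["c.html"],
}


def rank_by_inbound_links(links):
    """Count how many different pages link to each page; most-linked pages come first."""
    inbound = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each linking page only once
            if target != source:      # ignore a page linking to itself
                inbound[target] += 1
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


for page, count in rank_by_inbound_links(LINKS):
    print(page, count)
# c.html 3
# d.html 2
# a.html 0
# b.html 0
```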
Hands On #2
1. Conduct a natural language query using the AskJeeves website - see page 4.12 (omit step #1 and go to: www.askjeeves.com ) Proceed with step #2 in the instructions. Save the results page to your hard drive.
Online Students: To see a video demonstration, download: askjeeves_search.exe
2. Go into your email program and create a mail message. The subject should be AskJeeves. Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox).
3. Use Yahoo's directory links to find a page - see bottom of 4.15-4.17 (omit step #1 and go to: www.yahoo.com) Proceed with step #2 in the instructions. When you find a page that provides background information on the Conference Board, save the page to your hard drive.
Online Students: To see a video demonstration, download: yahoo_directories.exe
4. Go into your email program and create a mail message. The subject should be Conference Board. Attach the file you saved in #3 to the message and send it to me (or you can upload the file to Blackboard's dropbox).
Meta search engines send your keywords to multiple web search engines and retrieve all the results. The results are often sorted by the search engine that returned them, although some sites combine them into a single list.
www.ixquick.com - passes keywords to AOL, altavista, looksmart, euroseek, excite, findwhat, msn, alltheweb, goto, hotbot and yahoo. Brings back the "top 10" from each search engine and aggregates the results. Eliminates duplicates.
www.dogpile.com - passes keywords to approx 15 search engines. Doesn't always retrieve desired results.
www.profusion.com - passes keywords to altavista, looksmart, excite, magellan, webcrawler, goto, alltheweb and yahoo. Aggregates results into one ranked list. Results can be sorted by relevancy, A-Z by site title, or source.
www.surfwax.com - passes keywords to several good, large search engines, directories, US Government tools, and news sources. Results can be sorted by relevancy, A-Z by site title, or source.
www.vivissimo.com - passes keywords to yahoo, altavista, msn, hotbot, alltheweb, AOL, excite, directhit, looksmart and euroseek. Results are accompanied by subject subdivisions based on the major themes in your search terms.
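The "top 10 from each engine, aggregate, eliminate duplicates" behavior described above can be sketched as follows. The engine names, the result addresses and the ordering rule (pages returned by more engines come first) are all assumptions made for illustration. No real meta search engine is being reproduced here.

```python
# Hypothetical results: each engine returns its own ordered list of pages.
RESULTS = {
    "engine_a": ["site-x.example", "site-y.example", "site-z.example"],
    "engine_b": ["site-y.example", "site-w.example"],
    "engine_c": ["site-y.example", "site-x.example"],
}


def meta_search(results_by_engine, top_n=10):
    """Merge the top results from every engine into one list without duplicates."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls[:top_n], start=1):
            entry = merged.setdefault(url, {"engines": [], "best": position})
            entry["engines"].append(engine)                    # remember who returned it
            entry["best"] = min(entry["best"], position)       # best position seen so far
    # Pages found by more engines first, then by best position, then alphabetically.
    return sorted(
        merged.items(),
        key=lambda item: (-len(item[1]["engines"]), item[1]["best"], item[0]),
    )


for url, info in meta_search(RESULTS):
    print(url, info["engines"])
# site-y.example ['engine_a', 'engine_b', 'engine_c']
# site-x.example ['engine_a', 'engine_c']
# site-w.example ['engine_b']
# site-z.example ['engine_a']
```

Each page appears only once in the merged list even though several engines returned it, which is the duplicate elimination mentioned for ixquick.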
Hands On #3
1. Conduct a meta search using Dogpile - see pages 4.18-4.19 (omit step #1 and go to: www.dogpile.com ). Proceed with step #2 in the instructions. Save the results page to your hard drive.
Online Students: To see a video demonstration, download: dogpile_search.exe
2. Go into your email program and create a mail message. The subject should be DogPile. Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox).
These web sites include reference material similar to a library. (Many also include reviews of the reference material and annotated references).
Examples of clearinghouse web sites include:
http://www.eric.ed.gov/sites/barak.html - The Educational Resources Information Center (ERIC) provides education-related research material. It contains 16 subject-specific clearinghouses. Products and services include: research syntheses, electronic journals, online directories, reference and referral services, and document delivery.
http://www.geographynetwork.com/data/clearinghouses.cfm - The geography network lets you download maps from all over the world (it's a map clearinghouse).
http://www.clearinghouse.net/ The Argus Clearinghouse reviews web sites (called guides) submitted and assigns a rating to them based on several different criteria. It also includes links to all sites reviewed.
Examples of library web sites include:
http://www.refdesk.com/ Contains a comprehensive list of links to reference sites on the Internet
http://www.libraryspot.com/ Contains a comprehensive list of links to reference sites on the Internet.
http://www.lii.org/search/file/reference The Librarian's Index to the Internet reference page contains links to library reference material you can use online.
http://www.wdvl.com/ The Web Developers virtual library contains links to pages of interest to web page developers.
http://www.loc.gov/ Library of Congress website. Contains historical information as well as current information.
http://www.ipl.org/ The Internet Public Library includes over 40,000 Internet resources, hand picked, organized and described by librarians and library students.
http://scout.cs.wisc.edu/archives/ The Scout Report Archives includes over seven years' worth of Scout Report information. (The Scout Report is published by the University of Wisconsin. Articles are contributed by professional librarians and subject matter experts who select, research, and annotate each resource.)
Examples of resource list sites include:
http://www.eevl.ac.uk/ The Internet Guide to Engineering, Mathematics and Computing. The target audience is students, staff and researchers in higher education. The goal is to provide useful information for individuals in the engineering, computing and mathematics fields.
http://www.iss.stthomas.edu/studyguides/ Contains guides on how to study, how to write etc.
http://www.ifla.org/I/training/citation/citing.htm Contains citation guides for electronic documents
http://jobstar.org/tools/career/spec-car.cfm Includes links to career guides that tell you what kind of training or education is required, what you can earn, what type of environment you will work in etc.
Hands On #4
1. Conduct a search using the Argus Clearinghouse - see pages 4.20-4.22 (omit step #1 and go to: www.clearinghouse.net ). Proceed with step #2 in the instructions. Save the National Biotechnology Information Facility Guide information page to your hard drive. Visit the web page reviewed.
Online Students: To see a video demonstration, download: argus_search.exe
2. Go into your email program and create a mail message. The subject should be Argus. Attach the file you saved in #1 to the message and send it to me (or you can upload the file to Blackboard's dropbox). In the body of the email message indicate whether you agreed with the rating Argus gave the site. State why you agreed or disagreed with the rating.
If you get too many results (also called hits) from the main search window, you can conduct an advanced search. All the search engines include this feature because it lets you customize your search so you get fewer hits.
The search forms vary from one search engine to another. To get into an advanced search form, go to the search engine like you normally do, then find the "Advanced Search" link. When you click on the link, you will see a different (more complicated) form. Fill in the portions of the form that you want to use and submit the search.
Advanced searches often use the Boolean operators: OR, AND, NOT to customize searches.
Examples:
1. keyword1 AND keyword2
The AND operator specifies that both words must appear in search results (example: step AND aerobics).
2. keyword1 OR keyword2
The OR operator specifies that the results should contain one word or the other (the page doesn't have to have both).
3. keyword1 AND NOT keyword2
Specifies that the page should NOT be selected if the second keyword appears (example: aerobics AND NOT step specifies that pages with step aerobics should not be displayed).
4. keyword1 AND (keyword2 OR keyword3 OR keyword4)
Using parentheses lets you build complex searches. Anything within the parentheses is performed first (example: Hawaii AND (Maui OR Honolulu) will display only those pages that contain Hawaii and Maui, or Hawaii and Honolulu).
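The Boolean operators behave much like set operations, which the following Python sketch illustrates. The INDEX table and the page names p1 through p6 are made up; a real search engine's index is far larger, but the logic of AND, OR, AND NOT and parentheses is the same idea.

```python
# Hypothetical index: each keyword maps to the set of pages that contain it.
INDEX = {
    "hawaii": {"p1", "p2", "p3"},
    "maui": {"p1"},
    "honolulu": {"p2", "p4"},
    "aerobics": {"p5", "p6"},
    "step": {"p6"},
}


def pages(word):
    """Return the set of pages containing the word (empty if the word is unknown)."""
    return INDEX.get(word.lower(), set())


# step AND aerobics: pages that contain both words
both = pages("step") & pages("aerobics")

# Maui OR Honolulu: pages that contain either word
either = pages("Maui") | pages("Honolulu")

# aerobics AND NOT step: pages with aerobics, minus any that mention step
without = pages("aerobics") - pages("step")

# Hawaii AND (Maui OR Honolulu): the part in parentheses is worked out first
grouped = pages("Hawaii") & (pages("Maui") | pages("Honolulu"))

print(sorted(both))     # ['p6']
print(sorted(either))   # ['p1', 'p2', 'p4']
print(sorted(without))  # ['p5']
print(sorted(grouped))  # ['p1', 'p2']
```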
Search engines may also use wildcard characters ? or * to replace text in the expression. The * can be used to replace multiple characters. The ? can be used to replace a single character.
Example:
1. key* or keyw*rd
An asterisk at the end of a keyword or in the middle of the word is a wildcard character. It can be used when you aren't sure of the exact spelling or if you want to retrieve multiple spellings of a keyword (example: exercis* would retrieve exercising, exercises, exercisers etc.).
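Python's standard fnmatch module happens to use the same two wildcard characters, so it gives a quick way to try out how * and ? behave. The word list below is made up, and the helper name wildcard_match is hypothetical. Note that fnmatch also treats square brackets specially, which goes beyond what is described here.

```python
from fnmatch import fnmatchcase

# Hypothetical list of words that might appear on indexed pages.
WORDS = ["exercise", "exercises", "exercising", "exercisers",
         "exorcise", "color", "colour", "colr"]


def wildcard_match(pattern, words):
    """Return the words matching the pattern (* = any run of characters, ? = exactly one)."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


print(wildcard_match("exercis*", WORDS))
# ['exercise', 'exercises', 'exercising', 'exercisers']
print(wildcard_match("col*r", WORDS))
# ['color', 'colour', 'colr']
print(wildcard_match("colo?r", WORDS))
# ['colour']
```

The last two searches show the difference between the wildcards: * can stand for zero, one or many characters, while ? must stand for exactly one.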
Hands On #5
1. Conduct an advanced search at AltaVista - see pages 4.26-4.27 (omit step #1 and go to: www.altavista.com). Proceed with step #2 in the instructions. Save the results page to your hard drive. Attach the page to an email message and send it to me. The subject should say: Advanced Altavista (NOTE: You can also upload the file to Blackboard's dropbox).
Online Students: To see a video demonstration, download: advanced_altavista.exe
2. Refine the AltaVista search by clicking the Change link at the top of the results page (right below the keyword search box).

You should see the advanced form display. Enter the following boolean query and execute the search:

Attach the results page to an email message and send it to me. The subject should say: Refined Altavista (NOTE: You can also upload the file to Blackboard's dropbox).
Online Students: To see a video demonstration, download: refined_altavista.exe
3. Conduct an advanced search at HotBot - see pages 4.28-4.30 (omit step #1 and go to: www.hotbot.com ) Proceed with step #2. Save the results page to your hard drive. Attach the page to an email message and send it to me. The subject should say: Advanced HotBot (NOTE: You can also upload the file to Blackboard's dropbox).
Online Students: To see a video demonstration, download: advanced_hotbot.exe
4. Conduct an advanced search at Excite - see page 4.31 (omit step #1 and go to www.excite.com ). Proceed with step #2. Save the results page to your hard drive. Attach the page to an email message and send it to me. The subject should say: Advanced Excite (NOTE: You can also upload the file to Blackboard's dropbox).
Online Students: To see a video demonstration, download: advanced_excite.exe
5. Conduct an advanced search at Northern Light - see page 4.32 (omit step #1 and go to: www.northernlight.com ). Proceed with step #2. Save the results page to your hard drive. Attach the page to an email message and send it to me. The subject should say: Advanced Northern (NOTE: You can also upload the file to Blackboard's dropbox).
Online Students: To see a video demonstration, download: advanced_northernlight.exe
1. Belize annual rainfall web page found using AltaVista.
2. Belize average rainfall web page found using HotBot.
3. The Belize annual rainfall results page retrieved from Google.
4. The Belize annual rainfall results page retrieved from AskJeeves.
5. Conference Board web page retrieved from Yahoo's directory links & search facility.
6. The Belize annual rainfall results page retrieved from DogPile.
7. The web page containing guide information on National Biotechnology Information Facility (the page is from Argus Clearinghouse).
8. The results page from the Boolean search in AltaVista's advanced form.
9. The results page from the revised Boolean search in AltaVista's advanced form.
10. The results page from the Boolean search in HotBot's advanced form
11. The results page from the Boolean search in Excite's main page.
12. The results page from the advanced search at Northern Light.
NOTE: These pages can be uploaded to Blackboard OR they can be emailed.