Lesson One:
Throughout history,
libraries have provided information resources to support the needs of
researchers. Although recorded information has been disseminated in many
ways, from clay tablets to computers, libraries have collected, organized,
and made information available to their users. Libraries have evolved over
thousands of years from their origins as places to store the archives of a
particular city or culture, to twentieth-century information gateways leading
to vast amounts of virtual information resources. Libraries today provide
information in many formats, including physical collections of books,
periodicals, newspapers, pamphlets, government documents, audiovisual
materials, and electronic resources. While physical collections still serve
as essential sources of information, the development of a virtual world of
online databases, reference services, and the Internet has created a valuable
extension to the library. "Virtual libraries", "digital
libraries", or "online libraries" are terms used to describe
libraries in the Information Age that organize and provide access to a huge
number of digital information resources scattered throughout cyberspace. Many of those resources
have been made available via the Internet. Since the creation of the World
Wide Web in 1989, the phenomenal growth of the Internet has provided a global
connection of information and communication networks that is astounding in
scope. At the click of a computer mouse, the Internet seems to instantly
produce an infinite number of resources on any imaginable research topic. However,
if you jump into searching the Internet without any preparation, you may find
yourself overwhelmed by the variety and complexity of search tools and the
retrieval of thousands of documents, many of which are irrelevant or
superficial. If you truly want to
plumb the depths of the sea of information offered by the Internet, you need
some in-depth navigation skills. Planning your research project, deciding
when it is appropriate to supplement traditional library resources with
Internet resources, focusing on the appropriate search tools, using specific
search commands and operators, and evaluating the resources you retrieve can
save time and produce high-quality results. The Internet can help you
with a number of research tasks, including:
The Internet has actually
redefined the term "resource" and the process of conducting
research. Today, the term "resource" encompasses not only
traditional resources available in libraries, such as books, periodicals, and
audiovisual materials, but also broadcast media, which are
available from many sources, and any resource you can access
from any Internet-capable computer or device in your library, at work, or at
home. Although computers are now the primary means of accessing the Internet,
other devices are being used, such as cell phones, personal digital
assistants, and pagers, which can send and receive e-mail and access the Web.
Soon, a variety of appliances, including your car or your TV set, may be
connected to the network, communicating with each other, and providing access
to information. An Internet resource can
be an e-mail message stored in a discussion group archive; an online
magazine, journal, or encyclopedia article from an Internet-accessible
database or research service; an archive of daily newspaper articles; a
statistical database compiled by the U.S. Government; a personal,
organizational or corporate home page; a sound file, map, digital photograph,
streaming audio or video file that can be downloaded to your home computer;
an interactive tutorial offering a variety of multimedia features; or a new
type of resource that may not have existed prior to the advent of digital
networks. In order to understand
why the Internet provides such a wealth of resources and why those resources are
distributed on computers all over the world, you should know a little about
its history and structure. The Internet had very humble beginnings. Its
founders had no intention for it to develop into a universally accessible,
global network of information. The Internet began in
1969 as a network called ARPANET, designed for the Advanced Research Projects
Agency (ARPA) of the U.S. Department of Defense. ARPA was established in 1958
as the result of the Soviet launching of the Sputnik satellites, which ignited
fears of Russian aggression from space. ARPA sponsored research
on linking geographically remote computers to allow remote logon access and
sharing of data and resources. ARPA's goal was to connect military and
defense contractors and universities involved with defense research. In the
fall of 1969, ARPANET linked its first four sites through packet-switching computers known as Interface
Message Processors, which were located at the University of California at
Los Angeles, Stanford Research Institute, University of California at Santa
Barbara, and the University of Utah. According to most
published accounts of Internet history, the original ARPANET was designed as
an experiment in developing a network which would withstand a nuclear
attack--if a section of the network disappeared, the entire network would not
be destroyed. To that end, the network was decentralized, data was
distributed among all the network computers, and data was transferred in
small packets. A somewhat revisionist
history of the early days of the Internet, Where Wizards Stay Up Late:
The Origins of the Internet, was published in 1996, and claims the
original purpose for the development of ARPANET was to share data and
distribute the cost of computing, rather than to develop a network that would
withstand a nuclear attack. In fact, the concept of a
decentralized network dates to the early 1960s, when the RAND Corporation, a think tank which studies
national security and public welfare issues for the U. S. government, was
asked by the U. S. Air Force to design a communications network that would be
able to survive and function during and after a nuclear attack. The idea of a
centralized network was unacceptable, since any central network computer or
network control center would likely be the first target in an attack. Paul Baran of RAND
published a paper in 1964 entitled On Distributed
Communications, which provided the theoretical design for data transfer
on an unreliable network, a network which was designed from the beginning to
operate while in tatters. Baran's design proposed many of the features which
were eventually incorporated into the network we have today, including
decentralized data storage, digital packets and different routes for packets
in the same data transfer. Baran proposed a network
of special computers or nodes whose sole purpose was to route messages like
"hot potatoes." As soon as a message entered the router, it would
be tossed out again by the most efficient route, or if the best path was
destroyed or busy, the message was sent over the next best route (Hafner,
61-62). Baran based his network
design on observations of human brain function, after noticing that brain
functions don't rely on a centralized set of cells. Brain circuitry can be
rerouted around damaged cells and neural nets can be re-created over new
pathways (Hafner, 57). Baran's ideas were not immediately accepted by ARPA,
but were improved upon by an ARPA director, Lawrence Roberts, who in 1967
proposed a packet-switched network, based on the efficiency and reliability
of Baran's original idea. The early 1970s were
spent developing basic standards called protocols for data transfer (see How the Internet Works for more information on
protocols). The first protocol developed was known as the Network Control
Protocol or NCP. This protocol supported computers running on the same
network. By 1972, there were 37
host computers connected to ARPANET and ARPA's name was changed to DARPA
(Defense Advanced Research Projects Agency). The researchers at ARPANET
realized the need to create protocols that supported not only the data
sharing for computers on the same network, but also the interconnection of
different computer networks, now known as internetworking. Stanford University
was assigned the task of designing a set of protocols that would
allow multiple computer networks to be interconnected. During 1973-1978 a team
of researchers led by Vinton Cerf at Stanford University and Robert
Kahn of ARPA developed a suite of protocols called TCP/IP (Transmission
Control Protocol and Internet Protocol) which supported the interconnection
of a number of different computer networks. In 1983 TCP/IP replaced NCP as
the core Internet protocol. Although the ARPANET's
founders originally allowed only defense scientists and military researchers
to log on and run programs from remote computers, by the early 1980s,
educators discovered the value of interconnected computers, especially
supercomputers that were expensive to develop, to share research information
and computing resources. The universities needed a
worldwide network like ARPANET. Because military agencies are less willing to
share information or to allow access, the U. S. academic community began
developing several networks. The academics created BITNET (Because It's Time
Network), an academic and research network that linked IBM computer centers
around the world, and CSNET (Computer Science Network), which linked
university computer science departments. In 1986 the NSFNet was
created and named for the National Science Foundation, which provided most of
the funding. NSFNet linked academic researchers across the country with five
supercomputer centers. This soon expanded to include regional and statewide
academic networks that connected universities and research organizations, and
the NSFNet began to replace the ARPANET for research networking. The academic
networks were developed with the same network structure as ARPANET, as
independent, interconnected sites scattered randomly around the world. The
NSFNet continuously linked more powerful supercomputers through faster
connections, upgrading the network in 1986, 1988, and 1990. As these government and
educational networks established connections, the concept of the Internet, a
worldwide connection of networks, was born. However, it wasn't until the
World Wide Web became available in the mid-1990s that the Internet became
ubiquitous and easily available to the casual computer user.

The World Wide Web

The World Wide Web is a
branch or subsection of the Internet that provides access to hypertext
documents. Hypertext resources are documents which provide links or
connections to other documents. Selecting a hypertext link allows you to jump
to the information the link represents. You can also return to a previous
link and then go off in another direction. Hypertext lets you move through a
text in a nonlinear manner and allows you to explore a vast worldwide
"web" of information. The World Wide Web began
in 1989 as a communications project in Switzerland, at the European
Laboratory for Particle Physics, known as CERN (Conseil Européen pour la Recherche
Nucléaire). Tim
Berners-Lee, a graduate of Oxford University with a background in
computer communications, proposed a global
hypertext information system to be used as a means of transporting research
and ideas throughout CERN. Berners-Lee was proposing a solution to two
problems: information storage and retrieval, and communication on a global
scale, since the members of CERN were located in a number of countries. Berners-Lee created an
information system using hypertext, combined with the global connections
provided by the Internet, to produce a "web" of connected documents
that can be located anywhere in the world and accessed by anyone with a computer
and a hypertext browser. Hypertext is a concept
that has been discussed since 1945, when Vannevar Bush, science advisor to
President Roosevelt during World War II, proposed a machine that would be
capable of producing hypertext links between documents. Bush's proposal was
outlined in an article entitled As We
May Think, published in the July 1945 issue of The Atlantic Monthly.
In 1965, Ted Nelson
coined the term "hypertext" and proposed a worldwide hypertext
system called "Xanadu," to which individuals could contribute
resources. Other hypertext programs were developed during the intervening
years, but it wasn't until Berners-Lee developed a hypertext browser that
functioned with existing Internet technology that a global hypertext
information system was created. The original CERN project
outlined a simple system using networked hypertext links to transmit
documents and communicate among physics researchers. The links appeared as
highlighted words in the document. Later on as more people became interested
in the possibilities of hypertext, highlighted, colored or underlined text,
pictures, icons, or graphics were used as links, and links were made to sound
and video files. The term hypermedia is sometimes used to describe a
hypertext system which can display multimedia, including graphics, sounds,
animation, and video. In 1992, there were 50
web servers worldwide. Today, the Internet, including a vast number of World
Wide Web sites, has become a collection of millions of independent networks,
each owned by organizations independent of each other, all interconnected by
high-speed data lines, satellites, cable modems, radio signals, and wireless
connections. Due to its origins in the
decentralized ARPANET, there is no central computer or data storage on the
Internet. Information files are scattered around the Net, around the world,
virtually hidden in far-away places, waiting for discovery. Internet developers and
users have the freedom to publish anything on the Internet. The openness of
the Internet and the availability of information on almost any topic reflect
the values of those who built it. Although it started with the federal
government, it was built by people in the worlds of education and scientific
research, and it reflects their values of individual participation, equality,
and information sharing. Although commercial interests are now proliferating
and seeking to change the concept of free information sharing, a wealth of
information, some free, some fee-based, is available to researchers who have
the skills to locate, select, and evaluate resources. Some links on the history
of the Internet include:

How the Internet Works: Protocols

You might take for
granted that when you retrieve a file of information or send an e-mail
message across the Internet, it will always reach its destination. However, sometimes
it doesn't, because the process for sending information is extremely complex.

For the Internet to connect many different types of computers, software, and files,
it relies on standardized rules called protocols,
which define how computers communicate. A good example of an
early communications protocol was Morse Code. The protocol for Morse Code
used standardized dots and dashes to communicate over telegraph lines by
transmitting electrical impulses. Internet connections are
made with a series of protocols called TCP/IP
(Transmission Control Protocol/Internet Protocol). The TCP/IP protocols define
the Internet as a packet-switched
network. With a packet-switched connection there is no single, unbroken
connection between sender and receiver, like there is with the telephone
system. The telephone system is a
connection-oriented, circuit-switched network. When you make a telephone
call, the switches at the telephone company set up a dedicated line between
you and the person you call, for the duration of the call. While you are
using the line, no one else can; and if there is a problem on the network,
you lose your connection. A packet-switched network
does not require two computers to establish a dedicated, unbroken connection
for data transfer. It instead breaks the data into small units or packets and
transfers the packets over any phone or data lines that are currently
available. When you ask your browser
to go to a specific Internet address, or when you click on a hyperlink, the
sending computer breaks the data you have asked for into packets. Each packet
contains a piece (up to 1500 bytes) of the data. Each packet is labeled with
the addresses of the sending and receiving computers along with some
instructions on how to put the data back together again once it has reached
its destination. The data in these small
packets is transferred over phone lines or data lines. The packets take
different routes through a complex series of routers. Each router examines
the destination address and decides the best way to get the packets to their
destination. The packets eventually
all reach their destination -- your computer -- and are put back together
again, using the instructions they have been labeled with. This is why it
sometimes takes a while to load the data before information appears on your
screen. Of course, the speed of the data transfer depends on the type of
network connection or modem you use. All Internet functions
depend on protocols which standardize how the data for those functions are
transferred. Such protocols include:
The World Wide Web uses
HTTP (Hypertext Transfer Protocol) to transfer data. The HTTP protocol contains
commands that allow you to jump to another hypertext document and retrieve
the information in that document. When you enter a URL in your browser window
or click on a link, this sends an HTTP command to the web server described in
the URL, and directs the server to send the requested file. The computer language
used to create hypertext documents is referred to as HTML (HyperText Markup
Language). HTML uses tags (characters enclosed in angle brackets) to format
documents so that a web browser can read and display them. Tags denote such
features as headings, paragraphs, fonts, images, and hypertext links. The
HTML code behind any web document may be displayed in a browser window by
selecting "Page Source" on the "View" menu, or by right
clicking the mouse and choosing "View Source". FTP (File Transfer
Protocol), first specified in 1971 and standardized in its current form in 1985, is a standard method of moving files from one
computer to another on the Internet. The transfer of files using FTP can work
in either direction. You may retrieve files from a remote server, or transfer
files to a remote server, if you have been granted access to that server. FTP was the primary means
for file transfer on the Internet prior to the creation of HTTP (HyperText
Transfer Protocol) and the World Wide Web. Although many of its functions are
now handled by the HTTP protocol, FTP is still used for file transfer on the
Internet. The World Wide Web has
made several other early Internet protocols nearly obsolete. Two such
protocols, Telnet and Gopher, were once widely used to connect to remote
sites and search for information. Many Telnet and Gopher sites have migrated
to the World Wide Web, which offers simpler interfaces, multimedia effects,
and user-friendly interactivity. However, Telnet
connections still provide access to some library catalogs and some government
databases. Telnet, or remote logon, is a tool that allows you to access the
programs and applications available on another computer system, whether it is
located next door or on another continent. The Telnet protocol allows you to
sit at the keyboard of one computer and use that keyboard and monitor as
though they were connected to another computer at a remote location. Telnet is supported by World
Wide Web browsers, but requires Telnet client software. Netscape and Internet
Explorer allow you to use a Telnet client with the browser, which provides an
instant interface with the Telnet program. Gopher, created in 1991
at the University of Minnesota (whose mascot is the Gopher), is an outdated
Internet protocol that is rarely seen today. Popular for several years,
especially in universities, Gopher predates the World Wide Web. Gopher files
are primarily text, with no hypertext links, very few graphics and virtually
no audio or video effects. With hypertext links, the Hypertext Markup
Language (HTML), and the development of graphical browsers, the Web quickly
overtook Gopher. There are many other
Internet protocols which will not be covered in this course. Yahoo!'s Protocols
page provides links to additional information. Another concept that is important
in understanding how the Internet functions is the client/server
concept. Most Internet services rely on the client/server model. The Internet
user is the client and has client software installed on his computer
to access various Internet services. When a user wants to connect to a
particular information tool, he uses his client software to connect to server
programs, which provide the service or information needed. The web browser is
an example of client software needed to access World Wide Web servers. Most
browsers function as client programs for World Wide Web, FTP, and Gopher
access. For access to Telnet sites, a Telnet client is needed. Your computer
also requires specific client software for e-mail and for viewing certain
types of information files (such as audio, video, or PDF files). Each
piece of client software on your computer recognizes certain protocols and
processes data according to those protocols.

Internet Addresses: IP Addresses and Domain Names

Each computer connected
to the Internet is called a host computer. Each host computer has a unique
address called an IP address, which is used by the TCP/IP protocol to
identify the host requesting the data file. An IP address is a 32-bit numeric
address written as four numbers separated by periods. Each number can be zero
to 255. For example, 230.160.25.240 could be an IP address. Since IP addresses are difficult for people to remember, host names or
domain names such as ccla.cc.fl.us are generally used to identify the address
of any computer connected to the Internet. Because computers on the Internet
route data using only IP (numeric) addresses, not domain names, a
Domain Name System (DNS) server is needed to translate domain names into IP
addresses. A domain name may identify one or more IP addresses. The domain
name system organizes domain names into top-level categories, such as:
The U.S. and other countries use two-letter country codes. More than 200
such codes exist for countries, along with codes for states, such as fl.us.
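The 32-bit, four-number IP address format described earlier in this section can be sketched in a few lines of Python. This is only an illustration, not part of any Internet software: the function name `dotted_quad_to_int` is made up for this example, and the address is the sample one used in the lesson. The standard `ipaddress` module is used solely to cross-check the result.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Convert a dotted address such as '230.160.25.240' to its 32-bit value.

    Hypothetical helper written for this lesson; it accepts exactly four
    numbers separated by periods, each from 0 to 255.
    """
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by periods")
    value = 0
    for part in parts:
        number = int(part)
        if not 0 <= number <= 255:
            raise ValueError("each number must be between 0 and 255")
        # Each number fills 8 bits, so four numbers fill 32 bits in total.
        value = (value << 8) | number
    return value


sample = "230.160.25.240"  # the example address from the lesson
as_number = dotted_quad_to_int(sample)

# Cross-check against Python's standard library.
assert as_number == int(ipaddress.IPv4Address(sample))
print(sample, "fits in 32 bits:", as_number < 2 ** 32)
```

Because each of the four numbers occupies 8 bits, no number can exceed 255, which is why the lesson gives the range as zero to 255.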
Due to a shortage of top-level domain names, several new domain name
extensions have been proposed, including:
The proposal for new
domain names has been controversial. Information about the current state of the
domain name system is available in Management
of Internet Names and Addresses, a document published by the U. S.
Department of Commerce. Domain names in .com, .net or .org can be registered
through competing registrars.
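The dotted structure of a domain name can be illustrated with a short Python sketch. It assumes nothing beyond the text above: the helper name `split_domain` is hypothetical, and treating every two-letter final label as a country code is a simplification made for this example. No DNS lookup is performed; the code only splits the name into its parts.

```python
def split_domain(name: str) -> dict:
    """Split a domain name into labels and report its top-level domain.

    Hypothetical helper for illustration only; it does not contact a DNS server.
    """
    labels = name.lower().rstrip(".").split(".")
    top_level = labels[-1]
    return {
        "labels": labels,
        "top_level": top_level,
        # Simplifying assumption: a two-letter final label is a country code.
        "is_country_code": len(top_level) == 2,
    }


info = split_domain("ccla.cc.fl.us")  # domain name used earlier in this lesson
print(info["labels"])           # ['ccla', 'cc', 'fl', 'us']
print(info["top_level"])        # us
print(info["is_country_code"])  # True
```

Reading the labels from right to left moves from the most general category (the country code us) to the most specific (the host ccla).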
The International Internet
Address and Domain Name System provides a detailed overview of the
current domain name system.

URLs (Uniform Resource Locators)

Every data file or
document on the Internet also has a unique address called a URL (Uniform
Resource Locator). The URL consists of three parts: the protocol, the domain
name and the path. The protocol, as
discussed above, is the set of rules the computer follows in order to
communicate with another computer. It lets the computer know how to process
the information it receives. If the protocol is http://, for example, the
computer knows it will be processing a World Wide Web document. The domain name is
the Internet address of the computer (server) which is hosting the site and
storing the documents. This domain name may be expressed as an IP address. The path is the
directory and file specification; it lets the computer know which directory
and file to access after connecting to the server. The path is not a required
element, but if you know the path it will take you directly to the desired
file or document. The path is also the part of the URL which changes most
frequently. If you type in a URL and an error or "File not found"
message is returned, retype the URL, omit the path and try to locate the file
by searching the site or following links. Let's break down the URL
for a web page from LINCCWeb. The
LINCCWeb home page allows you to link to Florida community college library
catalogs and many other resources for community college students. This
particular LINCCWeb page provides links to electronic databases containing
articles from encyclopedias, periodicals, and newspapers, access to worldwide
library catalogs and more:

http://www.ccla.lib.fl.us/www/dblist.html

http:// is the protocol. This lets you know you are
retrieving a World Wide Web document and lets the computer know how to
process the hypertext file it is receiving.

www.ccla.lib.fl.us/ is the domain name, the address of the computer
which is hosting the web page. If you were to stop here and not type the
path, which consists of the directory and/or file name, you would access the
LINCCWeb home page rather than the database page.

www/dblist.html provides the path to the specific page you want; in this
case, the directory (www) and name of the html file (dblist.html) which
provides links to electronic databases.

Complete Exercise One after reading this lesson. It is worth 4 points.

Copyright © 1997-2000 Florida Community College