INTREX - Report of A Planning Conference On Information Transfer Experiments
September 3, 1965
Sponsored by
The Independence Foundation
of Philadelphia, Pennsylvania
0262150042
OVERHAGE
intrex REPORT
Participants
Summary
VI Project Intrex
Library Modernization
Extension of Time-Shared Computing
The Course of Project Intrex
VII The Experimental Program
1. The Model System
The Augmented Catalog
Text Access
2. Integration with National Resources
Objectives
Assumptions
Typical Experimental Use
General Comment
Selection of Initial Information Centers
Other Users
Teletype Links
Observational Objectives
3. Fact Retrieval
The Automated Index
The Automated Handbook
The Automated Notebook
Future Research
4. Initial Facilities
The Computational Facility
Software
Storage
Transfer of Library into the Store
Transmission, Display and Consoles
Permanent Copy
Summary
5. Related Studies: Extensions and Elaborations
Educational Functions of Intrex
Selective Dissemination
Browsing (Accidental Discovery)
Publishing
Selective Retention
6. R&D to Support the Experimental Program
Consoles
Interaction Language
Analysis of Content
Analysis of the Needs of Users
Theory of Information Transfer
7. Data Gathering for Evaluation
Data on Use
Economic Controls
Data on Learning
An Annotated User's Card
Appendix A Remarks of Vannevar Bush to Project Intrex Planning Conference
Appendix O A Technique of Measurement That May Be Useful in Project Intrex Experiments, J. C. R. Licklider
PROJECT INTREX PLANNING CONFERENCE
Carl F. J. Overhage, Chairman
PARTICIPANTS
Stanley Backer
Professor of Mechanical Engineering
Massachusetts Institute of Technology
Cambridge, Massachusetts
Gary L. Benton
Assistant Director, Project Intrex Planning Conference
Massachusetts Institute of Technology
Cambridge, Massachusetts
Daniel G. Bobrow
Head, Artificial Intelligence Group
Bolt Beranek and Newman
Cambridge, Massachusetts
John E. Burchard
Dean-Emeritus, School of Humanities
Massachusetts Institute of Technology
Cambridge, Massachusetts
Harold E. Clark
Chief Physicist and Scientific Director
Xerox Corporation
Rochester, New York
Melvin S. Day
Director, Office of Scientific and Technical Information
National Aeronautics & Space Administration
Washington, D.C.
James O. Dyal
Senior Engineering Specialist
Xerox Corporation
Rochester, New York
Herman H. Fussler
Director, The University Library
Professor, The Graduate Library School
University of Chicago
Chicago, Illinois
Richard L. Garwin
Director of Applied Research
International Business Machines Corporation
Yorktown Heights, New York
Frank E. Heart
Associate Group Leader
Lincoln Laboratory, MIT
Lexington, Massachusetts
Herman H. Henkle
Executive Director and Librarian
The John Crerar Library
Chicago, Illinois
Myer M. Kessler
Associate Director of Libraries
Massachusetts Institute of Technology
Cambridge, Massachusetts
J.C.R. Licklider
Consultant to the Director of Research
International Business Machines Corporation
Yorktown Heights, New York
William N. Locke
Director of Libraries
Massachusetts Institute of Technology
Cambridge, Massachusetts
Stephen A. McCarthy
Director of Libraries
Cornell University
Ithaca, New York
Max V. Mathews
Director, Behavioral Research Laboratory
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
George A. Miller
Professor of Psychology
Harvard University
Cambridge, Massachusetts
Foster E. Mohrhardt
Director, National Agricultural Library
Washington, D.C.
Philip M. Morse
Professor of Physics
Director, Operations Research Center; Director,
Computation Center
Massachusetts Institute of Technology
Cambridge, Massachusetts
Carl F. J. Overhage
Professor of Engineering; Director, Project Intrex
Massachusetts Institute of Technology
Cambridge, Massachusetts
Eric S. Proskauer
Vice-President and Manager
Interscience Publishers, a Division of John Wiley &
Sons, Inc.
New York, New York
Richard C. Raymond
Consultant-Information, Advanced Technology Services
General Electric Company
New York, New York
Arthur L. Samuel
Consultant to the Director of Research
International Business Machines Corporation
Yorktown Heights, New York
Peter R. Scott
Head, Microreproduction Laboratory
Massachusetts Institute of Technology
Cambridge, Massachusetts
John L. Simonds
Head, Information Technology Laboratory
Eastman Kodak Company
Rochester, New York
Charles H. Stevens
Staff Member, Project Intrex
Massachusetts Institute of Technology
Cambridge, Massachusetts
Don R. Swanson
Dean, The Graduate Library School
University of Chicago
Chicago, Illinois
Joseph Weizenbaum
Associate Professor of Electrical Engineering
Massachusetts Institute of Technology
Cambridge, Massachusetts
Gordon Williams
Director, The Center for Research Libraries
Chicago, Illinois
Victor H. Yngve
Professor of Linguistics and Library Science
University of Chicago
Chicago, Illinois
VISITORS
Scott Adams
Deputy Director, National Library of Medicine
Bethesda, Maryland
Burton W. Adkinson
Head, Scientific Information Service
National Science Foundation
Washington, D.C.
Henry Aiken
Professor of Philosophy
Brandeis University
Waltham, Massachusetts
Samuel Alexander
Chief, Data Processing Systems Division
National Bureau of Standards
Washington, D. C.
W.O. Baker
Vice-President, Research
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Curtis G. Benjamin
Chairman, Management Board
McGraw-Hill Book Company
New York, New York
Joseph L. Boon
Technical Assistant to the General Manager
Apparatus and Optical Division
Eastman Kodak Company
Rochester, New York
Paul Brobst
Member of the Technical Staff
Xerox Corporation
Rochester, New York
W. Stanley Brown
Member of the Technical Staff
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Douglas W. Bryant
University Librarian
Harvard University
Cambridge, Massachusetts
Vannevar Bush
Honorary Chairman and Life Member of the Corporation
Massachusetts Institute of Technology
Cambridge, Massachusetts
Verner W. Clapp
President
Council on Library Resources, Inc.
Washington, D. C.
Norman Cottrell
Director of Documentation Services
American Society of Metals
Cleveland, Ohio
John W. Emling
Executive Director, Transmission Engineering
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Robert Fano
Professor of Electrical Engineering
Massachusetts Institute of Technology
Cambridge, Massachusetts
Merrill M. Flood
Professor and Senior Research Mathematician
University of Michigan Medical School
Ann Arbor, Michigan
Steven Furth
Industry Development Manager in Information Retrieval
International Business Machines Corporation
White Plains, New York
Leon D. Harmon
Member of the Technical Staff
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Karl F. Heumann
Director, Office of Documentation
National Academy of Sciences
Washington, D.C.
Eugene Jackson
Corporate Director of Libraries
International Business Machines Corporation
Armonk, New York
Mark Kac
Professor of Mathematics
The Rockefeller Institute
New York, New York
Robert A. Kennedy
Head, Library Systems Department
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Gilbert King
Member, Board of Directors; and Research Consultant
Itek Corporation
Lexington, Massachusetts
Russell A. Kirsch
Electronic Scientist
National Bureau of Standards
Washington, D.C.
William T. Knox
Technical Assistant to the Director
Executive Office of the President, Office of Science
and Technology
Washington, D.C.
W. Kenneth Lowry
Manager, Technical Information Libraries
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Lee E. McMahon
Member of the Technical Staff
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Marvin L. Minsky
Professor of Electrical Engineering
Massachusetts Institute of Technology
Cambridge, Massachusetts
Richard Oldham
Assistant Chief Engineer
Station WGBH
Boston, Massachusetts
Ascher Opler
Vice-President
Computer Usage Company
New York, New York
Harald Ostvold
Director of Libraries
California Institute of Technology
Pasadena, California
John R. Pierce
Executive Director, Research-Communications Principles
and Systems Research Divisions
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
Ithiel de Sola Pool
Professor of Political Science
Massachusetts Institute of Technology
Cambridge, Massachusetts
Nathaniel Rochester
Manager, Boston Programming Center
International Business Machines Corporation
Boston, Massachusetts
Jesse H. Shera
Dean, School of Library Science
Western Reserve University
Cleveland, Ohio
Samuel S. Snyder
Information Systems Specialist
Library of Congress
Washington, D.C.
Fred A. Tate
Associate Director, Chemical Abstracts
Columbus, Ohio
John W. Tukey
Professor of Mathematics
Princeton University
Princeton, New Jersey
Claude Walston
Member of the Technical Staff, Federal Systems Division
International Business Machines Corporation
Bethesda, Maryland
I. A. Warheit
Manager, Projects in Information Retrieval
and Very Large Storage Systems
International Business Machines Corporation
San Jose, California
F. Karl Willenbrock
Associate Dean of Engineering and Applied Physics
Harvard University
Cambridge, Massachusetts
SUMMARY
3. Selectivity in the Intrex Program
library operations; in selective dissemination; in some limited
forms of browsing; and in recording user interaction
with the system.
9. The Fact Retrieval Experiment
CHAPTER I
Today we do not know how to specify the exact nature and scope
of future information transfer services. We believe that their
design must be derived from experimentation in a working
environment of students, faculty, and research staff. A
uniquely favorable situation will exist at MIT for fruitful experimentation
in this field. There are library users, in all
academic categories, who are accustomed to the experimental
approach and who will cooperate in meaningful tests of new
services. In Project MAC, MIT is already carrying forward
a broad study of machine-aided cognition which will greatly
stimulate the rise of new concepts in information transfer.
experimental modifications of conventional library operations.
The rationale of this view is set forth in Chapters V and VI.
Conference were in the technical realm; its principal aim was
to formulate a plan for experiments.
Acknowledgments
to the MIT Press for forgoing the customary editorial reviews,
to permit publication of the report within six weeks of the close
of the Planning Conference.
CHAPTER II
requires exceedingly extensive and highly accessible book,
journal and report collections for its support, for the serious
investigator cannot be satisfied until he has reasonably good
access to the past and present records of his subject field.
Indeed, unless he does have such access, the cumulative,
progressive character of virtually all serious research could
not exist. Programs of graduate instruction also create very
extensive demands, if good students are to be well trained.
Moreover, the rapid pace of scientific and technological research
and development introduces a factor of substantive
obsolescence in many kinds of literature. This reduces the
frequency with which such literature must be consulted. However,
society is unwilling to erase the record of the past, for
historians and other investigators find the serious study of
the past illuminating and valuable in the better understanding
of many aspects of contemporary civilization. Because so
much of the record of contemporary civilization has been
published on wood pulp paper that is subject to rapid deterioration,
research libraries have the added obligation of finding
means of preserving these records. New methods of preserving
old paper and filming or otherwise preserving the
content of crumbling documents are being put into effect while
concurrently specifications for improved papermaking are
being developed.
libraries make them far easier to use than comparable collections
in other countries. Stack access to a large book
collection classified by subject is both a sobering and an enlightening
experience for a reader who encounters it for the
first time, as many foreign visitors testify. Similarly, the
dictionary catalog, with author, title and subject approaches,
is a versatile tool for bibliographic access to the collection.
very rapid rate, and they will have to be maintained and controlled
in a manner that will make them readily available for
current use and manipulation, as well as for historical purposes.
Fortunately, such materials are singularly adaptable
to mechanized handling.
a continuing publication that serves as a guide to manuscript
collections in university and historical libraries.
that produces a full-size paper copy of a microimage or of any
page of any document in the library. The student or scholar
can now build a small private library quickly though not inexpensively.
He can annotate, clip and edit. The result of
this is not only a transformation of the user's work habits;
it means we are on the threshold of a new integrity for the
collections. They can be kept intact so that a volume in use by
any one person does not deprive everyone else of it. In more
and more institutions, journals no longer are allowed to circulate.
Quick copies are better, and in the long run, cheaper.
access to any piece of recorded information — first to that
held locally, then to pertinent information held elsewhere.
The latter may not be quite so quickly available.
CHAPTER III
NATIONAL LIBRARY-INFORMATION SYSTEMS
More than 20 major proposals have been made for national
information systems. Most significant is the plan now being
developed by the Committee on Scientific and Technical Information
(COSATI) in the Office of Science and Technology,
Executive Office of the President. William T. Knox, chairman
of COSATI, has given a general outline of the plan: 5
OPERATING SYSTEMS
script preparation, editing, publishing, abstracting, indexing,
library services and similar functions are all planned and
carried out as a coordinated function.
Standardization is imperative.
A categorization of network systems would include the following
types: Monolithic; Discipline; Mission; and Special. Each will
have a place in the future of library development.
NETWORKS
Types of users (professors, students, engineers,
scientists, research staff, industry users).
Location of users.
General and special needs.
Services required.
Subject fields.
Now imagine that there is also a national facility that contains
a catalog of the holdings of books, reports, and serial titles
in all major libraries. Within the network, one can then call
on a national facility with a citation, and find out where some
copy of the document might exist. Thus, the local user would
press a "find" key on his console with reference to a particular
citation, and would receive an answer that (happily) the
document was indeed available locally. Or a slightly less
happy answer might be that it was available nearby. Then,
as another possibility, the user might press a "local" button
and be given some indication by the local system as to how
and when he can get a copy of the document. Thus, the local
system might say that the document was available in both microfilm
and hard copy; did he wish hard copy of the microfilm
for personal retention? Alternatively he might get an immediate
picture of some portion of the text on the cathode-ray
tube on his console.
Note that this process does not involve at any time the use of
the local catalog. All bibliographical information is stored
in decentralized centers having to do with each specialty; the
document location may be in a separate local or national
store. It is evident that, given a specific citation, one may
then need to proceed to ascertain:
Where is it?
What is the local address or call number?
What is the local availability status or
capability? and, finally, request it.
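The lookup sequence described above (where is it, what is its local address, what is its availability) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog contents, the `Holding` record, the library names, and the `locate` function are all hypothetical and do not come from the report. The ranking simply mirrors the text's "happy" (local) and "slightly less happy" (nearby) answers.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    library: str       # holding library (hypothetical name)
    call_number: str   # local address or call number
    forms: tuple       # available forms, e.g. ("microfilm", "hard copy")


# Hypothetical national catalog: citation -> holdings in major libraries.
NATIONAL_CATALOG = {
    "Smith 1964": [
        Holding("Remote Library", "QA76 .S6", ("hard copy",)),
        Holding("Local Library", "Z699 .S5", ("microfilm", "hard copy")),
    ],
}


def locate(citation, local="Local Library", nearby=("Nearby Library",)):
    """Answer 'Where is it?', listing local holdings first, then nearby ones."""
    def rank(holding):
        if holding.library == local:
            return 0
        if holding.library in nearby:
            return 1
        return 2

    # An unknown citation yields an empty list rather than an error.
    return sorted(NATIONAL_CATALOG.get(citation, []), key=rank)


best = locate("Smith 1964")[0]
print(best.library, best.call_number, best.forms)
```

Note that, as in the text, the sketch never consults a local catalog: the citation goes to the (assumed) national store, and only the final step concerns local availability.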
INTERNATIONAL NETWORKS
in as much detail as possible the exact nature
of investigations being conducted in foreign
institutions, and they must know it as promptly
as possible."
7. U. S. House of Representatives, Select Committee on Government
Research. Study No. IV, "Documentation and Dissemination
of Research and Development Results." 88th
Congress, 2nd session, Washington, D. C., 1964, p. 54.
CHAPTER IV
INTRODUCTION
interaction — is firm enough to support that proposition as a
basis for significant decisions. In short, it is now evident
that much of the creative intellectual process involves moment-
by-moment interplay between heuristic guidance and execution
of procedures, between what men do best and what computers
do best. On the basis of that realization, it seems reasonable
to project to a time when men who work mainly with their brains
and whose products are mainly of information will think and
study and investigate in direct and intimate interaction with extensively
programmed computers and voluminous information
bases.
THE PROJECT MAC EXPERIENCE
"system programmer", but from the behavioral point of view
of a substantively oriented user of the system.
Save this information that I am going to use,
now, so I may recover it if my test goes awry
and destroys the test copy.
Start up program "_" from just the state in
which I left it when last I used it.
Explain command "_" to me in greater detail.
Compile (i.e., prepare for execution) the program
called "_" with the "_" compiler.
Operate the program called "_", using the
data called "_".
called merely "clerical". The "on-line mathematical
assistant" can carry out symbolic integrations about as well
as most graduating seniors can.) Examples of areas in which
services are available to aid the on-line user are:
Preparing and editing text and running off clean copy.
Graphic design of structures and devices.
Modeling or simulation of dynamic and stochastic
processes.
Planning of highways, buildings, water-resource
systems, and the like.
Understanding of systems of mathematical and
logical interrelations and constraints.
Examining and tracing bibliographic information
in the field of physics.
Preparing, testing, revising, and documenting
computer programs.
THE CONCEPT
of the full-fledged, on-line intellectual community of unspecified
future date. The experience has provided a concrete
foundation for the concept and attested to the feasibilities
and values of some of its parts; but the over-all concept
is as much an idealization, based on perception of our wants
and needs, as it is a projection of experience in community-
computer interaction.
is delivered. If the cost of delivery would be great, the cost
is presented, rather than the fact or document itself, and a
negotiation between the user and the system is thus opened.
If the user does not specify the fact or document uniquely,
then a negotiation is opened to refine the retrieval prescription
or to give the user a notion of how many facts or documents
he may receive if the system follows through to meet
his request. The fact-retrieval capability postulated here is
based upon the assumption that the system contains a store
of information organized in a more readily processible form
than that of natural-language text or, alternatively, that great
strides have been made beyond the present level of understanding
the syntactic organization (and, more especially, the semantics)
of natural language. The document-retrieval capability,
on the other hand, is based approximately upon the present
state of the art. Whether one contents himself with document
retrieval, or assumes retrieval of facts from fact stores
organized by men, or goes on to postulate inference from
natural-language text, the conclusion must remain the same:
that retrieval of stored information is the basic service upon
which all the facilities of the on-line system must depend.
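The negotiation rule described in this passage can be put as a small decision function. The following Python sketch is an assumption-laden illustration, not a design from the report: the function name `respond`, the action labels, and the cost threshold are all hypothetical.

```python
def respond(matches, delivery_cost, cost_threshold=1.0):
    """Decide how the system answers a retrieval request.

    matches:        list of items satisfying the user's prescription
    delivery_cost:  function mapping an item to its delivery cost
    cost_threshold: hypothetical cost above which the system quotes
                    the cost instead of delivering

    Returns an (action, payload) pair.
    """
    if len(matches) != 1:
        # Request not unique (or nothing found): open a negotiation and
        # tell the user how many items he would receive.
        return ("refine", len(matches))
    item = matches[0]
    cost = delivery_cost(item)
    if cost > cost_threshold:
        # Delivery would be costly: present the cost, not the item.
        return ("quote", cost)
    return ("deliver", item)


print(respond(["report A"], lambda item: 0.05))
print(respond(["report A", "report B"], lambda item: 0.05))
print(respond(["long film"], lambda item: 12.0))
```

The order of the checks is a design choice of the sketch: uniqueness is settled before cost, so that the user is never quoted a price for a request he has not yet pinned down.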
other non-intellectual aspects of life. For the present purpose,
however, it suffices to illustrate the projected capability
of the system to apply pre-defined procedures to information
contained within its stores.
The fourth and final basic service is control. The user controls
the system, or addresses requests to it, through a few
familiar devices: a keyboard, a pen-like stylus, a microphone,
and a small assortment of buttons and switches.
Through those devices, he can communicate in strings of
alpha-numeric characters, by pointing, by writing clearly,
by sketching or drawing, and by speaking distinctly in a
limited vocabulary. His communication with the system is
carried out, as suggested earlier, in languages that are somewhat
more constrained and formal than the open, natural
language of everyday speech among men.
Built upon the basis of the four services just described are
many derivative services. It will have to suffice to mention
only a few of them, and briefly. There are arrangements,
patterned after the Sketchpad programs developed at the MIT
Lincoln Laboratory, that facilitate the design of structures
and devices. There are programs to facilitate the preparation
and editing of text. There are programs to facilitate the
preparation, editing, testing, modification, and documentation
of computer programs. There are special-purpose languages,
together with facilities for carrying out instructions given in
the languages, for modeling or simulating complex processes.
(The modeled processes may then be set into action and
viewed in operation on the display screens.) There are arrangements
to facilitate communication among members of
the on-line community: arrangements for viewing the same
dynamic model on screens at different consoles, for authorizing
access to otherwise private files, for merging texts and
pooling data, and the like. There are many courses and many
techniques of computer-assisted instruction, including some
that instruct the user in the operation of the on-line system
and in the preparation of computer programs in various procedure-oriented
and problem-oriented languages. Indeed,
there are even programs that will play chess, checkers, and
Kalah, for example, with anyone who challenges them; and
some of the game-playing programs will play at any specified
level of mastery and with any specified style.
oriented users predominate so greatly in sheer number offsets
the greater concentration and, on the whole, greater
skill of the professionals. In many instances, however, it is
difficult to distinguish clearly between the contributions of
the substantively oriented users and the contributions of the
system professionals, for the professionals monitor the contributions
of the users and often modify substantially, and
usually polish, the techniques and programs and the sets of
data that are offered to the public files.
The report is a document, of course, complete with references
and figures. The writer does not have to type out the
references, for they are "in the system"; he merely points
to the ones he wants and to the places in the text to which they
are relevant, and the computer captures them and numbers
(and, when a new reference is added, re-numbers) them.
Some of the figures are dynamic. When the models, for example,
are displayed on the screen, their input signals flow
and the models "behave". Most figures have several forms.
The reader can select detailed, real-time behavior or summary
statistics. In some instances, the information contained
in the figure is displayed from store. In other instances,
a generating function is stored, and the figure is displayed
from concurrent calculation.
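The automatic numbering and re-numbering of references mentioned above can be illustrated with a short Python sketch. It assumes, purely for illustration, that references are numbered in order of first citation in the text; the function name and the reference keys are hypothetical.

```python
def number_references(citation_order):
    """Assign 1-based numbers to references in order of first citation.

    citation_order: reference keys in the order they are pointed to
    in the text (a key may appear more than once).
    """
    numbers = {}
    for key in citation_order:
        if key not in numbers:
            numbers[key] = len(numbers) + 1
    return numbers


draft = ["ref-a", "ref-b", "ref-a"]
print(number_references(draft))

# A new reference cited ahead of the others shifts every later number;
# recomputing from the citation order performs the re-numbering.
revised = ["ref-c"] + draft
print(number_references(revised))
```

Because the numbers are derived from the citation order rather than stored, the writer never maintains them by hand, which is the point the passage makes.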
The report is typed just once, when the author writes the
initial draft. Indeed, in writing the initial draft, he employs
some material that he prepared earlier for use in the experiment
itself. He does not re-type it; he has the computer copy
it, and perhaps he edits it a bit. In any event, a thing is typed
just once and thereafter only modified. Because editing with
the aid of the editing program is quick and easy, and because
the current approach to computer "understanding" of natural
language is more demanding than human readers are for excellence
of style and rigorous adherence to stated conventions,
important articles are revised and re-revised.
the submission records to each copy.) Editors use the network
in their communications with reviewers, and that speeds
up the review process.
The report is not stored all in one place. The title, abstract,
references, etc., are held in a more readily accessible file
than the body. Keyed to the body (and to some of the figures)
are sets of data. The sets of data are stored in a data bank.
Whereas data used to be relegated to vaults — preserved on the
off-chance that some scholarly skeptic might re-examine the
experiment in detail — now that there are means for working
with data effectively, for analyzing or summarizing them in
new ways without going through hours of drudgery, stored
data are as often retrieved and examined as are stored
documents.
telling what things are available and in what forms. Part of
the description is for potential users; it gives them the over-all
picture. Part is for computers; it provides detailed instructions
for operation of models, retrieval of associated
data, and the like.
The discussion between Fry and Rodehafer leads Fry to relearn
some Laplace transform theory. He studies it partly
with the aid of programmed instruction, partly by reading
retrieved texts, and partly by asking questions. At one point,
for example, Fry asks:
THE INFLUENCE OF THE CONCEPT
The other major restraining factor is the primitive state of
understanding of natural language. Men appear to communicate
with one another in natural language rather well, but
even the most sophisticated linguists do not fully understand
the syntax; and there is almost no semantic theory at all that
is capable of supporting engineering applications. It is evident
that, although linguistics is recognized as an important
and challenging field, and although it is the focus of much
activity, the basic problems will not be solved to the point of
engineering applications on the time scale that has been set
for Project Intrex and the operational information network to
which it is to lead. This consideration adds its weight to that
of the text-conversion difficulty in limiting the use of the on-line
intellectual community as a model.
CHAPTER V
A CENTRAL COMPUTER
Four applications which are already clearly distinguishable
are related to (1) the needs of the universities for computational
and data-handling facilities in the conduct of their organizational
businesses, such as payrolls, scheduling classes,
and so forth; (2) the computational needs of the members of
the university community in the pursuit of their intellectual
endeavors; (3) the use of computers for what has now come to
be called computer-aided instruction (CAI); and, finally, (4)
the use of the computer for information retrieval. It is, of
course, this last category that is the proper concern of Project
Intrex, although the storage and retrieval of the text used
for CAI can probably be also considered a part of the Intrex
responsibility. We further note that all four of these applications
can and will make use of computers operating in a
time-sharing mode and that, in principle, they all could share
a single common facility. While this might seem to be a desirable
state of affairs in the interest of economy, it seems highly
unlikely that all four different groups of users could work together
satisfactorily, particularly in the early, developmental
stages.
A large portion of library information will be available in
image microform. This main image store will be accessed
by address only. Microform copies of equal size or of larger
format (for inexpensive viewers) and hard copy will be available
locally within seconds on demand. Although local viewing
of the microform may possibly be done by mechanical transport
of the microform, remote viewing will be possible by high-resolution
facsimile or CRT display.
ACCESS TECHNIQUES
Physical Aspects
Because of the difficulties of handling books of different sizes,
one might think that the leading libraries will have standardized
on a fixed book size both for storage and for distribution.
But this could have been done with profit at any time within
the last 300 years, and it seems that a general law may be at
work, the consequence of which is that it is easier to introduce
a distinctly new system than to modify slightly an old one.
We have little confidence that the actual format of ordinary
books will be very much modified by 1975. Books for storage
purposes could be produced with high-quality materials which
would no longer be subject to the high deterioration rates which
affect our present materials.
Intellectual Aspects
We have talked about access in terms of physical devices.
Now let us turn our attention to the intellectual aspects of
providing access to stored information.
documents, depending on his needs at the moment.
Other users, of course, will want to retain a larger volume
of material. These users will ask for the material to be delivered
to them in a microform package and will then have
locally available enlarging equipment for viewing.
SELECTIVE DISSEMINATION
Many of the difficulties which currently plague selective dissemination
systems will have been obviated by 1975. We can
therefore expect with confidence that there will be at least
two forms of selective dissemination for which the system
must make provision. The simplest form of such a system
will consist of notifications sent to the user whenever there
are new acquisitions that fit his profile. However, we can
predict that, for a limited class of users, the actual documents
will be sent on arrival as soon as the profile match
has been observed. The systems, at least the more advanced
systems, will undoubtedly provide for machine determination
of reader profile based on his initial assessment of his interest,
as modified by his actual use of the library system.
From one really extreme point of view, a request by a user
for a specific document to the library indicates either that
the user’s profile has changed with time or that the system
is not functioning properly.
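The profile mechanism described in this section can be sketched as follows. The Python below is a minimal illustration under assumptions of our own: profiles are taken to be term weights, a match is a summed weight at or above a threshold, and actual use raises the weights of the terms of each document consulted. None of these particulars (nor the names `matches`, `notify`, `record_use`) comes from the report.

```python
def matches(profile, document_terms, threshold=1.0):
    """True if the summed weights of the document's terms reach the threshold."""
    return sum(profile.get(term, 0.0) for term in document_terms) >= threshold


def notify(profiles, acquisition_terms, threshold=1.0):
    """Users whose profiles fit a new acquisition (the simplest form of service)."""
    return [user for user, profile in profiles.items()
            if matches(profile, acquisition_terms, threshold)]


def record_use(profile, document_terms, step=0.25):
    """Modify a profile by actual use: strengthen the terms of a consulted document."""
    for term in document_terms:
        profile[term] = profile.get(term, 0.0) + step


# Initial self-assessed profiles (hypothetical users and terms).
profiles = {
    "user_a": {"laplace": 1.0},
    "user_b": {"optics": 0.5},
}
print(notify(profiles, {"laplace", "transform"}))

# user_b twice requests documents on optics; the profile adapts, and
# later acquisitions on optics now produce a notification.
record_use(profiles["user_b"], {"optics"})
record_use(profiles["user_b"], {"optics"})
print(notify(profiles, {"optics"}))
```

In this reading, an explicit request for a document is exactly the signal the last sentence of the section describes: it shows that the stored profile lagged behind the user's interest, and the update step corrects it.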
in data-retrieval systems rather than in document-retrieval
systems, and be, therefore, available to the library of the
future in this form.
THE INFORMATION TRANSFER BUDGET OF 1975
CHAPTER VI
PROJECT INTREX
There was a consensus at the Planning Conference that the
proper evolution of libraries had such a great involvement
with machine-processing methods and with machine-searching
methods that the over-all system would be very much
concerned with time-shared, on-line computers. It was thus
decided that Project Intrex should apply the same technology
to the solution of the library problem and to the development
of a real-time, on-line community within the universities and
similar establishments so that a single set of terminals would
presumably serve the user both as access to the modern library
and as his input to the time-shared system for computing,
editing, and other functions.
LIBRARY MODERNIZATION
the microimages. Toward the end of the period under consideration,
there should be a substantial fraction of manuscripts
generated directly in coded form, so that they may be
stored much more compactly than if they were set into type
and photographed. For that part of the information that is
stored in microform, the user will have the choice of obtaining
hard-copy output after a few seconds' delay and at a cost
of a few cents per page, or of receiving a microimage suitable
for reading at his home or in his office on a microfilm
reader.
of logic is decreasing very rapidly, and when new techniques
in digital transmission as well as physical mechanisms of
transmission (communication satellites) are becoming common,
it is hazardous to predict the exact nature of a system for use
10-15 years hence. On the other hand, neither the video
transmission of scanned images nor the reconstruction at the
terminal of coded images is obviously uneconomical; and,
evidently, in an experimental program both of them will have
to be considered. This is fortunate, because we will have to
start with library materials stored as images, and yet we
want to work toward a system in which the text itself is
machine-processible and thus susceptible of economical
storage and transmission.
between are most of the experiments typically conducted in
large research and development efforts. Intrex ought really
to work near both ends of this spectrum. On the one hand,
specific new technologies ought to be applied to the problems
of libraries and information transfer. On the other, compet¬
ing techniques and ideas for solving these problems should be
examined and compared. The suggested experimental pro¬
gram includes both types of effort.
Core Program
For storage: — print on paper, analog microimages
on photographic materials, analog signals on mag¬
netic or possibly thermoplastic materials, digitally
encoded characters and graphic elements on photo¬
graphic or magnetic materials;
For delivery: — transportation for some of the fore¬
going, electrical transmission for others;
For display: — direct inspection, xerography, optical
projection, oscilloscopic display, and the like.
Supporting Activities
Browsing; planned facilities to foster unplanned
discovery.
Selective dissemination of information.
Use of the on-line network to expedite preparing,
reviewing, printing, indexing and abstracting of
manuscripts.
"Publishing" through the system to the on-line
community.
CHAPTER VII
The choice of the field or fields for this model will be crucial.
It is clear that the whole MIT library is simply too large as a
first step. If the funding and scope of the project permit, an
upper bound on the sensible first step might be a direct attack
on the MIT Engineering Library. Barring that, Intrex should
pick one or a small number of subject fields for individual,
experimental systems treatment.
availability of a time-shared computer facility and many
users, a large number of experiments could be attempted.
Powerful, machine bibliographic-search techniques could be
implemented. Machine fact retrieval could be tried. Many
different forms of access to actual documents could be inves¬
tigated. "Telebrowsing" could be investigated in the context
of the machine-readable catalog. Experiments in selective
dissemination of materials could be based upon the machine-
readable catalog. The possible uses of this machine-available
body of bibliographic material and data for educational purposes
and class use could be seriously investigated. Experiments
in different techniques for publication would be appropriate.
The more mechanistic functions of everyday, conventional
library life could be approached from a new point of view;
here we would include circulation, acquisitions, serials con¬
trol, etc. Thus, the model library system would be at once
centralized and distributed. There would be a central store
of documents, microfilm, and digital data, and there would
be a distributed time-shared computer net and distributed
users with access to that body of data. Many different aspects
of information transfer could be investigated and evaluated in
the context of this model facility.
THE AUGMENTED CATALOG
Time boundaries of the collection and its catalog should be
determined by the need to cover as much retrospective liter¬
ature as is required to meet the demands of the serious
scholar.
The Environment
Third, of the recorded information available on any one
subject, only a percentage ever gets under standard biblio¬
graphical control in catalogs, abstracting services, indexing
services, lists of citations, etc. All these devices must be
used in a comprehensive search, even though the results
will be incomplete; and a comprehensive search must go
considerably beyond them. And yet the editorial and intellec¬
tual efforts of assembling even these systematic devices are
extremely complex, costly, and time-consuming, requiring
specialist staffs. They all use different techniques and
different access terminologies, and are structured differently
with respect to subject arrangements.
mation centers. The excellence of any one library
is based, today, as much on its ability to provide
good switching facilities — to tap other libraries —
as on its ability to provide a substantial part of the
corpus of knowledge of any one subject.
General Requirements
When the criteria for search are complex, the system must
be open-ended yet controlled, one that efficiently brings to
light the relevant (while suppressing what is not germane)
from a heterogeneous store. This is important for any use
of the catalog, whether a search is for a specific document
or for unspecified documents relevant to a specific search
interest. In this connection, decisions must be made on the
degree to which input will be limited to the author’s own words
in titles or abstracts, and the degree to which the author’s
terminology will be translated into or supplemented by a con¬
trolled vocabulary.
author, wrong title, etc., with the portion of correct vs in¬
correct information being highly variable); and 3) a reference
that may be correct, but that does not match the data used by
the library to describe the document.
Input
the collection — far more than has been put into the traditional
card catalog. The inclusion of any new or old element in the
catalog entry will depend, first, on the thought that it should
be there and, second, on its continued usefulness in service.
This open structure provides for the addition of new informa¬
tion and the excision of what is found useless. To begin, we
think the catalog entry should include:
It should be evident that not all the items that will be found in
the completed catalog will be incorporated from the first.
Indeed some, such as users’ comments and sources of reviews
of current items, could not be included until they become
available. Therefore, an evolving entry is expected, with
some items (such as the basic author, title, place and date)
being entered early in the process and other data being both
added and deleted during the life of the catalog.
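The evolving entry described above can be pictured as a record with a small fixed core and an open set of optional elements. The following is only a minimal sketch under that reading; the class name, field names, and sample values are hypothetical and do not come from the report.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One catalog record: a fixed core plus an open set of optional elements."""
    author: str
    title: str
    place: str
    date: str
    # Optional elements (hypothetical names such as "user_comments"),
    # added or removed during the life of the catalog.
    extras: dict = field(default_factory=dict)

    def add_element(self, name, value):
        """Attach a new value under an optional element, creating it if needed."""
        self.extras.setdefault(name, []).append(value)

    def drop_element(self, name):
        """Excise an element that has proved useless in service."""
        self.extras.pop(name, None)


# Sample record with placeholder values.
entry = CatalogEntry("Sample Author", "Sample Title", "Sample Place", "1965")
entry.add_element("user_comments", "Useful overview.")
entry.add_element("review_sources", "placeholder review citation")
entry.drop_element("review_sources")
```

Keeping the core fields separate from the open-ended ones is one design choice among several; it simply makes the early-entered items mandatory while leaving everything else free to come and go.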
In the initial development of the catalog, current materials
will need to be supplemented by conversion of records of
material already in the MIT libraries or in other files, in
order to assure, as soon as possible, a file of information
adequate for effective reader use and project experimentation.
File Organization and Procedures
slower drum machine. It appears that this level of the file
structure is the one in which it will be found possible to es¬
tablish the majority of the simpler bibliographic control in¬
formation and to provide the high degree of connectivity, or
cross-reference linkage, which the machine system makes
feasible. In the language of the programmer, the file at this
level may be a list-structured or perhaps a ring-structured
file. This file can have a number of interconnected and over¬
lapping classification schemes expressed in its structure.
At each intersection or terminal point in the file structure
will be a list of addresses in the larger file at which complete
data concerning any cataloged document may be found. There
will be sufficient amounts of the normal kinds of bibliographic
data in the disc file to permit the implementation of the
simpler search strategies and the collection of experimental
data on the more normal uses of the system.
In a file somewhat larger and slower than the disc file, then,
will be a numerically arranged file of the master records
pertaining to all the cataloged documents in the collection.
These records may be formatted to contain places for all the
kinds of information that may be desired in the experiment,
including places for user annotations and comments.
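The two-level arrangement described above (an index whose terminal points hold lists of addresses, and a numerically arranged master file holding the full records) can be sketched as follows. This is an illustration only: the dictionary layout, the addresses, and the sample records are assumptions, and a real list- or ring-structured file would be more elaborate.

```python
# Master file: numerically arranged records, keyed by record address.
# All titles, authors, and addresses are placeholders.
master_file = {
    1001: {"title": "Sample Title A", "author": "Author A", "annotations": []},
    1002: {"title": "Sample Title B", "author": "Author B", "annotations": []},
    1003: {"title": "Sample Title C", "author": "Author A", "annotations": []},
}

# Index file: each terminal point holds a list of addresses in the master file.
index_file = {
    ("author", "Author A"): [1001, 1003],
    ("author", "Author B"): [1002],
    ("subject", "microforms"): [1002, 1003],
}


def lookup(kind, term):
    """Follow index entry -> address list -> master records."""
    return [master_file[addr] for addr in index_file.get((kind, term), [])]


def lookup_all(*keys):
    """Intersect address lists for a simple conjunctive search."""
    address_sets = [set(index_file.get(key, [])) for key in keys]
    common = set.intersection(*address_sets) if address_sets else set()
    return sorted(common)
```

A simple search touches only the small index; the larger, slower master file is consulted only for the addresses that survive, which is the point of splitting the file across storage levels.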
A word is in order on the subject of file protection. The fact
that the system normally loads the computer memory from
data stored in larger and slower files means that a failure
here is not disastrous. Similarly, the drum memory is
normally loaded from other system files, so that a failure in
the drum system will seldom do much permanent damage to
the file structure. Perhaps the most crucial memory of the
structure suggested here is the disc storage. Procedures by
which the disc storage is periodically copied into the larger
and slower data-cell or tape storage, so that the catalog can
never be destroyed completely by a minor accident, are
probably to be desired. At the level of the largest and slowest
file machine, it may again be desirable to copy its contents
from time to time on to magnetic tape, just to avoid the neces¬
sity of creating the new, machinable input data.
Search Considerations
In any of the hierarchically organized index classes, a pro¬
vision would be made for the user to indicate the level of
abstraction (depth in the hierarchy) of the desired search,
and for the machine to inform the user of the hierarchical
structure that he is entering.
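As a rough sketch of this provision, the user could name a starting term and a depth, and the machine could report the part of the hierarchy being entered. The subject hierarchy and function names below are hypothetical and serve only to illustrate a depth-limited walk.

```python
# Hypothetical subject hierarchy: each term maps to its narrower terms.
hierarchy = {
    "engineering": ["materials", "electronics"],
    "materials": ["metals", "polymers"],
    "electronics": [],
    "metals": [],
    "polymers": [],
}


def terms_to_depth(root, max_depth):
    """Return (term, depth) pairs from root down to max_depth levels below it."""
    found = []
    stack = [(root, 0)]
    while stack:
        term, depth = stack.pop()
        found.append((term, depth))
        if depth < max_depth:
            # Push in reverse so narrower terms come out in their listed order.
            for child in reversed(hierarchy.get(term, [])):
                stack.append((child, depth + 1))
    return found


def describe(root, max_depth):
    """Indented outline telling the user which structure is being entered."""
    return "\n".join("  " * depth + term for term, depth in terms_to_depth(root, max_depth))
```

Calling `describe("engineering", 2)` yields an indented outline of the five sample terms; a smaller depth returns a correspondingly shallower view.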
Equipment
can be achieved through a MAC-like system and this is what
is proposed for the model library. It may also be desirable
to integrate local computational capacity with the catalog
memory; such a step implies use of the time-shared system
primarily for the connection of users to the catalog.
Questions for Investigation
What is the most efficient search strategy?
What is the most efficient file order and
organization?
Procedures:
Computer programs capable of assembling data on all user
actions at the consoles will be incorporated so that more
elaborate questions can be asked and answered. Samples of
such questions are: What will happen to the habits of the
users of the system? Will they, in fact, be able (and want)
to make quite different uses of the system than is typical of
their current use of libraries? Beyond this there are whole
classes of small sub-experiments which are of interest. Does
the annotation furnish an important portion of the catalog?
Do the users like to see the reviews? The abstracts? The
tables of contents? Or do they simply not use these facilities?
Is the citation index the most often-used portion of the system,
or do people prefer to employ subject headings? Would
expert-generated critical bibliographies furnish an important and
much-used tool? If the chosen field were interdisciplinary in
nature, would this new catalog really make a major change
in the access of one portion of the field to the other?
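Questions of this kind reduce to counting what users actually do at the consoles. A minimal sketch of such a log follows; the class, the feature names, and the user labels are hypothetical, and a real instrument would also record times and search contexts.

```python
from collections import Counter


class ConsoleLog:
    """Records which catalog feature each user invoked at a console."""

    def __init__(self):
        self.events = []  # list of (user, feature) pairs, in order of arrival

    def record(self, user, feature):
        self.events.append((user, feature))

    def feature_counts(self):
        """How often each feature (abstracts, reviews, ...) was consulted."""
        return Counter(feature for _, feature in self.events)

    def users_of(self, feature):
        """The set of users who consulted a given feature at least once."""
        return {user for user, f in self.events if f == feature}


log = ConsoleLog()
log.record("user_1", "abstract")
log.record("user_2", "citation_index")
log.record("user_1", "citation_index")
log.record("user_3", "subject_heading")
```

With such counts in hand, a question like "is the citation index the most often-used portion?" becomes a matter of reading `log.feature_counts().most_common(1)`.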
TEXT ACCESS
Others: (Pamphlets, maps, charts,
clippings, reprints, preprints, photo¬
graphs, data from computer or other
sources, galley proofs, catalogs,
standards, technical reports, etc.) 2500-3500 pieces
In addition to transmission facilities, satellite microform
files are possible and attractive document or catalog sources
in many cases. Also, a fast messenger service from the
library is attractive, particularly as an experiment.
The documents can be arranged on open
stacks, according to some existing classi¬
fication method.
The documents can be arranged in closed
stacks, according to a classification method,
size, or in acquisition-number order.
Microform System
Microform systems have traditionally served two major pur¬
poses: space compaction (with associated ease of access),
and dissemination of document copies. The future uses of
microforms may encompass a more active role in the informa¬
tion transfer process. These uses include files of catalog
data, informative abstracts or extracts, current-awareness
services, microforms as a primary source for transmission
to soft displays and remote hard-copy production, dissemina¬
tion of document images in microform to personal user files,
and a publication medium. Detailed discussions of possible
microform systems and components are given in the
appendices. Specific choices of system components and con¬
figurations await an initial study of the total systems require¬
ments, taking into consideration the diverse nature of the
input material, user requirements and preferences, forms
of microstorage and access, economic constraints, inter¬
facing with adjunct facilities, and needs for dissemination
among the population of users.
The major system components and component functions are
as follows:
document address would specify the appropriate container to
be manually selected. Within the container of film, automatic
search to the specified frame is possible with presently avail¬
able equipment. The semi-automatic system offers flexibility
in that the selected image can be manually placed in any one of
a number of display, transmission, or copying facilities.
either be displayed on a cathode-ray tube or reproduced by
facsimile techniques. Reproduction could be either as micro¬
images or as full-sized text, though the former makes little
sense as it costs almost as much in this case as a full-sized
image. Alternatively, using the accession number, he may
consult a remote microimage store.
Some Experimental Studies Inherent
in Document Access
It should be noted that the experiment offers a complex array
of patterns from which a reader will be free to choose, rather
than simply a dichotomous situation in technology or access
time. Within the range of possibilities provided by the experi¬
mental library, the following confrontations of reader and
document seem of interest.
The reader may request the display
of microform images on a remote CRT
console.
the MIT library. This can be a condition of the experi¬
ment, but one could also provide a special staff for the
aggressive pursuit of materials not locally held. The same
kinds of charges could be used for the support of such services.
The "joys" of this library are the complete absence of losses,
of circulation records, of the pressure upon readers to re¬
turn books, plus the assurance to the reader that the book will
be in the library.
In this test the copies might be full-size hard copies or
microforms; or, for a given period, all copies might be
hard copies; for another period, all copies might be in
microform. Reading devices would be loaned by the
library.
2. INTEGRATION WITH NATIONAL RESOURCES
often do not provide the level of analysis that is needed and
that may eventually be provided by modern computers in com¬
bination with new techniques of analysis.
OBJECTIVES
ASSUMPTIONS
depending upon the field and the nature of his inquiry, the
computer may provide a reference from the locally stored,
general bibliographical data, or refer him to the printed
index-catalog of an information center. Alternatively, the
system may assist the user in determining a set of the
descriptors, suitable for his needs, for a custom search of the
bibliographical data stored at an appropriate information
center. This search specification is first checked by the
computer to make sure that it is a feasible search and
rechecked to make sure that it does not duplicate an analysis
already available in a printed index or some other form. The
search specifications are transmitted in a form suitable for
direct computer input to the appropriate cooperating center.
The reader is then informed when the results of his search
will be available. The results are transmitted back as soon
as available and go directly into the user's file without inter¬
mediate print-out. The user may inspect the results, and if
the national locational information has been automated, may
request a locational search of the bibliography, or a portion
of it, to indicate availability. With this information at hand,
the reader may request the text of those items he desires
from the most suitable source.
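The sequence just described (check feasibility, check for duplication of an existing printed analysis, transmit, then deposit results in the user's file) can be sketched as a pair of small functions. Everything here is assumed for illustration: the function names, the representation of a center's vocabulary as a set of descriptors, and the returned status strings.

```python
def check_request(descriptors, center_vocabulary, printed_analyses):
    """Decide what to do with a custom search specification.

    descriptors:       terms chosen by the user
    center_vocabulary: set of descriptors the information center can search on
    printed_analyses:  set of frozensets, each an analysis already in print
    """
    wanted = frozenset(descriptors)
    # First check: is the search feasible at this center at all?
    if not wanted or not wanted <= center_vocabulary:
        return "infeasible"
    # Second check: does it duplicate an analysis already available?
    if wanted in printed_analyses:
        return "already in printed index"
    return "transmit to center"


def deliver(user_files, user, results):
    """Place returned references directly in the user's file, with no print-out."""
    user_files.setdefault(user, []).extend(results)
    return user_files[user]
```

The two checks are kept in the order the text gives them, so that an infeasible request is never compared against the printed indexes.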
GENERAL COMMENT
have Intrex consoles — though some supplemental consoles
will also be desirable. We propose that the Medlars biblio¬
graphical capability at the National Library of Medicine, and
the NASA information system be used. Medical literature is
of increasing interest to MIT and Medlars covers this vast
field in great depth with a very fast service. The NASA system
also covers a broad field of mission- and subject-related
literature of exceptional interest to MIT. Both have batch-
search capability and both issue extensive bibliographies in
both tape and printed form. These two offer interesting con¬
trasts in types of users, types of literature covered, and
types of searches that are possible. Furthermore the exper¬
iments should help both centers to modify and improve their
services.
OTHER USERS
TELETYPE LINKS
OBSERVATIONAL OBJECTIVES
permits significant alterations in both local and information
center operations without impairing operations or engender¬
ing costly revisions of software or hardware. It is clear
that netting of libraries and information centers on a much
wider and deeper basis than at present will be essential. It
is also clear that many experiments of the type suggested
here must be performed before an adequate engineering
approach to network design will be possible.
3. FACT RETRIEVAL
currently published handbooks; storage of the facts contained
in sections of selected handbook(s); access to these facts,
including sophisticated searches for information; and mainte¬
nance of an automated notebook or data bank, continually up¬
dated with current experimental information, which can be used
for a multiplicity of purposes or users. We assume as a basic
facility for these projects a large, general-purpose, time-
shared computer system such as that in use at Project MAC;
such a system would undoubtedly be at the heart of the model
system. Attached to the system will be a variety of consoles
to facilitate delivery and display of requested information.
it here, but will probably be a necessary adjunct to all the man-
computer interactions in the model system. It is imperative
therefore that the design of interaction languages be an early
goal of the Intrex project. In the interim, before the ultimate
interactive system exists, one command, "HELP", should
connect the user to a human reference librarian.
(2) Versatile Organization. At present, one can conveniently
answer questions only about combinations of variables chosen
for printing by the original publisher. In an automated hand¬
book, it would be quite reasonable to ask, say, for a list of
melting points of metals with shear strength above a certain
level and density between two limits. This kind of question
can be answered from current published handbooks, but only
after arduous labor by the user.
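The kind of question mentioned here amounts to filtering on two properties and reporting a third. The sketch below shows the idea on a tiny table; the material names and all numerical values are illustrative placeholders, not handbook data, and the units are left unspecified on purpose.

```python
# Placeholder records; none of these numbers are real handbook values.
metals = [
    {"name": "metal A", "melting_point": 650.0, "shear_strength": 30.0, "density": 2.5},
    {"name": "metal B", "melting_point": 1100.0, "shear_strength": 45.0, "density": 9.0},
    {"name": "metal C", "melting_point": 1500.0, "shear_strength": 80.0, "density": 8.0},
    {"name": "metal D", "melting_point": 3400.0, "shear_strength": 120.0, "density": 19.0},
]


def melting_points(records, min_shear, density_low, density_high):
    """List (name, melting point) for metals whose shear strength exceeds
    min_shear and whose density lies between the two limits (inclusive)."""
    return [
        (r["name"], r["melting_point"])
        for r in records
        if r["shear_strength"] > min_shear
        and density_low <= r["density"] <= density_high
    ]


result = melting_points(metals, min_shear=40.0, density_low=7.0, density_high=10.0)
```

Because the selection criteria are arguments rather than a fixed page layout, the same store answers any combination of variables, which is the "versatile organization" the text has in mind.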
Handbook of Chemistry and Physics would occupy approximate¬
ly 10^bits of storage. This does not include the necessary
deep indexing discussed earlier. However, by matching the
storage structure to the information to be encoded, radical
reduction in storage requirements can be achieved, concurrent¬
ly with reduction in retrieval time. For example, the 20 pages
of tables of square roots, cube roots, etc., can be replaced
by a very short (and fast) computer program to compute re¬
quested values. Similarly, many pages of tables of experimen¬
tal data can often be compressed to a single (perhaps too com¬
plex for human use) empirical equation.
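The root-table example can be made concrete in a few lines. This is only a sketch of the idea that a short program can stand in for pages of tabulated values; the function name is an assumption.

```python
import math


def roots(x):
    """Square root and cube root of a non-negative number, computed on request
    instead of being looked up in a printed table."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.sqrt(x), x ** (1.0 / 3.0)


square_root, cube_root = roots(64)
```

The stored program occupies a fixed, small amount of space regardless of how many arguments a user may ask about, whereas a table grows with every entry.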
To be effective, a data bank must make it possible for users
to interact with it without having precise knowledge of the
organizational principles that determined the form of storage
of any particular subject of shared data. Interaction here
means both depositing information and its retrieval and amend¬
ment. In particular, a user must be able to retrieve data
organized in one way and have it presented to him in quite
another way. Thus the data bank will contain not only data in
the ordinary sense, but programs to manipulate and reshape
the data. It is obvious that development of this data bank will
necessitate an attack on many problems faced throughout the
model system, and that cooperation would profit everyone
concerned.
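A small illustration of "retrieve data organized one way and have it presented in quite another way": measurements deposited as a flat sequence of observations are reshaped into a table keyed by sample. The function name, field names, and values are hypothetical.

```python
def pivot(rows, key, column, value):
    """Reshape long-form rows into a nested table {key: {column: value}}."""
    table = {}
    for row in rows:
        table.setdefault(row[key], {})[row[column]] = row[value]
    return table


# Observations as they might be deposited, one measurement per entry.
# Names and numbers are placeholders.
observations = [
    {"sample": "S1", "quantity": "length", "reading": 10.0},
    {"sample": "S1", "quantity": "mass", "reading": 2.0},
    {"sample": "S2", "quantity": "length", "reading": 12.0},
]

by_sample = pivot(observations, "sample", "quantity", "reading")
by_quantity = pivot(observations, "quantity", "sample", "reading")
```

The stored form never changes; only the program applied at retrieval time differs, which is why the data bank must hold such programs alongside the data.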
FUTURE RESEARCH
4. INITIAL FACILITIES
therefore look either to MAC or to the time-sharing facilities
provided by the MIT Computation Center for its computational
support. However, MAC may prove an intractable host, for
MAC is itself an experiment and, as such, must maintain its
own freedom to an extent which might make serious, long-
range Intrex planning impossible. Intrex will require a stable
computer service. The MIT Computation Center might pro¬
vide this more stable base. On the other hand, many projects
supported at MAC, such as console hardware and software
development, design of interactive languages, and computer-
aided design systems, may be usefully coordinated with Intrex
efforts. A final decision on the choice of a shared facility
depends on administrative and technical questions beyond
the scope of the Planning Conference.
There may be programs that are written with the sole objective
of learning something from their composition and their exer¬
cise, and with no implied issue of survival. One might, for
example, construct an experimental catalog, its associated
search and retrieval mechanisms, and certain data-logging
and tracing machinery (instrumentation) which would be
uneconomically slow in any real operational sense. But the
sure knowledge that this catalog would never be required to
evolve into an operational tool would allow significant flexi¬
bility, contributing to the range of experiments that might be
carried out with it. Simulators and computer models could
also be abandoned once they had yielded their results.
The first two years will yield considerable insight into the
ultimate configuration of the model library. It is certainly
to be hoped that the adequacy of typewriters, Touch-Tone
telephones, and more sophisticated consoles will have been
tested during this initial period, and that over-all system
design criteria related to these instruments (their placement,
numbers, etc.) will have emerged. A central question will
be how to couple these and other equipments (e.g., the free¬
standing computer system mentioned above) to the opera¬
tional system which will then exist. Granting that Intrex
must have access to the computer facility's machinery
as a right and that such right must be purchased, the idea
emerges that Intrex contribute an additional processor
and disc storage unit (with required input-output controls)
to the computer system. This would enhance the computa¬
tional power of the facility in an over-all sense, thus bene¬
fiting not only Intrex but the entire MIT computing community.
(It might even prove possible occasionally to decouple the
Intrex-contributed subsystem from the over-all system and to run it
independently; whether this proves feasible or desirable cannot
be determined at this stage of the planning.)
storage supply. The over-all Intrex facility will probably
have to be augmented by a free-standing computer with its
own storage, as well as by sets of specially designed
consoles. However, such instruments ought, so far as possible,
to be capable of being coupled to then-existing or contemplated
MAC or Computation Center facilities. Participation on the
part of Intrex in a shared facility ought to be purchased as a
contractual right, preferably through Intrex's contribution
of equipment and also by means of direct financial support.
An advantage of such an arrangement is that Intrex gains
immediate access to the whole time-sharing apparatus which
now exists, including the current network of typewriter con¬
soles. A long-term benefit to Intrex is that Intrex manage¬
ment is freed from the burden of operating a computer cen¬
ter. The community of users of this shared facility will
benefit by the availability for certain purposes of very large
digital and image stores and by the console development
efforts of Intrex.
SOFTWARE
hardware such as microform readers. The analogy breaks
down, however, in that certain functionally correct elements
of operational hardware can be purchased as off-the-shelf
items immediately. Instruments so acquired often come with
manufacturer’s warranties, operating instructions, etc. How¬
ever, Intrex software must be designed, compared, manufac¬
tured, debugged, documented, and ultimately maintained by
Intrex itself. Intrex management is consequently faced with
the task of "bootstrapping" from the existing software facility
to a more appropriate software facility. Fortunately, this
sort of problem has been faced many times within the com¬
puter community. The problem will be eased for the period
of Intrex by the availability of high-level languages in which
much of the Intrex programming can be written as macro-
instructions.
STORAGE
Naturally, the Intrex experiments need not deal with the whole
of recorded knowledge, and it seems reasonable to use as a
data base in a restricted field about 5000 books and 100
journals, the entire contents of each of which may since its
inception approximate 2500 pages of text. Thus, at 300 pages per
book we will have about 1.5 x 10^6 pages of books and 3 x 10b
pages of journals to put into the experimental store.
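The page counts follow directly from multiplying the quoted figures. The short check below uses only the numbers stated in the paragraph above, taken as printed; the variable names are assumptions.

```python
BOOKS = 5000
PAGES_PER_BOOK = 300
JOURNALS = 100
PAGES_PER_JOURNAL = 2500  # whole run of one journal, as quoted in the text

book_pages = BOOKS * PAGES_PER_BOOK           # pages contributed by books
journal_pages = JOURNALS * PAGES_PER_JOURNAL  # pages contributed by journals
total_pages = book_pages + journal_pages      # size of the experimental store
```

On these figures the books dominate the store: 1,500,000 pages against 250,000 for the journals, for a total of 1,750,000 pages.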
access time and 10^6 bits per second data rate ($36,000/year).
It is thus clearly possible to store the information for the aug¬
mented catalog in digital-coded form. The access time for
each of the units seems tolerable as well, for as many as 10
to 30 users conducting active searches. It appears desirable
to put the journal-related augmented catalog data largely into
the disc store and the data dealing with books largely into the
data-cell system. At 2.5 x 10^6 bits per second, it will take,
for example, about 20 seconds to search all titles of articles
or books in the store.
store to a video scanner which will allow the flexible electri¬
cal transmission of the image to consoles nearby or across
the campus. It may be necessary to use developmental hard¬
ware for this application, but it may be desirable to do the
video transmission via a temporary storage device, which
may be a magnetic disc, etc., in order to allow the serving
of many customers with a single scanning station, without
holding a page in the scanner beyond the time required for
a single scan.
is required over a 5-Mcps line, and this bandwidth is neces¬
sary to gain experience with the information transfer rates
that will be obtained with character-coded material
transmitted over narrower-band lines in 1975. Naturally, signals
from the light-pen or the buttons can be sent back to the com¬
puter in a narrow-band form on top of the video transmission,
if desired (or during the retrace time, for instance).
PERMANENT COPY
the drive to the CRT display. This competes, of course,
with physical delivery of similar copies from the central
computer store, but may be a very desirable component
of the system.
SUMMARY
5. RELATED STUDIES: EXTENSIONS AND ELABORATIONS
look this possibility directly in the eye and to ask what experi¬
mental results might contribute to the wise planning of the
future library's opportunities and responsibilities in this field.
providing the texts, the reserve books, the special readings,
etc. Third, and in many respects most significant of all, the
library should foster in its users the desire to continue their
education beyond the classroom, perhaps even beyond gradua¬
tion, by self-directed explorations of the library's collections.
Not only should Intrex try to reduce the difficulties of the new
users; some attention should also be given to educating them
in the way the system works and the resources of documentation
that it draws upon. Instruction in information retrieval at MIT
is presently limited to publication of an occasional pamphlet
and to direct efforts of the front-counter librarian and the
reference staff. In an Intrex-designed library, however, the
search and retrieval functions will be automated and quite rapid.
A danger exists that students will emerge knowing how to use
the system, but not understanding how it works — much as the
driver of a Ford today probably knows far less about his ma¬
chine than did the driver of a Model T. This situation would
be acceptable if the student could be assured of access to an
Intrex-type library after graduation, but there is a real likeli¬
hood, for some years, at least, that he will go out into a world
of Model T libraries that he may be poorly prepared to use.
Classroom Back-Up
to avoid expensive complications in the display equipment. A
laboratory course might conform to these specifications, since
the laboratory apparatus itself provides the graphic information.
The information would then be made available to the students,
who could elect to study it in any of three alternative forms.
A student could, if he wished, ask questions and get detailed
answers, or he could use the system for document retrieval,
or he could ignore the system completely and use the more
conventional option. One advantage of this mode of experi¬
mentation, of course, is that the system itself can keep records
of the student's actual behavior — records which probably
should not be made available to the instructor — and these
could be related to performance in the course.
United States, and it is not obvious that MIT should be the place
where one tries to experiment with the education of young men
by turning them loose in an information transfer system to follow
their own interests and desires. Nevertheless, the ease with
which information will be obtainable in the future system should
encourage expeditions into the unknown that would be quite
impractical under the present system. Only a little effort would
be required to implement experimental studies on how best to
stimulate such unassigned adventures by the students.
SELECTIVE DISSEMINATION
A library of the type envisioned here can take a much more active
role in providing information to its clientele than most libraries
have in the past. The availability of a computer system makes
it possible to keep profiles of each user's interests and to
furnish documents to him even before he requests them. This kind
of active library service, aimed at supporting the user's current
awareness of developments in his field, is usually called
Selective Dissemination of Information (SDI).
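One simple way to picture such a service is a weighted-term profile per user, matched against the index terms of each arriving document and adjusted by the user's responses. The sketch below assumes that representation; the scoring rule, threshold, step size, and all names are illustrative choices, not a description of any particular SDI system.

```python
def match_score(profile, doc_terms):
    """Sum of the profile weights for the terms a document carries."""
    return sum(weight for term, weight in profile.items() if term in doc_terms)


def disseminate(profiles, doc_terms, threshold=1.0):
    """Users whose profile matches the new document at or above the threshold."""
    return sorted(
        user for user, profile in profiles.items()
        if match_score(profile, doc_terms) >= threshold
    )


def update_profile(profile, doc_terms, liked, step=0.25):
    """Nudge term weights up or down according to the user's response."""
    delta = step if liked else -step
    for term in doc_terms:
        profile[term] = profile.get(term, 0.0) + delta
    return profile


# Hypothetical readers and interest weights.
profiles = {
    "reader_a": {"microforms": 1.0, "consoles": 0.5},
    "reader_b": {"catalogs": 1.0},
}
recipients = disseminate(profiles, {"microforms", "display"})
```

The threshold trades the length of the notification list against the risk of missing something relevant; lowering it sends more documents to more users.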
to documents of potential interest to them. In an industrial
library, this task will normally be performed by the librarian
or other information-service personnel. The information is
disseminated either by routing marked copies of serials or
other documents to appropriate users, or by sending users
selected distribution lists (with or without abstracts), or by
distributing photocopies of individual articles. In one form or
another, these procedures could be easily adapted to one of
the MIT libraries, or to some selected group of users.
recipient, when shown a title and perhaps an abstract, may
decide that the article is of probable future interest, and
should therefore be readily accessible to him, but he may not
choose to read it immediately. Thus there are several cate¬
gories of response that must be anticipated. A possible
questionnaire that the recipient would fill out for each
disseminated article might be something like the following:
Following the initial notification, a recipient should be able to
select articles that he wishes to receive in their entirety, and
the system should be capable of providing a very rapid response
should the recipient demand it. A particularly important vari¬
able in these studies is the length of the initial notification list;
it must represent a compromise between economy of effort on
the part of the receiver and the advantage of giving him a wide
range of choice.
edition of Fowler's Modern English Usage, he speaks of Fowler,
"He knew what he wanted from life; what he wanted was within
his reach; he took it and was content." This is almost the
antithesis of a modern browser: he does not know what he wants;
what he might want has a good chance of not being within his
reach; and he is likely to be at least vaguely discontented
with what he is able to take.
The places for browsing also varied: book stores (these were
favorites); university common rooms; periodical tables in
coffee rooms, in the professor's outer office, dentists' offices,
barber shops, and in more focused arrays such as scanning
lists, indexes, catalogs, tapes; tapping computer output at
random; and finally, of course, the library stacks themselves.
One characteristic of browsing as classically conceived is that
it is the examination of a spatially ordered set of documents
with depth of penetration into the items fully and easily controlled
by the browser. The active control of the depth of
penetration is of central importance in the process.
general concern that more sophisticated methods of search
should not destroy the browsing opportunity for those who
desired to browse, whatever their motives. Of course, a
complete destruction of the privilege of browsing by wandering
physically among the open shelves, wherever they may be,
cannot occur so long as machines coexist with books and
periodicals — which we project to be for a long time to come.
How then could browsing be diminished?
formulate specific experiments on browsing, however, the
absence of any normative data on browsing habits in existing
libraries becomes painfully obvious. This lack is serious,
not merely because we do not know what facilities and opportunities
different groups of users presently expect, but in
particular because we do not have examples of how such
undirected activity should be described, measured or evaluated.
A whole experimental methodology must be developed; it might
involve questionnaires, automatic record keeping, personal
diaries, user choices between alternatives, subjective estimates
of probabilities or costs involved in browsing, or other types
of data. Appropriate and informative measurement techniques
could no doubt be developed, but they do not exist at the
present time. Developing them should be the first stage in
planning a series of experiments on browsing.
If evaluative measures can be devised, several studies might
be worth conducting, e.g., browsing facilities of different types
could be made available and user preferences, costs of discovery,
and virtues or defects of specific systems could be determined
for different areas of specialization and classes of users. The
model collections envisioned for the access experiments could also
be used for browsing experiments. Some librarians have from time
to time assembled books in special "Browsing Rooms" that would
provide a person with an introduction to the pleasures of reading. We
do not believe such recreational reading, separated from regular
curricular and other materials, should form the collection to be
used in these studies of browsing behavior.
the user had complete and immediate control over
his depth of penetration into any item. In a telebrowsing
system, moreover, various actions could
be taken to rearrange the user's personal files —
to make some items more immediately accessible —
as a result of discoveries made during the unplanned
exploration of the public collection.
PUBLISHING
the publishing industry, namely, manuscripts. A variety of
incentives for authorship are recognized and invoked by the
publishing industry. The desire to document and broadly
communicate research results to the professional community
is probably most basic. Closely related, but present in
differing degrees, are the opportunities for professional
recognition and monetary reward gained through publication.
Viewed in this context, copyright protection represents one
particular class of author incentive, based on monetary reward
for creativity. (See, also, the appendix on motivations
of authors.)
of existing documents into the bibliographic and textual data
bases of the system, using optical character-recognition
techniques.
Publication Through a Microform System
Authors will try such a system if it is simple and does not intrude
itself more than the accustomed ways of doing business. Its
competitors will be the duplicating machine and the office photocopier.
In an experimental operation, it would be important to
examine the author's reaction, the user's reaction, the obvious
costs and the hidden costs. One publication, Wildlife Disease,
has been issued only in microform for a number of years. Its
authors have become accustomed to preparing copy in the square
format that uses the film to best advantage. The experiment
proposed here carries this type of publication to the individual
article, report, or book chapter, and seeks to determine acceptability
in a community of scholars.
On-Demand Publication
There have been precedents for such a program (in the "auxiliary
publication" program of the American Documentation Institute
for making available ancillary material omitted from published
journal articles, also in the on-demand supply of Ph.D.
theses and by certain reprint publishers of photographic copies
of out-of-print books); but there has been (so far as is known)
no major instance of on-demand publication as an original
method of publishing. Project Intrex should be able to meet
this condition. For its list, one suggestion is that it might
take the (presently unpublished) progress reports of the numerous
MIT contract research projects of general interest.
Another suggestion is that, by arrangement with the MIT Press,
it convert to machine-readable (digital) form or to machineable
graphic form (e.g., microfilm) a number of MIT publications
of known saleability (leaving a number of similar publications
in conventional form to serve as controls), in order to test the
feasibility and effects of on-demand publication.
SELECTIVE RETENTION
Requests. In a large library, the average document is
requested only rarely — once or twice in many years. Hence,
request data are statistically unreliable over any reasonable
time span. In some subject fields, documents are used much
more frequently, and statistics may be correspondingly better.
Even so, they will probably leave much to be desired. Certainly,
if the Markovian model suggested by Morse is accepted, the
requests recorded most recently would be most pertinent.
using great statistical care, the body of data already collected
might be used to evaluate each of the rules and to see approximately
what percentage of user requests would have been satisfied
had the rules been in effect. Appropriate theoretical
models might simplify these calculations materially. In this
way, many possible rules could be examined before any of
them were applied, and without additional data collection.
Eventually, however, the rules would have to be implemented
(and further data collected) to see if they live up to expectations.
The trial might last about two years, during which time a man
would be employed to apply the rules. The normal library
selection procedures also would be carried out in order to
obtain a comparison between them and the experimentally derived
rules.
* * *
6. R & D TO SUPPORT THE EXPERIMENTAL PROGRAM
CONSOLES
also, however, for other applications of interactive computing,
and the whole task of developing consoles should not rest upon
the project's shoulders. What is essential is that Intrex formulate
the needs for consoles of the information transfer experiments
of 1965-1970 and of the information transfer networks
of the 1970's — and see to it that those needs are met. At the
very least, that will require a small group of people who are
knowledgeable in man-computer interaction and particularly
in information display. At the most, it will require a group
that can adapt hardware, design and construct interfaces, and
perhaps develop one or two critical devices. Hopefully, the
supportive R & D in consoles will not be a large effort; hopefully,
it will be mainly a matter of formulation, of liaison
with other organizations, and of testing. But it will be critical
to the success of Project Intrex.
flat sheet on which the user can write or draw and through
which the computer can determine the position, at every
moment, of the user's stylus. Both the light pen/oscilloscope
combination and the stylus/RAND Tablet combination are at
present quite expensive; the costs are of the order of $10,000.
In the opinion of many people who have had experience using
them, the arrangements are so convenient and, to use a word
that is expressive even though not literally accurate, powerful
that all the consoles of an information-transfer network should
have them. The problem is there, again, either to devise new
and inexpensive arrangements to serve the same function or
to figure out how to decrease the cost of arrangements of the
kind that now exist.
INTERACTION LANGUAGE
and to work toward a coherent family of interaction languages
for operational networks of the 1970’s. In addition, the group
should serve as a condensation point for research in man-
computer communication that will be conducted throughout
the Cambridge area.
time, most communication from men to computers takes place
in languages only slightly more complex than the simplest —
languages in which most of the statements are imperative and
consist of two terms, the first being a verb-like operator and
the second naming the data upon which the operation should
be performed. The statements that are not imperative are
simple declarative statements such as, "Alpha is an integer",
and "Beta is a system variable".
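
A minimal sketch of such a two-term language is given here as an illustration. The parser, its return format, and the operator name used in the example are assumptions introduced for this sketch; only the two declarative sentences come from the text.

```python
def parse_statement(text):
    """Classify a statement as imperative (operator, operand) or declarative.

    Declaratives have the form '<name> is a/an <kind>'; any other statement
    with exactly two terms is treated as an imperative.
    """
    words = text.strip().rstrip(".").split()
    if len(words) >= 4 and words[1].lower() == "is" and words[2].lower() in ("a", "an"):
        return ("declare", words[0], " ".join(words[3:]))
    if len(words) == 2:
        return ("command", words[0].lower(), words[1])
    raise ValueError(f"unrecognized statement: {text!r}")


# 'print' is a hypothetical verb-like operator, used only for illustration.
examples = [
    parse_statement("Alpha is an integer"),
    parse_statement("Beta is a system variable"),
    parse_statement("print Alpha"),
]
```

Anything outside these two forms is rejected, which is roughly the sense in which such languages are "only slightly more complex than the simplest".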
During the last few years, there has been some progress in
getting computers and programs to perform syntactic
analyses — to parse sentences of natural language — and to
respond discriminatingly to substantive terms by taking into
account the kinds of data structure that are associated with
them. Unconstrained natural language, however, is still far
beyond the ken of computers and computer programs, and
most workers in the field of computational linguistics believe
that many years of research will be required to make machines
"understand" natural language in a way that, according to
behavioral criteria, approaches human understanding of
natural language.
desirable, therefore, for the computer to translate from the
terse mode to the full mode when it presents the record
for review.
ANALYSIS OF CONTENT
not to be practicable to achieve the required breadth and depth
of indexing and abstracting through human effort alone, then
Project Intrex should undertake to develop the techniques and
the programs required to support the experiments. It may
be appropriate to go farther than that in connection with the
theoretical work to be discussed later in this section; but we
think the supportive effort in automatic and semi-automatic
analysis of content should be focused sharply on the requirements
for indexes, abstracts, and the like that are posed by
the main experiments.
no dearth of theory in those fields. There is not at the present
time, however, we think it is fair to say, a comprehensive and
basic theory of information transfer. Perhaps such a theory
is now in the process of forming, but it does not yet have
definite structure, and it is not yet ready to support engineering
applications.
7. DATA GATHERING FOR EVALUATION
DATA ON USE
ECONOMIC CONTROLS
The initial data on what services are most popular will probably
need to be combined with data on cost of service, to determine
the charges to be made for various services, if economic
controls are instituted. Thus the use-patterns, obtained by
correlating user records and records of use, will be needed both
before and during any experiment involving economic controls.
DATA ON LEARNING
To carry this idea further, the user's card could be the basis
for making the library's reaction with the user active rather
than passive. Excerpts of the user's use-pattern could be
recorded on his "card", and the system could be programmed
to volunteer further data of the sort he has already shown an
interest in. In addition, the user could have the right to
"annotate" his own "card", to indicate his possible future
interests and disinterests. For example, a computer program
to correlate the new acquisitions as they are entered in the
augmented catalog with the "user's card" declaration of interests
could serve as the beginning for an automated selective
dissemination of information system that did not rely on long
initial interviews to open the activity.
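
The correlation of new acquisitions with a user's declared interests might look something like the following sketch. The `UserCard` class, the term-overlap scoring, and the sample titles are all assumptions made for illustration; the report does not specify a matching method.

```python
from dataclasses import dataclass, field


@dataclass
class UserCard:
    """Hypothetical stand-in for the annotated user's card."""
    name: str
    interests: set = field(default_factory=set)
    disinterests: set = field(default_factory=set)


def match_acquisitions(card, acquisitions):
    """Return titles of new acquisitions to volunteer to the card's owner.

    `acquisitions` is a list of (title, index_terms) pairs. An item is
    volunteered when it shares at least one term with the declared
    interests and none with the declared disinterests. Results are
    ranked by the number of shared terms, then by title.
    """
    scored = []
    for title, terms in acquisitions:
        terms = set(terms)
        if terms & card.disinterests:
            continue
        overlap = len(terms & card.interests)
        if overlap:
            scored.append((overlap, title))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


# Invented sample data, for illustration only.
card = UserCard("reader", {"microfilm", "indexing"}, {"census"})
new_items = [
    ("Viewers for Microforms", ["microfilm", "display"]),
    ("Census Tabulation Methods", ["census", "indexing"]),
    ("Deep Indexing of Microfilm", ["microfilm", "indexing"]),
    ("Polymer Chemistry", ["chemistry"]),
]
notices = match_acquisitions(card, new_items)
```

Because the card is seeded from the user's own annotations and use-pattern, a scheme of this kind would not need a long initial interview to begin operating.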
APPENDICES
[Cartoon; caption text not recoverable. Copyright 1965, Hall Syndicate, Inc.]
APPENDIX A
REMARKS OF VANNEVAR BUSH
TO
PROJECT INTREX PLANNING CONFERENCE
2 August 1965
This being the case, it will move forward well only if intelligent
citizens, in positions of influence, come to realize the
potential benefits that exist. And this will occur only if those
in a position to understand the whole affair speak out, effectively
and often. I am not advocating some sort of promotional
campaign. I am merely trying to point out that there
are two parts of this job: first, the tough technical and professional
task, which is indeed tough; and second, the burden
of carrying the intelligent section of the public along toward
understanding and, if possible, enthusiasm.
methods of every professional group — in law, medicine, the
humanities. It will support every phase of our general culture.
I believe very few scholars today realize what this
could mean. I am sure the general public does not realize, for
example, that success in this program could mean as much to
their well-being, their health, as has been produced by the
power of antibiotics. If and when professional men and scholars
generally grasp what is here involved, there will be no
lack of support if this country maintains its present prosperity.
One last point, and this is merely an idea which I present for
your consideration. I believe that an essential aspect of a
fully successful system is that it should automatically improve
by reason of its use. It is easiest to consider this in the case
of a professional library, perhaps one on the law. If the attorney,
turning to this, and finding the exact reference for his
purposes, found also the comments and criticisms of his colleagues
who had passed that way before, and if he added his
own thought, the value of the library would be continuously
enhanced. I realize that this would have to occur, at first, in
situations where conflict of interest is absent. But I would
hope that it would extend in an atmosphere of professional
interchange and frankness which is characteristically American.
As you labor on this problem you will work, not with the tools
of today, but with those tools as they will be rendered more
powerful by the technical advance which is present and active.
With these refined tools, and adequate support, I am sure
much will be accomplished. I merely wish I were young enough
to participate with you in the fascinating intricacies you will
encounter and bring under your control.
APPENDIX B
INTRODUCTION
the network of time-shared computer facilities, itself, might
exist as a loose confederation of coordinate subsystems. The
plan calls, in accordance with that idea, for setting up the
proposed information system at first within the time-shared
computer facility at MIT, then extending service over telecommunication
lines to individual people or consoles at other
locations, and finally bringing other time-shared computer
facilities into the system, thus creating a true network.
Suppose, now, that the experimenter continues to use the
system as he prepares his manuscript for "publication".
For example, he uses an "editor" program to help him get
his manuscript into good form and style. He uses graph-
handling programs to prepare graphs that have the proper
visual appearance and, at the same time, the correct alpha-numeric
representation within the computer system. (The
computer can reconstruct the graphs from the alpha-numeric
tables.) In short, the experimenter prepares his manuscript
in computer-processible form.
processing, for retrieval, and for study by people. The two
parts of this organization are, needless to say, closely
interrelated.
The final function of the system is to keep records of its use
and to facilitate experimentation. The proposed system is
an experimental system. It is, as the jargon puts it, an
"experimental vehicle". It should, therefore, be more
heavily adorned with facilities for observation and control
of its operation, and arrangements to promote flexibility of
operation, than would be fitting in an "operational" network.
But distinguishing "experimental" from "operational" is not
to suggest that the experimental network should not operate.
printed to computer-processible form. In any event, by
selecting a small, new field, or a small cluster of such fields,
it would be possible to ensure that the system be practically
useful at an early date, and that the cost of carrying it to the
stage of practical usefulness would be small.
at least at first — toward the domain of invisible colleges
rather than toward the domain of existing journals.
EXPERIMENTS
paper, to make those programs available through the network,
and to experiment with them in approximately the same way
as was proposed in the case of text-editing programs.
Representation of text within computer memories. The problem
of representing natural-language text within computer
memories, representing it in such a way as to be economical
in the use of memory space and also to facilitate processing,
is an important, and I think a deep, technical problem. There
are many ideas — coding for compactness, "hashing", list
structures, "trie" structures, etc. — and, I think, still more
to be discovered or invented. The processes of discovery
and invention involve a kind of software experimentation. I
would expect to see much of that kind of experimentation
carried out within the network.
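
As one illustration of the ideas listed above, a "trie" structure stores each shared prefix of a vocabulary only once. The sketch below is a minimal dictionary-based version written for this purpose; the class, the end-of-word marker, and the word list are assumptions, and it makes no claim about how text would actually be held in the proposed network.

```python
class Trie:
    """Minimal trie: one dict of children per node, shared prefixes stored once."""

    END = "$"  # end-of-word marker; assumes "$" never occurs in the stored words

    def __init__(self):
        self.root = {}
        self.node_count = 0  # number of character nodes created so far

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                node[ch] = {}
                self.node_count += 1
            node = node[ch]
        node[self.END] = True

    def contains(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return self.END in node


words = ["index", "indexes", "indexing", "inform", "information"]
trie = Trie()
for w in words:
    trie.insert(w)

total_characters = sum(len(w) for w in words)
```

For this small vocabulary the trie creates fewer character nodes than the words contain characters in total, which is the kind of saving in memory space the paragraph has in mind; whether it also facilitates processing depends on the operations required.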
COMPUTER-PROCESSIBLE VS COMPUTER-RETRIEVABLE
TEXT
J.C.R. Licklider
APPENDIX C
MEASURING USER NEEDS AND PREFERENCES
unimportant, either to the individual user or to the system as
a whole, simply because it is used infrequently. Records of
use-frequencies provide valuable information about the way
the system is working, but they must be interpreted with great
caution when planning innovations in the system.
same research interests as other users. Not
only are such "planted" users good feedback
channels about user satisfaction; they can also
do much to make the system more accessible
and attractive to other users.
That a "user's remarks file", of the kind currently
provided in the MAC system, can give the user of
a computerized system an easy and convenient channel,
especially for small details that might otherwise
not be reported; it probably is well worth
the cost.
George A. Miller
APPENDIX D
INTERACTION LANGUAGES
user in discovering how he can formulate requests to and
commands for the system. One way of doing this is by providing
a simple teaching-machine program to use with a
programmed instruction text, with very flexible paths through
this text. One can compare experimentally this type of
instruction with ordinary user manuals that are provided
externally to the system. In this programmed-instruction
situation, the help from the system should gradually diminish
as the user learns more and more, and eventually he should
be able to talk to the computer in a very concise fashion
indeed. Of course, he should always have the option of going
back into the longer mode and requesting help when he forgets
the conventions of abbreviation of the short commands.
Similarly, the output from the computer should become more
abbreviated as the user becomes more skilled; but he should
always have the option of saying, "What?" and getting the
unabbreviated form of that message.
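
The abbreviated output with a "What?" escape can be pictured with a small sketch. Everything in it is hypothetical: the `Console` class, the message keys, and the terse and full wordings are invented for illustration and do not describe any existing system.

```python
class Console:
    """Hypothetical console that abbreviates output as the user gains skill."""

    # key -> (terse form, full form); wording is invented for this sketch
    MESSAGES = {
        "found": ("3 FND", "Three documents were found that match your request."),
        "noterm": ("NO TRM", "The index does not contain that term."),
    }

    def __init__(self, skilled=False):
        self.skilled = skilled
        self.last_key = None

    def say(self, key):
        """Emit a message in terse or full form, remembering which it was."""
        terse, full = self.MESSAGES[key]
        self.last_key = key
        return terse if self.skilled else full

    def what(self):
        """The user's 'What?': repeat the last message unabbreviated."""
        if self.last_key is None:
            return "Nothing has been said yet."
        return self.MESSAGES[self.last_key][1]


expert = Console(skilled=True)
short_form = expert.say("found")
long_form = expert.what()
```

Keeping both forms of every message side by side is one simple way to guarantee that the unabbreviated version is always available on request.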
We have discussed here, briefly, several aspects of interaction
languages. First, as implied by the name, the language
should not be something that is accepted passively, but something
with which the user can interact strongly with the
machine. The machine should be very forgiving of mistakes,
and should request additional information it needs if the user
forgets to supply it. Secondly, there should be an ability to
abbreviate and to form a higher-level language built on the
command language originally given to the user. The way
suggested to do this was to embed the command language in
a macrolanguage for the system. Finally, programmed
instruction and feedback should be an integral part of the
system.
D.G. Bobrow
APPENDIX E
At quite a different level, I would like to argue that Project
Intrex should enlist the services of an imaginative economist
at a very early stage. There are at least two major reasons
for such action. First, the adoption of various technical paths
and systems designs will be critically dependent upon sound,
economic analysis. Secondly, the application of sound techniques
of economic measurement in the kinds of changes that
are ahead in the library information field needs to be better
understood than is currently the case.
H.H. Fussler
APPENDIX F
THE ROLE OF GRAPHICS IN INFORMATION TRANSFER
is needed? Or has he misused existing equipment? There may
be basis for each of these points of view, but it would serve
little purpose to fix the blame here. Clearly, it can be stated
that technology exists to solve in great part the problems of
the information center. But it is a mistake to leave the job
entirely to the manufacturer. Faced with a great variety of
manufacturing alternatives, it is not likely that he will optimize
equipment for purely library applications without guidance
and standardization from the library community.
or printed on to paper copy or film. The capacity of the system
is limited by the tape supply system or the disc files
available. If high-resolution images are required, considerable
storage space is used and special display tubes employed.
Although in its infancy and presently expensive, video storage
offers a highly flexible medium for rapid access to graphic
images. Its primary applications in the future may be in situations
where the ability to erase and add images is of prime
importance.
In contemplating various experiments, a detailed analysis of
the economic factors associated with various systems of storage
and transmission is needed. There is little doubt that
microfilm offers the most economic means of microstorage
with currently available equipment for input, access, presentation,
and generation of full-size paper prints. For the Intrex
experiments, evaluation of the possible uses of graphics in
information transfer probably will be made via microfilm systems
as adjuncts to computer consoles, transmission devices,
and paper-copy machines. A major contribution that can be
made by Intrex would be the specification of improved viewers
and printers for library uses of microforms.
Since the film strips are generated by filming of computer
display, the catalog is updated by computer merging of new
data within the old collection and periodic generation of new
microfilm catalogs. In the interim, a separate file of new
acquisition images would be maintained. (It may even be of
some value to have a separate file of new items prior to the
periodic merging.)
viewers out of economic reach of the student and most other
library users. Even so, we still recognize the attractiveness
of the concept of personal files of compact, inexpensive microfilm
copies.
Study of User Network Requirements
Evaluation of Low-Reduction Microimages
console during the off-site reading period. Xerographic
images with unfixed (not heated) toner
particles may provide this facility.
J.L. Simonds
APPENDIX G
DATA ARCHIVES AND LIBRARIES
seeking the precise data you wanted by some sort of small-
scale sample survey, or else you could ask the Census Bureau
to run you a special set of tables. Since either of these procedures
is costly, data producers are under constant pressure to
report more detailed and elaborate cross-tabulations in the
standard tables. The libraries get fuller and fuller of thick,
densely printed volumes, which, however, can never report
more than a small percentage of the possible cross-tabulations.
(Note that the present Indian census is being reported in 1500
volumes. Given the excellence of this census, its uniqueness
in developing countries, and the widespread interest at MIT in
India, the MIT library certainly ought to acquire this collection.
But if India is publishing 1500 volumes now, what does
the future hold for 115 countries in 20 years?)
AVAILABLE SOLUTIONS
will probably never be an economy, however, to abolish census
volumes or other volumes of basic statistics. It is probably
quicker to look up the population of the 50 states in the
World Almanac than to ask a computer to recompute them. I
do not intend in this brief memo to attempt to explore the algorithms
for determining when it is more economical to print
out frequently used results and store them on sheets of paper
or books; when it is cheaper to compute frequently used results
and store them in ready-to-print output files of a computer;
and when it is cheaper to keep the raw data untabulated
and permit the rapid calculation capability of the computer to
be used to produce the desired results. It is sufficient to note
that the direction of technological development is probably such
as to make it increasingly economical to leave raw data untabulated
until the consumer wants it and then to permit the computation
of results, rather than to store the results once computed.
that would permit the addition of those individuals' new responses
10 years later; or, at least, the Census won't do that
for named individuals. Given this limitation there are, however,
two partial solutions, namely, the recording of unidentifiable
samples of individuals and the recording of aggregate
data in units sufficiently small to be useful but sufficiently
large to protect the individuals. The one-in-a-thousand
census tape is an illustration of the former solution. It contains
no exact local information. It is, therefore, substantially
impossible to identify any individual in the United States
from his replies. There is only one chance in a thousand that
any given person has data about himself on the tape in the first
place, and one could never guess who that individual is if one
doesn't know where he lives — no matter how much information
the tape gives about his family size, occupation, etc. The
one-in-a-thousand census tape does contain an arbitrary serial number
for each respondent. If this were preserved by the Census
Bureau, a very valuable longitudinal analysis could be done at
the next census. However, often one is particularly interested
in localizing data. One wants to study Boston, not people in
the Northeast. The best solution under these circumstances
is to report the smallest unit that will disguise individuals,
such as, for example, the block or the census tract.
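
The first of the two partial solutions, an unidentifiable sample with arbitrary serial numbers, can be sketched as follows. The field names, the synthetic population, and the sampling procedure are assumptions made for illustration; they do not describe how the Census Bureau prepared its tape.

```python
import random


def one_in_n_sample(records, n=1000, seed=0):
    """Draw roughly one record in n, dropping identity and locality fields.

    Each sampled record keeps its substantive fields and receives a
    serial number assigned at random, so that the serial is unrelated
    to the person's name, address, or position in the file.
    """
    rng = random.Random(seed)
    sample = []
    for record in records:
        if rng.randrange(n) == 0:
            sample.append({k: v for k, v in record.items()
                           if k not in ("name", "address")})
    serials = list(range(1, len(sample) + 1))
    rng.shuffle(serials)
    for row, serial in zip(sample, serials):
        row["serial"] = serial
    return sample


# Synthetic population, for illustration only.
population = [
    {"name": f"person{i}", "address": f"tract{i % 50}",
     "family_size": 1 + i % 6,
     "occupation": "clerk" if i % 2 else "machinist"}
    for i in range(20000)
]
sample = one_in_n_sample(population)
```

If the agency holding the original file kept the link between serial number and respondent, the same people could be followed at the next census, which is the longitudinal possibility mentioned above. The second solution, aggregation by block or census tract, would instead group records by a locality field before release.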
so the problem of matching does not arise. All in all, it is
clear that, in the technology of the future, the user of data
will not be satisfied with printed tables but will want access,
so far as possible, to the basic data file.
The census provides the largest and probably the most important
archive of all. The census publishes not only population
statistics, but also censuses of agriculture, industry, municipal
government, etc.
that type of information that is of interest to many members
of the university community and that is too bulky or expensive
for each to retain or own. Each member of the faculty owns
some books, but no member of the faculty can afford all the
books he needs. The library provides the economy of shared-
book usage.
It is clear that many of the problems to which our social
science project is addressing itself are logically identical to
library problems to which Project Intrex is addressing itself,
and that many of the solutions (by way of computer hardware
and software, and human procedures) are identical. I would
therefore like to urge Project Intrex to consider seriously
the role of the data archive as an integral part of the library
system where, I am convinced, it belongs, and hope that we
can find ways to integrate and coordinate our intersecting
interests.
POSTSCRIPT
APPENDIX H
GUIDELINES FOR INTREX
CONTENT-ANALYSIS EXPERIMENTS
Language — English and others that are heavily-
represented in recorded knowledge (Russian,
French, German, at least).
Size — varying from titles through paragraphs
to multi-volume treatises.
Some of the text should be that which formed the subject matter
of other well-performed experiments. Use of texts prepared
elsewhere should help to reduce the input costs.
observe results, feed back additional manipulations, call for
evaluation programs, etc. All this will be recorded in a
master file which collects the use pattern and observed
results of the content-analysis system.
Ascher Opler
APPENDIX I
The purpose of this note is not to try to add ideas to the impressive
pool formed by the collection of contributions already
made during the Planning Conference, but, rather to
propose some experiments. Perhaps one or two of the things
proposed will not qualify even under a broad and liberal interpretation
of "experiments", but I shall try to press, in
formulating the following, in the direction of analyzable
experiments.
apparatus of bibliographic control. If he wants to get an actual
book, he has to use the regular library services to get it.
Comparing the third and fourth browseries, one pits the new
look against the old look in an arena in which the new look
can actually handle more of a total operation, rather than,
as is usually the case, less. In particular, it will be possible
to implement a variety of dynamic annotation schemes
when the entire activity is carried out through consoles and
a digital computer, whereas card-catalog and index-book
browsing would become chaotic in a hurry if everyone wrote
in the margins. Perhaps I am making a mistaken assumption
here, but I think it is safe to say, at the very least, that it
is not necessary, in this comparison, to handicap the computer-
and-console approach.
Comparing the first two browseries with the second pair, one
sets into clear relief the factor of depth of penetration. In the
first two, the browser can delve as deeply as he likes and drift
out of browsing into consecutive reading or deliberate search.
In the second pair of browseries, the browser has to leave the
premises in order to move down into the realm of content.
Comparing the first and third browseries against the second
and fourth, one sets the conventional manual-visual approach
against one based on newer technology. In the case of full
and complete browsing, the conventional approach is based
on collections of shelved books. In the case of browsing
constrained to the level of the apparatus of bibliographic
control, the browser has to take a rack of catalog cards over
to the table, thumb through them, go over to an index volume,
take it off the shelf, and so on. A picture of the full, unconstrained
browsery based on microimage representation of
full content and computer-processible bibliographic information
is given in the paper on "The Microbrowsery", which
I wrote after completing the first few pages of this memorandum.
The new-look browsery constrained to the level of the
apparatus of bibliographic control represents a very conservative
approach to the "telebrowsery", about which George
Miller has done some thinking. Note that, in the telebrowsery
part of this proposed experiment, the bodies of documents
are not assumed to be available in digitally coded,
computer-processible form.
if the documents are new accessions and therefore always
changing — to microimage, and a cataloging and/or indexing
that is both rapid and deep. Both those operations pose a
considerable requirement for personnel, equipment and money.
I should argue, however, that the conversion to microimage
and the cataloging/indexing ought to go on, anyway, and
browsery Number 2 might give them the right kind of pressure
and the right kind of point of application. I may have pressed
fairly hard, in connection with browsery Number 4, when I
assumed that users could browse through the apparatus of
bibliographic control and the annotations while sitting at
consoles remote from the laboratory. There, however, the
pressure is wholly in the realm of economics and not in other
aspects of console technology. The trouble is that browsery
Number 4 would not be much good if the browsers had to
walk some distance to a community console, only to find
it already occupied.
particularly important in this area, and I think that they will
be the more helpful as "I liked it", and "I give it a grade of 3
on your 7-point scale", are shunned in favor of data for the
preparation of flow charts and case histories.
manual browsery, far short of browseries one and two, but
I think it would make the experiment more meaningful. The
fact is, I think it would be frustrating and unproductive for
most people — for most purposes — to browse through catalogs
and indexes, and I am afraid the only value to be gained from
browseries three and four, as originally described, would
be proof that that expectation is correct. Therefore, make
browsery four as full and complete a browsery as possible
under the constraints set by the difficulty of converting the
contents of documents into digitally encoded, computer-
processible form. Then match browsery three to it to serve
as a control. Finally, if it should prove possible to go quite
a long way toward making the telebrowsery a real browsery,
then it would be worthwhile raising the question, whether or
not to expand the experiment to include six categories, the four
described initially plus an as-full-as-possible telebrowsery
and its manual-visual control.
The third strategy is based on the assumption that there are
so many scientists and scholars in the world that almost
every thought worth thinking, every device worth inventing,
every theory worth testing, has been thought or invented or
tested. That leads to a kind of desperation and to a search
of the most distant, most unlikely areas in the hope that
whatever is found there that has any relevance at all will at
least have some probability of novelty and some possibility
of revolutionizing the field to which the browser aspires to
contribute.
finds seemed important to the browsers, but no information
about which browsery was being used), would be taken quite
literally in the scoring of the experiment. The two "results" of
the experiment would be (1) a set of tables showing the values
of the three browseries to each of the three groups of browsers,
and (2) a set of descriptions or graphs or flow diagrams of the
courses followed by the browsers in the three browseries.
browseries were designed. In part, the outcome would be a
collection of descriptions of behavior and reaction in
browsing. I suspect that the latter kind of result would be of
considerable interest, even though it might not lend itself to neat
analysis or interpretation. In a field like browsing, in which
everyone has some opinions but no one has any data, it is
sometimes a good idea just to do some careful observing and
see what transpires. Incidentally, I do not know that no one
has any data about browsing.*
* It turned out that Bill Locke had already initiated some
library research, and that there are, indeed, some data on
browsing, but not many.
up an experimental browsery. If there are not enough books,
then several people can contribute paired associates, and the
browsery collection can be assembled from the union of the
suggestions.
At the beginning of the year and again at the end of the year,
the subjects would be interviewed and tested to determine
profiles of their interests, aspirations and personalities.
researcher with a particular kind of bent to try to make sense
out of the diary data. He would approach the task, I think, by
formulating and collecting hypotheses about browsing and
trying to structure the field. Then, with his a priori formulation
of browsing to go on, he would read the diaries. In the
process, he would modify his formulation, altering some of his
hypotheses and adding new ones suggested by the diaries.
Perhaps he might stop this process after having read a random
half of the diaries.
J.C.R. Licklider
APPENDIX J
THE MOTIVATIONS OF AUTHORS -
INTELLECTUAL PROPERTY AND THE COMPUTER
sent out as possible. He wants to watch acceptance of his
writing, and wants carefully to nurse a second edition; he
resents help if he is unable to perform this task himself.
psychologist who offers means for building author
satisfaction into the system.
I did not deal with the function of the editor, who has to act
as a filter for the input.
E. S. Proskauer
APPENDIX K
INTRODUCTION
control than the book. Technically the generation of full-size
paper copies is easier and subsequently more economical if
the input is microfilm than if the copy were made from a paper
original.
MICROPHOTOGRAPHY SYSTEMS
If library science has not succeeded very well in developing
good information transfer systems, this is due partly to poor
economic support for libraries, to a lack of planning of
information networks, and to other reasons, but a primary factor
is the inadequacy of the book itself. (This kind of statement
often requires the proponent of new techniques to enlist the
services of a public relations expert or to be able to duck
quickly, because questioning the adequacy of the book is
only one step below an attack on motherhood.)
One might conclude that the book simply isn't worth it. But a
more proper conclusion is that we should strive for systems
with the advantages of the book at the output end without its
disadvantages in the preceding phases of information transfer.
Microphotography systems, sometimes incorporating
computers and sometimes used in conjunction with computers,
come closer than any other concept to being able to do this
during the next decade.
So far as library applications of microphotography are
concerned, microfilm service has normally meant the acquisition
of roll microfilm, unindexed, and stored in a primitive
manner, without any provision for location of individual pages
or chapters of the document, let alone automatic page
selection. Such microfilm was used with reading devices that
suffered not only from a conspicuous lack of good mechanical
and human engineering, but that were frequently designed
for film in a different form. For example, a 30X reduction
microfilm is offered to the user with a 15X magnification
reader, resulting in a very small, inadequate screen image.
There has been no adequate standardization of the manner
in which images are arranged on film to facilitate retrieval
and bibliographic control. One of the worst aspects of the
library microfilm situation has been that it has not been
subject to quality standards, so that the user often has had
to tolerate severely substandard graphic images.
has been designed which facilitates quick and reliable access
to information of this type - including fully automatic subject
search, address-type of retrieval, and simple indexing methods
which assist in locating a given page.
The fastest-growing area during the last few years has been
the storage and retrieval of engineering drawings in
microform, in aperture-card systems. Mail-order houses are
using microfilm extensively and are achieving considerable
publication economy as well as improved access speed to the
information, and catalogs listing parts for general engineering
use are published in microfiche form. The advantages of
publishing historical manuscripts in microform are obvious.
Access to this country's historical materials has been quite
inadequate, and a major program to disseminate such
materials in the form of microfilm is now under way.
this project. On the one hand, it is an example of a good
balance between "type" of information (namely, reports) and
the microform chosen (namely, the microfiche), and a
standard has been written to assure uniform placement of the
images, uniform size of the entire fiche, uniform reduction
ratio, etc. On the debit side, little attention has been paid
to the storage and retrieval problems of the fiche, and neither
the reader nor the reader-printers for either occasional or
heavy use can be considered quite adequate. It is of interest,
however, to note that, even though a complete system
(including all possible user requirements) was not designed by NASA,
the size of the project has caused manufacturers to introduce
additional equipment quickly.
silver film, diazo film, Kalvar film, or all
three? Should the system experiment with
completely new forms of image storage, e.g.,
microxerography or thermoplastics?
What are the economic constraints of the system?
What is the compatibility of this information
store with the rest of the library? With other
mechanized systems in the library?
What type and amount of information should be
in machine-readable form on the microfilm?
Should it be a simple address in digital form,
or a series of machine-readable descriptors so
that the information can be found by description
rather than address?
What reduction ratio should be used?
Readers and reader-printers could certainly be improved
substantially. A host of interesting new types of film is
awaiting exploitation.
EXPERIMENTAL OBJECTIVES AND EXPERIMENTS
to the complete, up-to-date collection. The system should
allow browsing. Output should normally include several
alternatives, including a print output. Subject search might
be part of the system.
The experiments should study the attributes of
microphotographic images to be transmitted, possibly utilizing a buffer
principle based on xerographic Proxi plates.
Experiments in microphotography share the hazards of all
experimentation, namely, the difficulty of determining
meaningful methods for evaluation and the difficulty of
evaluating user satisfaction. (Psychologists have warned that
experiments are frequently prejudged by the manner in which
they are introduced or described.) There is another hazard
in experimenting with image retrieval, and this concerns
comparison between different techniques. While it seems at
first perfectly feasible to compare a magnetic information
store with a store of hard-copy documents and with a
microfilm store, further thought about the details of such an
experiment shows that, with the present immature state of the
systems, an elementary mistake can easily be made. It is
somewhat like an experiment which compares a DC-3
aircraft, a Rolls Royce automobile and a fishing vessel as
representative of aviation, ground transportation, and shipping,
respectively. Such interdisciplinary comparisons may be
useful as follow-up experiments, but basic system
experiments should come first.
Peter Scott
APPENDIX L
control will be unnecessary", introduces the focal problem;
and perhaps it says enough about the focal problem. However,
I shall discuss the matter further, trying to come to grips
with a question about the nature of "experiments" that still
bothers me.
Lincoln Laboratory after the initial pressure of making the
"quick fix" and building the SAGE System subsided.
Nevertheless, I think it is worthwhile noting that rigorous
experimentation has had a hard time adjusting to the pressures and
realities of large-scale engineering works, and that
large-scale engineering projects have often been successful (perhaps
equally often, unsuccessful) quite independently of the
experimental work that was conducted at the same time on the same
premises under the same general management.
numbers of alternatives to which combinatorics leads. The
main heuristic is to isolate critical subsystems, simpler and
therefore combinatorically less explosive. Sometimes one can
test the various possible forms of a subsystem in parallel.
Alternatively, one may employ hill-climbing or adaptive
techniques. On this level, neat experiments can be designed,
carried out, analyzed, and reported within the five-year period
of Project Intrex. Indeed, they can be carried out, in some
instances, in less than one year. I think that a considerable
number of such experiments should be performed by Intrex.
Some of them should be scheduled to influence the system
projects with which Intrex will be concerned. Others should
be protected from the pressures of the system enterprises and
should be scheduled to feed into the design, not of the
"experimental" systems of the first five years, but of the "operational"
systems of the second five years.
J. C. R. Licklider
APPENDIX M
HOW HUMANISTS USE A LIBRARY
Philosophers
I do not include
Teachers who try to grow in what they know about
what they teach without trying to contribute
seriously to the store of knowledge. They need, I
suggest, a fine opportunity for browsing and for
leisurely and thoughtful reading in depth, and a
selected exposure to the important current
literature of their field, but very selective, since
humanists perforce will not find any five years of
time of maximum importance, and particularly
not the present time. It is only rarely that the
current literature or the current writer has the
importance, ipso facto, that it may properly have
in rapidly changing science.
we will get it. And our principal frustration might
well be an inadequate stock of things, most of which
we know we want from bibliographic and catalog
searches. We may use rare books or manuscripts
now and then for purposes hard to explain, but
facsimile copies at original size would ordinarily
be quite enough. An exception in my own field is
that an art critic or historian has no business
talking at all about paintings, sculpture or
buildings he has not seen in the original. No skill in
photoreproduction, even in living color, is a
substitute regardless of scale and quality. But
these experiences are by definition not in the
library. After they have been had, sometimes
more than once, quite-crude, much-reduced
graphics may do. For example, the size, color
and texture of the basilisk on the west portal of
Amiens are important. After I have absorbed
them on the site, if I need any reminder back in
the study, a reasonable line drawing, an inch
high without color, texture or scale, will do; it
may even be better than a photograph. The
presentations computers can make will do for
much of this; the rest has got to be in form that
can be considered at leisure. The most important
part has to be either in books that can be kept and
frequently, even capriciously, returned to, or in
photocopies, black on white at a size large enough
to read with no apparatus more complicated than
"natural" spectacles and with enough margins to
permit heavy personal annotation on paper which
permits annotations. Sometimes even these
annotations need to be cut and pasted into an early
rough draft. Most of this could possibly be
computerized in a fancy enough set-up.
Hours. Certainly, somebody wrote such a
prayer and it is a beautiful one. It is in the
English of the early 16th century. Now a
student who needs to relate this directly to
More needs first to establish, if he can,
that More wrote it; this could require the
use of the original (now in the Yale library)
and nothing else. At the next level down, a
good facsimile will do, purveying the
environment as well as the text of the prayer. Further
down, a printing of the text only (without
illustration) will do. Still further down, a
translation of the English into modern English
in typewriter face may be enough. It depends.
3. he usually does not possess a battery of
research assistants or even secretaries to do work
and he would not trust them to do it if he had them
(there have been notable exceptions).
John E. Burchard
APPENDIX N
MOTIVATION
Policy Decisions
In the case of a university library, which has priority (in
service, in book choices, etc.): student or the faculty?
Does the library use by the faculty differ enough from
that by the student, so that there should be separate
libraries (or collections or rooms) for each?
Questions of this sort are being answered all the time, either
consciously or by implication, by librarians or by their
governing boards. Most of the time, the operating decisions,
which should be based on an explicit analysis of such questions
and their answers, are based on a reluctance to change past
practices or a desire to emulate some other library, though
it should be apparent that the answers may differ appreciably
from library to library and even from time to time in the same
library. Occasionally, attempts are made to arrive at answers
by "market surveys" of a sample of users. Experience is
showing the dangers of such opinion surveys, unless they are
very carefully worded and unless they are quantitatively
checked against the actual behavior of the same users. Too often
has the questionee persuaded both himself and the questioner
that he would use some proposed new service, only to find
that he seldom gets around to using it, once it is installed.
Pertinent Data
For some time to come, many of the questions listed above will
have to be answered on the basis of the librarian's experience
and intuition. Some of them may always have to be so
answered. But, surely, a greater quantitative knowledge about
library use can assist in getting answers, will make it easier
to determine when conditions have changed enough to warrant
changes in operation and wherein procedures in one library
should differ from those in another. Data on some or all of
the following questions would be of value in this respect:
What is the pattern of visits by various attendees (users)
of the library? Is there an hourly, weekly or seasonal
periodicity in attendance? How long do they stay, and
what is the distribution of lengths of stay? Is there a
correlation between length of stay and the attendee's
use pattern?
What is the pattern of book (periodical, etc.) use? With
a freely circulating book, what is the ratio between use
in the reading room and borrowing to take home? What
fraction of the collection is not used at all during a year?
How do these use factors change with the age of the book?
Is there a correlation between the use factors for
successive years? How do the use factors differ for different
classes of books (field of specialty, foreign language, text,
periodical, report, and so on)? Is there any correlation
between the use factor of a book and the way the library
was persuaded to buy the book (request from faculty,
decision of librarian, decision based on list of new
publications or on book advertisements, etc.)?
Data on all these items can be obtained. Most of them are not
gathered by most libraries. Expense and lack of librarian's
time are the usual excuses given for the neglect. Certainly,
gathering any of the data mentioned costs time and therefore
money; to answer all the listed questions in detail each year
would overburden any library's budget. It is the thesis of
this appendix that librarians should realize, as managers of
industrial, mercantile and military operations are learning,
that, as use patterns change and publication increases, lack
of such data may lead to wastage and loss of utility, and that
expenditure of time and money in gathering some of the data
mentioned could save more than this in improved utility. In the
near future, the introduction of data-processing equipment in
library operations will make it easier to amass the data;
librarians should experiment with such data gathering before
mechanizing, comparing the various methods of data gathering and
the value of the various kinds of data in assisting policy
decisions, so the data-processing equipment can be designed to
produce the most effective data most efficiently.
Prediction of next year's operation thus consists
of extrapolation of the few items of gathered data; the models
then provide the details of the prediction. (The use of
experimentally checked theories to predict the behavior of some
system is, of course, the usual method of physical science
and of engineering.)
The attendee may perform any one of these tasks zero or one
or more times during his stay; the average number of times
each different task is performed per visit provides a use
pattern which can assist the librarian in determining the
relative importance of the services provided, can help, for
example, determine whether different kinds of libraries are
needed for different users, or whether one general library
could satisfy nearly everyone.
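A use pattern of the kind just described can be tabulated from a simple log of visits. The following is only a minimal sketch under stated assumptions: the function name `use_pattern`, the log format (one list of task labels per visit), and the task labels themselves are hypothetical and are not drawn from the MIT data.

```python
from collections import Counter


def use_pattern(visits):
    """Return the mean number of times each task is performed per visit.

    `visits` is a list of visits; each visit is a list of task labels,
    with a label repeated once for every time the task was performed.
    """
    if not visits:
        return {}
    totals = Counter()
    for tasks in visits:
        totals.update(tasks)
    # Divide by the number of visits, including visits with no task at all.
    return {task: count / len(visits) for task, count in totals.items()}


# Hypothetical observations for one class of attendee.
sample_visits = [
    ["consult catalog", "read periodical"],
    ["read periodical", "read periodical"],
    [],  # a visit during which no listed task was performed
    ["borrow book"],
]
pattern = use_pattern(sample_visits)
```

Summing the entries of `pattern` gives the mean number of tasks of all kinds per visit for the class, which is the figure a librarian would compare across classes of users.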
Probability Distribution of Use
What was found, from the MIT data, was that the number of
tasks per visit, performed by members of certain obvious
classes of users (such as chemists or physicists, or
freshman-sophomore undergraduates, etc.) was geometrically
distributed. In other words, out of $N_0$ visits of a given class,
the number of visits during which k or more tasks were
performed is fairly accurately given by the simple formula
$N_0 a^k$, where a is a constant factor, less than unity,
characteristic of the given class; a is, in fact, equal to the ratio
K/(K + 1), where K is the mean number of tasks performed per
visit by members of the class. Not only is the totality of
tasks so distributed; the individual tasks, of the sort instanced
above, turned out to be independently distributed and subject
to the same geometric formula. Thus, the average K's for
individual tasks are additive, to obtain various factors a for
various groups of tasks of interest in policy determination.
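The geometric rule can be tried numerically. The sketch below assumes the reading given in the text: the factor is a = K/(K + 1), and the number of visits with k or more tasks, out of a given total, is that total multiplied by a to the power k. The function names and the sample figures are illustrative assumptions, not values from the MIT data.

```python
from typing import Sequence


def geometric_factor(mean_tasks: float) -> float:
    """Return a = K / (K + 1) for a class whose mean tasks per visit is K."""
    if mean_tasks < 0:
        raise ValueError("mean_tasks must be non-negative")
    return mean_tasks / (mean_tasks + 1.0)


def visits_with_at_least(k: int, total_visits: float, mean_tasks: float) -> float:
    """Expected number of visits with k or more tasks: total_visits * a**k."""
    return total_visits * geometric_factor(mean_tasks) ** k


def combined_factor(task_means: Sequence[float]) -> float:
    """Add the per-task means K, then convert the sum to a single factor a."""
    return geometric_factor(sum(task_means))


# Hypothetical class: 1000 visits, one task per visit on the average.
three_or_more = visits_with_at_least(3, 1000, 1.0)
# Two hypothetical tasks, each performed 0.5 times per visit on the average.
grouped = combined_factor([0.5, 0.5])
```

With a mean of one task per visit the factor is one half, so half the visits involve at least one task, a quarter at least two, and so on. Adding the means before converting, as in `combined_factor`, follows the statement that the K's for individual tasks are additive.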
occurrence of k or more successive heads in a sequence of
coin tosses; the random nature of the time limitations and the
interests of the attendee, plus the random availability of the
various books in the library, makes it plausible that such a
random distribution should apply.
Duration of Visit
Circulation Interference
Prediction of Circulation
having circulation m during the previous year would have
the same circulation, but the distribution-in-circulation would
be determined; a certain fraction would not circulate, another
fraction would circulate once, and so on, the fractions being
determined by the previous year's circulation. Thus, there
would be a chance for a book to have a sudden increase in
popularity and, for such a book, its subsequent circulation
would be typical of the increased popularity; it would have
forgotten its earlier neglect, so to speak. In addition, of
course, a few books will suddenly lose popularity; their
subsequent history will, with a few exceptions, reflect this loss
of popularity. Thus the Markov-process model accounts for
the exceptional, sudden changes in circulation, as well as
the usual, gradual decrease in use.
$$P_n = \frac{(a + bm)^n}{n!}\, e^{-a-bm}$$
This formula fits the data, particularly for the more-popular
books, remarkably well. Its use enables one to predict the
future circulation rates of various classes of books. For
example, if the mean circulation of a given group of one or
more books during a given year is M per year, then the mean
circulation rate of the same group t years later would be
$$N = a\,\frac{1 - b^{t}}{1 - b} + M b^{t}$$
In addition, the model displays the limitations on prediction,
caused by "quantum jumps" in popularity which occasionally
occur. It indicates that, even though a book did not circulate
at all last year, it may circulate next year, and it provides
a value of this probability. It thus can predict how many books,
which have been "purged", will have later to be "reinstated".
Thus the costs of a proposed purging program can be
evaluated for comparison with its advantages.
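The predictions mentioned above can be sketched numerically. The sketch assumes that the circulation formula is a Poisson distribution with mean a + bm, where m is the previous year's circulation, and that the mean after t years is found by applying "new mean = a + b times old mean" t times. The function names and the values a = 0.4 and b = 0.5 are hypothetical, chosen only for illustration.

```python
import math


def circulation_probability(n: int, m: float, a: float, b: float) -> float:
    """Probability that a book circulates n times this year, given m last year.

    Assumes a Poisson distribution with mean a + b*m.
    """
    mean = a + b * m
    return mean ** n * math.exp(-mean) / math.factorial(n)


def mean_circulation_after(t: int, M: float, a: float, b: float) -> float:
    """Closed form a*(1 - b**t)/(1 - b) + M*b**t; requires b != 1."""
    return a * (1 - b ** t) / (1 - b) + M * b ** t


def mean_circulation_iterated(t: int, M: float, a: float, b: float) -> float:
    """The same quantity, found by applying mean -> a + b*mean t times."""
    mean = M
    for _ in range(t):
        mean = a + b * mean
    return mean


def expected_reinstatements(purged_books: float, a: float) -> float:
    """Books with no circulation last year expected to circulate next year."""
    return purged_books * (1 - circulation_probability(0, 0, a, b=0.0))


# Hypothetical parameters, for illustration only.
A, B = 0.4, 0.5
five_years_on = mean_circulation_after(5, M=3.0, a=A, b=B)
to_reinstate = expected_reinstatements(1000, A)
```

As t grows, the predicted mean settles toward a/(1 - b) whatever the starting value M, which corresponds to the usual gradual decline in use; `expected_reinstatements` gives the kind of figure that would enter the cost of a purging program.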
Further Developments
APPENDIX O
A TECHNIQUE OF MEASUREMENT
THAT MAY BE USEFUL IN PROJECT INTREX EXPERIMENTS
lets the user retain the microfiche, that he will pay . . . and
so on. The experimenter, or the computer system through
which the experiment is being conducted, might then say that
no one of the bids is high enough to obtain the desired service,
but that the bid for the microfiche would be successful if it
were doubled. The subject might agree to that price and that
service, or he might raise one of his other bids. I think it
would be possible to work out rules of the game that would
lead the subject to reveal his pattern of evaluations during
the course of his making each selection. Moreover, I think
it would be possible to achieve that result without having to
reveal, in advance, the experimenter's schedule of prices
for the alternative facilities or services.
J.C.R. Licklider
APPENDIX P
INDEXING
The experimental design problem is the usual one: a control
group would be helpful. For example, the users in an
experiment might be students in two comparable sections of the
same course. The attempt would be to compare relative task
success, using two alternative index-search systems. I hasten
to note that simultaneous experimentation would be conducted
with the subjects, with other factors varied, to avoid the
impossible task of evaluating all alternatives factorially.
DISSEMINATION
information. The system would bring the information to the
user when he should have it, and would do this by
maintaining a currently reliable characterization of the user in its
own processor. Most of our information dissemination
system is constructed in this spirit, and it is for this reason
that indexing-searching systems are relatively unimportant.
- experiments at the IBM Research Center with a dissemination
scheme (DICO) based upon similarity of interests among users,
rather than upon profiles expressed by use of key words (where
the initial feedback by the users was a relevancy index value
for each of a sample of items). Flood extended the DICO idea
to use feedback to modify the distributions made to users in
a stochastic, adaptive, sequential information dissemination
system (SASIDS) that is now under test, and for which some
comparisons have been made with the Luhn and Kochen-Wong
systems.
members of Congress and their staffs and thus serves that
group, while the library as a whole serves the national library
community primarily and develops procedures and services
pertinent to this need. Intrex should certainly include many
features that relate to the characterization of individual users,
and subsets of users, to be used in producing automatic
dissemination of quality comparable to that given by a special
librarian who is intimately familiar with his small group of
users. Experimentation by Intrex on dissemination systems
should go very far beyond devices like the use of key-term
profiles, as in the Luhn system.
Merrill M. Flood
APPENDIX Q
INTRODUCTION
The traditional back-up role for the
conventional lecture course, i.e., making
available extra copies of texts and
supplementary reading material;
The provision of instruction in the art of
information retrieval from the university
library store or from external sources;
The fostering of self-search and acquisition
of new knowledge by the individual student
through the library medium.
Hence, it is imperative that Project Intrex should
experiment with built-in instruction aids designed to teach the
effective utilization of the nation’s information store as
viewed from the entire spectrum of users — in industry, in
Government, and in the university.
Education, Self-Acquired
A Class-Room-Supplement Experiment
Note that this first part of the experiment does not deal with
Computer-Aided Instruction, as usually conceived, but is an
experiment on an information retrieval system which may be
used by the student or not, as he may elect, in preparing for
a course conducted along conventional lines. The student
would be presented with an assignment requiring him to obtain
information from a variety of sources; and he would be
provided with access to a console on a system containing all the
needed information.
other supplementary subjects were available only in the usual
library reference room. Then one could undertake paired
comparisons of the same student’s performance on different
subjects available in different forms. And variations on this
theme could be developed with continuing experience in
conduct of the course.
correction of prerequisites will be library-stored and -serviced.
It may be operative only in the early part of the term, or it
may be resorted to as new phases of classroom discussion arise, to
identify for the individual student the nature and scope of his
deficiencies.
A Laboratory-Supplement Experiment
the MIT catalog (of subjects as distinguished from library-
catalog). And these capsules could be stored and serviced
in the Intrex-designed library. Such an experiment would
require active administration by the Registrar’s office, and
extensive cooperation from the entire MIT faculty. But a
more appealing experiment can be directed toward one
particular subject which is offered in every department of the
schools of science and engineering at MIT. It is the subject
of instrumentation and measurement.
The role of Intrex will include the solicitation and assembly
of existing material, and even the sponsorship of instrument
teaching programs to be included in this experimental
collection. The experimental library will advertise the presence
and content of the programmed instructional material, and
store it in readily accessible form. Records of its usage
will be maintained and correlated in terms of gross average
student-user and instructor reaction, or in terms of paired
comparisons on different students with and without access to
the programs for particular instruments, and on the same
students with and without access to programs for different
instruments.
EXPERIMENTAL INSTRUCTION
ON THE ART OF INFORMATION RETRIEVAL
of information should be recommended: dictionaries,
encyclopedias, textbooks, handbooks, the library's card catalog.
Where deeper penetration of a field is needed, the indexing
and abstracting services of the particular discipline and its
related fields should be progressively brought into play. For
full, specialist needs, the total information-containing
network should be recommended, including communication with
sources making limited distribution of their information (such
as personal communication with authors). An advanced student
should know, even with an advanced system, how to get
information from "in" groups. A great deal of information
today is available only by tapping into a circular type of
communication channel among members of a particular,
disciplinary "in" group.
available the full services of Intrex in searching and
reproducing hard copy as he moved into the subject. The system
would record his search patterns and keep the tutor informed
of his movements. The tutor would attempt to advise more
on search strategy and study patterns than on actual subject
questions. Whenever possible, student questions relating to
subject content would be handled by having tutor and student
move to the Intrex system and search together for the answers.
accorded the undergraduates in the above two experiments.
They would not be permitted to read for a degree, hence
tuition levels should be reduced.
Stanley Backer
APPENDIX R
seek analogies. For example, a part of the theory of a circulating
collection has much in common with the theory of
reliability and maintainability. What little theory there is
in the field of machine translation can probably be transferred
to problems in the areas of man-machine conversation
and content analysis. Perhaps the broadest theory to apply
to the over-all project is the theory of information and
communication.
We shall assume that the foreseeable technologies of the
complex will preclude repeated processing of the contents
of most of the messages in depth. A control function based
on less than complete text will be necessary to make the
complex operate effectively. We shall assume that the
sources are unaware of the identities of many of the users
to whom their messages should be sent and that the users
are often unaware of the submission by sources of messages
that are relevant to their work. These features, and the
fact that some of the messages will have enduring value to
future generations of users, suggest that we model the complex
as a store-and-forward, broadcast communication
system. Each input message may eventually be desired by
an unknown number of unspecified users.
Control data are useful in the system for two reasons. The
total quantity of control data should be considerably smaller
than the total quantity of message data so that processing
can be faster. The control data, if properly chosen, are
more susceptible to logical arrangement and processing than
message text for a variety of purposes. Control data relating
to the messages in the system may be based on any attribute
of the messages to which values may be assigned in such a
way as to permit orderly filing and searching. Some of the
attributes of each message may be evaluated from information
associated with or contained in the message. Some may
be obtained from auxiliary sources, such as abstracting
journals or other messages which make reference to the
message.
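The role of control data in a store-and-forward complex can be sketched in a few lines. This is an illustration only, assuming that control data take the form of simple attribute-value pairs; the class names `Message` and `MessageStore`, their methods, and the attribute names used with them are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    """A stored message plus its control data (attribute -> value)."""
    message_id: int
    text: str
    control: dict = field(default_factory=dict)


class MessageStore:
    """Store-and-forward file: messages are kept and found via control data."""

    def __init__(self):
        self._messages = {}
        # (attribute, value) -> set of ids of messages carrying that value
        self._index = defaultdict(set)

    def submit(self, message):
        """A source deposits a message without naming its eventual users."""
        self._messages[message.message_id] = message
        for attribute, value in message.control.items():
            self._index[(attribute, value)].add(message.message_id)

    def annotate(self, message_id, attribute, value):
        """Add control data obtained later from an auxiliary source."""
        message = self._messages[message_id]
        if attribute in message.control:
            # Drop the stale index entry before recording the new value.
            self._index[(attribute, message.control[attribute])].discard(message_id)
        message.control[attribute] = value
        self._index[(attribute, value)].add(message_id)

    def search(self, **criteria):
        """Return ids of messages matching every given attribute value."""
        matches = None
        for attribute, value in criteria.items():
            ids = self._index.get((attribute, value), set())
            matches = set(ids) if matches is None else matches & ids
        return sorted(matches or [])
```

Searching touches only the index of control data, never the message text, which is the property the argument above relies on: the control file is smaller than the message file and lends itself to orderly filing and searching.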
The purpose of the theoretical model we are building is to
permit a prediction of a measure or measures of success of
the system in providing communication from sources to
users, and a measure or measures of the effort required to
build, maintain and operate the complex. We should like to
be able to compute the chosen measures of effort and performance
as functions of parameters that describe the system.
We should prefer parameters that correspond as closely as
possible to variables that might be measured or controlled in
an experimental system.
TIME DELAY
selective dissemination. On the other hand, the use of selective
dissemination may allow a much larger fraction of the
users to whom the message is relevant to receive it at a
slightly greater time delay than would the normal user-
initiated search process.
RELEVANCE
trieval. The general conclusion from reading it is that a
useful concept of relevance will be much more complicated
than most of the workers in the field believe it to be. Additional
theoretical work, tied to well-planned experiments, is
certainly justified by the importance of the problem, though
perhaps not by the probability of success.
EASE OF OPERATION
Fig. 3
USER SERVICE DELAYS
MODEL PARAMETERS
SOURCE PARAMETERS
USER PARAMETERS
of presentation at a given scale factor -- say, full size.
MESSAGE PARAMETERS
CONTROL-DATA PARAMETERS
MESSAGE-PROCESS PARAMETERS
computer data, at one extreme, to rare manuscripts kept
under glass, at the other. A theory that attempts to describe
the real situation, then, will describe the message handling
in terms of all the input, filing, retrieval, conversion and
output processes selected for the real system, with the additional
complication that conversions among several of the
file formats may be required to serve differing users adequately.
CONTROL-PROCESS PARAMETERS
enduring value to other users, locally and in related systems.
This load may be small relative to the load from external
sources, but it may be large enough that it cannot be ignored.
We must also remember to provide capacity for the normal
business processes related to acquisition.
INTERNAL PROCESS
USER SERVICE
system can gather the desired data by asking users to fill in
a questionnaire, by interviews, or perhaps more effectively
by monitoring the user interactions with the system in
terms of one or more of the types of control data suggested
above. If over-all system value to the user-source community
depends on total delay times, as we have suggested above,
selective dissemination can prove to be an extremely important
part of user service. Appropriate parameters describing the
file sizes needed and the processing capacities required to
handle specified kinds of user interactions and system
processes can be selected, once the general policy on
selective dissemination is expressed. With these parameters
and with given system loads and capacities, it should be
possible to compute the system time delays for selective
dissemination. The effectiveness in terms of relevance of
the messages to users is also extremely important. It
should be possible to find parameters to describe the performance
of the system in this regard in relation to the types
and volumes of control data processed in selecting messages
for dissemination, provided there is an adequate mechanism
for collecting information from users on their appreciation
of the relevance of the materials sent out. In the proposed
model system, it may also be possible to infer user opinions
by collecting data on their uses of other features of the
system (e.g., their searches through the catalog).
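A minimal sketch may make the two quantities discussed above concrete: selecting recipients from control data, and summarizing users' judgments of relevance. It assumes, purely for illustration, that a message and a user's interest profile are each described by a set of index terms; the function names, the `threshold` parameter, and the sample terms are hypothetical.

```python
def disseminate(message_terms, profiles, threshold=1):
    """Return the users whose profile shares at least `threshold` terms
    with the control data (index terms) of a message."""
    recipients = []
    for user, interests in profiles.items():
        overlap = set(message_terms) & set(interests)
        if len(overlap) >= threshold:
            recipients.append(user)
    return sorted(recipients)


def relevance_ratio(feedback):
    """Fraction of disseminated messages that users judged relevant.

    `feedback` is a list of True/False judgments collected from users.
    Returns None when no judgments have been collected.
    """
    if not feedback:
        return None
    return sum(1 for judged in feedback if judged) / len(feedback)
```

Raising `threshold` sends fewer messages to each user; how that trades against the relevance ratio is exactly the kind of relationship the model would have to establish by experiment.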
Times for delivery of specified messages to users may be
computed quite easily when the other features of the
system (mentioned under control and message handling) have
been selected, and when the geography and delivery means
have been determined. These times include the times for
receiving a clear request from the user, queueing times in
various parts of the process, which may depend on capacity
and the statistical fluctuations in the system loads, times for
format conversion if needed, and times for physical delivery
to the place specified by the user. Any real system should
be describable in terms of these times; then it should be
fairly easy to build a theoretical model to represent the real
system and to permit computation of expected performance
at differing work loads and with differing design choices.
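As an illustrative sketch of such a computation, the delivery time can be written as the sum of the component times listed above. The queueing formula used here is an assumption not made in the text: each processing stage is treated as a single-server queue with random (Poisson) arrivals and exponentially distributed service times, for which the mean wait in queue is (arrival rate / service rate) / (service rate - arrival rate). The function and parameter names are hypothetical.

```python
def mean_queue_wait(arrival_rate, service_rate):
    """Mean time spent waiting in queue at one stage (assumed M/M/1 model).

    Rates are in requests per unit time; the result is in the same time unit.
    """
    if arrival_rate >= service_rate:
        raise ValueError("load meets or exceeds capacity; the queue grows without bound")
    utilization = arrival_rate / service_rate
    return utilization / (service_rate - arrival_rate)


def delivery_time(request_time, stages, conversion_time, transport_time):
    """Expected delivery time: request + queueing + conversion + transport.

    `stages` is a list of (arrival_rate, service_rate) pairs, one for each
    part of the process in which requests wait in a queue.
    """
    queueing = sum(mean_queue_wait(lam, mu) for lam, mu in stages)
    return request_time + queueing + conversion_time + transport_time
```

Evaluating `delivery_time` at several arrival rates shows how the expected performance changes with work load, and changing the service rates represents differing design choices.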
R. C. Raymond
APPENDIX S
GRAPHIC COMMUNICATIONS FOR THE LIBRARY
TABLE I
MEDIA
MICROFORM
Microform scanning appears to be technically feasible and
can probably be provided by manufacturers in the future if
it can be justified from a business point of view. The future
development of acceptable microform graphic communication
equipment for libraries will therefore depend on manufacturers'
understanding of the needs of libraries. The number
of types of microforms complicates the problem of establishing
specifications for these devices and makes difficult the
economic justification of development programs. Each
type of microform requires a different mechanism
for transporting the microimage. Also, optical units of different
resolving capabilities may be required. Each mechanism
may require a large engineering program and a large
tooling or manufacturing investment if it is to be obtained
economically. The selection of a standard medium would
make the business opportunity much more attractive. Also,
the library and the user would benefit from the availability
of lower-cost equipment and service.
difficult and expensive. Also, unattended operation with high
reliability appears less probable than with microform input.
The need for book scanning in the 1970's should be clarified
by the libraries when they are considering the possible increased
use of microform.
NETWORK CONSIDERATIONS
1. Dial-up (telephone-type) switch. This is similar
to the switching in Xerox's own LDX network.
Dial-up could be subscriber- or operator-controlled.
2. Frequency switching. Switching is accomplished by
tuning receivers (printers) to a given frequency.
The transmitter (scanner) then selects the frequency
spectrum to be transmitted, based upon the desired
destination. (Our home TV receiver, in effect,
performs frequency-switching by selecting the
desired transmitted frequency.) For facsimile,
however, switching control (selection of the
frequency) is more desirable at the transmitting
station. A very-wide-band communication channel,
such as microwave or coaxial cable, is
required.
3. Computer-controlled switching. The switching
task for facsimile could be controlled by a time-shared
computing system such as that used in MIT's
Project MAC. Because of the high data rates of
facsimile, it appears desirable today to accomplish
the actual switching with a mechanical
cross-bar as is used in many telephone dial-up
switches, but to control the switch by the computer.
The computer then "dials-up" the switch
via an AT&T 801 DATA PHONE which converts
computer words to dial pulses.
Both switching methods one and two above are being used
today in LDX systems.
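The division of labor in computer-controlled switching, where the program decides which connections to make and the cross-bar carries the facsimile signal, can be sketched as follows. This is a toy model under stated assumptions: each scanner and each printer carries at most one connection at a time, and the class name `Crossbar`, its methods, and the station names are hypothetical. It does not model dial pulses or any actual LDX or DATA-PHONE interface.

```python
class Crossbar:
    """Toy cross-bar: each scanner and each printer carries one connection."""

    def __init__(self, scanners, printers):
        self.scanners = set(scanners)
        self.printers = set(printers)
        self._links = {}  # scanner -> printer currently connected

    def connect(self, scanner, printer):
        """Close a crosspoint if both stations are known and idle."""
        if scanner not in self.scanners or printer not in self.printers:
            raise KeyError("unknown station")
        if scanner in self._links or printer in self._links.values():
            return False  # busy; the controlling program may retry later
        self._links[scanner] = printer
        return True

    def release(self, scanner):
        """Open the crosspoint when the transmission ends."""
        self._links.pop(scanner, None)

    def connected_printer(self, scanner):
        """Return the printer a scanner is connected to, or None."""
        return self._links.get(scanner)
```

The controlling program handles only small control messages (connect, release), while the high-rate facsimile data pass through the switch itself, which is the reason given above for keeping the actual switching outside the computer.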
economics will be especially attractive when parts of the
system, such as modulation equipment and computer programs,
are provided as research projects by graduate
students. The best system for a given university can be determined,
however, only after analysis of the university's
needs and situation, and after formulation of a communications
system plan for the university.
Paul L. Brobst
APPENDIX T
An alternate route for developing faculty recommendations
on library acquisitions would be the provision, as a fringe
benefit, of a book-purchasing credit of $150 to $250 to each
professor. Then, as he ordered books for his personal technical
library, copies of the titles would be automatically
funneled to the acquisitions librarian. The system would
also have to record titles purchased from personal funds
(beyond the basic credit) and titles received for review, etc.,
on a complimentary basis. The latter category would, of
course, have to be double-checked by the librarian.
Stanley Backer