
Libri, 2005, vol. 55, pp. 96–121
Copyright Saur 2005. Printed in Germany. All rights reserved.
Libri, ISSN 0024-2667

Usability Assessment of Academic Digital Libraries: Effectiveness, Efficiency, Satisfaction, and Learnability

Judy Jeng
School of Communication, Information, and Library Studies, Rutgers, The State University of New Jersey, New Brunswick, NJ, U.S.A.

This study develops and evaluates methods and instruments for assessing the usability of digital libraries. It discusses the dimensions of usability, the methods that have been applied in evaluating the usability of digital libraries, their applicability, and criteria. The study finds that there exists an interlocking relationship among effectiveness, efficiency, and satisfaction. It provides operational criteria for effectiveness, efficiency, satisfaction, and learnability. It identifies users' criteria for "ease of use," "organization of information," "terminology and labeling," "visual attractiveness," and "mistake recovery." Common causes of "user lostness" were found, and "click cost" was examined.

Introduction

The development of digital libraries has reached a mature stage. However, evaluation has not kept pace. This study develops and evaluates methods and instruments for assessing the usability of digital libraries. Several authors have pointed out that little work is being done to understand the usability of digital libraries in any context (Blandford, Buchanan, and Jones 2004, 69; Blandford, Stelmaszewska, and Bryan-Kinns 2001, 181; Borgman et al. 2000, 229; Theng, Mohd-Nasir, and Thimbleby 2000a, 238; Thomas 1998). Blandford and Buchanan (2002b) call for further work on methods for analyzing usability, including an understanding of how to balance rigor, appropriateness of techniques, and practical limitations. Popp (2001) also found that the literature on usability testing of library Web sites was small.

Dimensions of usability

Usability is a multidimensional construct that can be examined from various perspectives. It is also an elusive concept, determined by the tasks, the users, the product, and the environment. The literature shows that the term usability has been used broadly and means different things to different people.

One perspective is to look at usability from the point of view of interface effectiveness. This view makes sense because usability has its theoretical basis in human-computer interaction. Back in 1994, Rubin (1994) warned that system designers were slow to respond to the guidelines established by research in human-computer interaction. Today, there are many usability studies focusing on interface design. Kim (2002, 26) has found that "the difference between interface effectiveness and usability is not clear." The interface is one of the most important aspects of usability, as it is the medium through which users communicate and interact with the system.

Judy Jeng, School of Communication, Information, and Library Studies, Rutgers, The State University of New Jersey,
4 Huntington Street, New Brunswick, NJ 08901-1071, U.S.A. E-mail: judyjeng@scils.rutgers.edu
This paper has received honourable mention in the 2005 Libri Best Student Award Competition.


Usability can also be related to usefulness, usableness, and ease of use. Gluck (1997) differentiates usefulness and usableness. Usableness refers to questions such as "Can I turn it on?" and "Can I invoke that function?" Usefulness refers to questions such as "Did it really help me?" and "Was it worth the effort?" Blandford and Buchanan (2003) also discuss the difference between usefulness and usableness. Usefulness is generally taken to mean "supporting the required functionality." Usableness, at its simplest, means "can be used." Pearrow (2000, 4) argues strongly that "a Web site that is not usable is useless."

Landauer (1995) defines usability as "ease of operation" and usefulness as "serving an intended purpose," although he also comments that the two are hard to separate in the context of evaluation. Davis and his colleagues (Davis 1989; Davis, Bagozzi, and Warshaw 1989) also make a distinction between usefulness and ease of use. In their Technology Acceptance Model (TAM), they defined "perceived usefulness" as the extent to which an information system will enhance a user's performance, and "perceived ease of use" as the extent to which a person believes a system will be free of effort. Furtado et al. (2003) also consider usability from the ease-of-use point of view and add that usability should also include ease of learning. Grudin (1992), however, considers that usefulness is the issue of whether the system can be used to achieve some desired goal; it can be broken down into utility and usability, where utility is the question of whether the functionality of the system in principle can do what is needed, and usability is the question of how well users can use that functionality.

Shackel (1991, 24) reports that the definition of usability was probably first attempted by Miller (1971) in terms of measures for "ease of use"; these were first fully discussed, and a detailed formal definition attempted, by Shackel (1981):

the capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and user support, to fulfill the specified range of tasks, within the specified range of environmental scenarios.

Perhaps the most widely cited definitions are those of ISO and Nielsen. The International Standards Organization (1994, 10) defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use." Nielsen (1993) points out that usability has five attributes: learnability, efficiency, memorability, error recovery, and satisfaction. Brinck, Gergle, and Wood (2002) share a similar perspective: usability means being functionally correct, efficient to use, easy to learn, easy to remember, error tolerant, and subjectively pleasing. In addition, Booth (1989) outlines four factors of usability: usefulness, effectiveness (ease of use), learnability, and attitude (likeability). Shackel (1986) suggests that usability has the criteria of effectiveness, learnability, flexibility, and user attitude. Hix and Hartson (1993) classify usability into initial performance, long-term performance, learnability, retainability, advanced feature usage, first impression, and long-term user satisfaction. Hix and Hartson are unique in that they go one step further and differentiate performance and satisfaction into initial and long-term measures. Blandford and Buchanan (2003) define usableness as 1) how efficiently and effectively users can achieve their goals with a system, 2) how easily users can learn to use the system (learnability), 3) how well the system helps the user avoid making errors, or recover from errors, 4) how much users enjoy working with the system or whether they find it frustrating, and 5) how well the system fits within the context in which it is used. Blandford and Buchanan's definition of usableness is similar to the definitions of usability cited in this paragraph.

It is also worth noting that satisfaction is found to be the most cited attribute of usability, while usefulness is the one most often overlooked (Thomas 1998).

In addition to the views mentioned above, Gould (1988) defines usability in terms of more components, including system performance (reliability, responsiveness), system functions, user interface, reading materials, language translation, outreach program, ability for customers to modify and extend, installation, field maintenance and serviceability, advertising, and support-group users. Rushinek and Rushinek (1986) found that system responsiveness (response time) is the most important variable affecting users' happiness.


Figure 1. The Four Principal Components in a Human-Machine System

Karoulis and Pombortsis (2003) suspect that usability (effectiveness, efficiency, and satisfaction) and the learnability of educational environments are positively correlated and wonder how far one affects the other, although they did not carry out a study to actually examine this possible correlation.

Usability can also be grouped into two large categories: inherent usability (Kurosu and Kashimura 1995) and apparent usability (Kurosu and Kashimura 1995; Tractinsky 1997). Inherent usability is mainly related to the functional or dynamic part of interface usability. It includes those attributes that focus on how to make the product easy to understand, easy to learn, efficient to use, less error-prone, and pleasurable. On the other hand, apparent usability is more related to the visual impression of the interface. At times, inherent usability and apparent usability may be contradictory (Fu 1999). For example, in Web page design, graphics enhance apparent usability but slow down the system. Pearrow (2000) also discusses form versus function as the two ends of a continuum. A truly great Web site should be both aesthetically pleasing and truly usable.

In terms of a usability framework, Thomas (1998) categorizes usability attributes into three groups: outcome, process, and task. We may apply Thomas's grouping and categorize usability criteria as follows: the outcome group includes effectiveness, efficiency, and satisfaction; the process group includes ease of use, interface, learnability, memorability, and error recovery; the task group includes functionality and compatibility.

Shackel (1991) views usability as the dynamic interplay of four components: user, task, tool, and environment. This relationship may be illustrated as shown in Figure 1, which is a modified version of Shackel (1991, 23), Bennett (1972; 1979), and Eason (1981).

Usability has a user focus. Dumas and Redish (1993, 4) define usability as meaning that "people who use the product can do so quickly and easily to accomplish their task." Guillemette (1995, 215) refers to usability as "the degree to which an information system can be effectively used by target users in the performance of tasks." Clairmont, Dickstein, and Mills (1999) also state, "Usability is the degree to which a user can successfully learn and use a product to achieve a goal."

Usability is different from functionality. Dumas and Redish (1993) use the videocassette recorder (VCR) as an example to illustrate the difference between the two: VCRs may have high functionality (the feature works as it was designed to work) but low usability (people cannot use them quickly and easily to accomplish their task).

Usability is also not equivalent to accessibility. In a nutshell, accessibility involves making Web site content available to and usable by people with disabilities.

Table 1 compares various perspectives on the attributes of usability.

Usability has several aspects, including interface design, functional design, data and metadata, and computer systems and networks (Arms 2000). Usability is a property of the total system. All the components must work together smoothly to create an effective and easy-to-use digital library.

Usability can be tackled from various directions. Blandford and Buchanan (2002a) suggest that usability is technical, cognitive, social, and design-oriented, and that it is important to bring these different perspectives together, to share views, experiences, and insights. Indeed, digital library development involves interplay between people, organization, and technology. The usability issue should look at the system as a whole.

In order to design systems for a variety of users, Lamb (1995) suggests that usability issues should be extended beyond interface usability to include content usability, organizational usability, and interorganizational usability.

In addition to those views, usability can also be examined from the perspectives of graphic design, navigation, and content (Spool et al. 1999).


Table 1. Attributes of Usability

Booth (1989): usefulness, effectiveness, learnability, attitude
Brinck et al. (2002): functionally correct, efficient to use, easy to learn, easy to remember, error tolerant, and subjectively pleasing
Clairmont et al. (1999): successfully learn and use a product to achieve a goal
Dumas & Redish (1993): perform tasks quickly and easily
Furtado et al. (2003): ease of use and learning
Gluck (1997): usableness, usefulness
Guillemette (1995): effectively used by target users to perform tasks
Hix & Hartson (1993): initial performance, long-term performance, learnability, retainability, advanced feature usage, first impression, and long-term user satisfaction
ISO (1994): effectiveness, efficiency, satisfaction
Kengeri et al. (1999): effectiveness, likeability, learnability, usefulness
Kim (2002): interface effectiveness
Nielsen (1993): learnability, efficiency, memorability, errors, satisfaction
Oulanov & Pajarillo (2002): affect, efficiency, control, helpfulness, adaptability
Shackel (1981): ease of use, effectiveness
Shackel (1986, 1991): effectiveness, learnability, flexibility, user attitude

Turner (2002) categorizes usability into navigation, page design, content, accessibility, media use, interactivity, and consistency. Clausen (1999) advocates that a high-quality Web page should meet the criteria of accuracy, authority, currency, relevant links, browsability, navigation, consistency, use of frames, use of graphics, connectivity, interactivity, user-friendliness, and originality.

There are several approaches to evaluating usability. One view is offered by Eberts (1985), who categorizes the approaches into four: empirical, anthropomorphic, cognitive, and predictive modeling. The empirical approach emphasizes obtaining a representative sample of target users and analyzes user performance and attitudes under specified conditions. The anthropomorphic approach evaluates user-system interaction in comparison to effective interpersonal communication. The cognitive approach applies the theory and methodology of cognitive psychology to interface design. The predictive approach uses an analytic tool which examines or manipulates an abstract representation of the interface in order to forecast usability. Blandford et al. (2004), on the other hand, roughly group usability evaluation into two kinds: empirical and analytical. Empirical techniques involve testing systems with users, whereas analytical techniques involve usability personnel assessing systems using established theories and methods.

In conclusion, usability can be defined as (Pearrow 2000, 12):

Usability is the broad discipline of applying sound scientific observation, measurement, and design principles to the creation and maintenance of Web sites in order to bring about the greatest ease of use, ease of learnability, amount of usefulness, and least amount of discomfort for the humans who have to use the system.

Usability evaluation at academic sites

There are a number of methods to evaluate usability. The techniques include formal usability testing, usability inspection, card sort, category membership expectation, focus groups, questionnaires, think aloud, analysis of site usage logs, cognitive walkthrough, heuristic evaluation, claims analysis, concept-based analysis of surface and structural misfits (CASSM), paper prototyping, and field study (Askin 1998; Blandford et al. 2004; Campbell 2001; Kantner and Rosenbaum 1997; Keith et al. 2003; Nielsen and Mack 1994; Pearrow 2000; Popp 2001; Rosson and Carroll 2002; Snyder 2003). Digital library usability assessments have covered breadth of coverage, navigation, functionality, utility, interface, metadata appropriateness, and awareness of library resources.

Table 2 is a review of usability tests in selected academic digital libraries. More detailed descriptions and discussions will be available in the author's doctoral dissertation.

Although the literature on usability tests in libraries is still small, interest and involvement in testing library Web sites grew exponentially, with an explosion of work done in 1999.


Table 2. Usability Assessment at Academic Sites


The survey of Popp (2001) indicates that there were only two tests in 1996, but 15 in 1997, 36 in 1998, and 82 in 1999. The literature review of this study found that there has been continued interest since 1999. Many usability studies reported in the library literature were published after 2000. It is clear that libraries are rapidly moving forward in evaluating the usability of their Web sites.

Usability evaluation model

This paper proposes an evaluation model for assessing the usability of academic digital libraries. The proposed evaluation model applies the definition of ISO 9241-11 (1994), which examines effectiveness, efficiency, and satisfaction. In addition, the model includes learnability (see Figure 2). The ISO definition (1994, 10) defines usability as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use." The ISO definition, however, does not explicitly specify operational criteria on what to evaluate. Karoulis and Pombortsis (2003) suspect that usability (effectiveness, efficiency, and satisfaction) and the learnability of educational environments are positively correlated and wonder how far one affects the other. Karoulis and Pombortsis never actually carried out a study to examine this possible correlation, nor did they provide operational criteria.

In this study, effectiveness evaluates whether the system as a whole can provide information and functionality effectively and is measured by how many answers are correct. Efficiency likewise evaluates whether the system as a whole can be used to retrieve information efficiently and is measured by 1) how much time it takes to complete tasks and 2) how many steps are required. Satisfaction looks into the areas of ease of use, organization of information, clear labeling, visual appearance, contents, and error correction and is measured by Likert scales and questionnaires. Belkin and Vickery (1985, 195) have stated, "Although the idea behind satisfaction as a criterion is simple, it is obvious that actually measuring it is not, since it must be a multifaceted variable." This study assesses satisfaction from the perspectives of the user's reaction to ease of use, organization of information, clear labeling, visual appearance, contents, and error correction. Ease of use evaluates the user's perception of the ease of use of the system. Organization of information evaluates whether the system's structure, layout, and organization meet the user's satisfaction. Labeling evaluates, from the user's perception, whether the system provides clear labeling and whether the terminology used is easy to understand. Visual appearance evaluates the site's design to see if it is visually attractive. Contents evaluates the authority and accuracy of the information provided. Error tests whether users can recover from mistakes easily and whether they make mistakes easily due to the system's design. Learnability measures learning effort and is examined by asking subjects to cross-search the other test site (Rutgers participants search the Queens site and vice versa). It is then measured by 1) how soon the subjects know how to begin searching, 2) how many questions are answered correctly, and 3) how much time it takes to answer the questions.

Figure 2 is a diagram illustrating this evaluation model. It is suspected that there exists an interlocking relationship among effectiveness, efficiency, and satisfaction. It is also interesting to examine how learnability interacts with these three attributes.

This evaluation model considers both quantifying elements of performance (time, accuracy rate, steps to complete tasks) and subjective criteria (satisfaction). The evaluation approach is empirical.
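As a rough illustration of how these operational criteria can be turned into numbers, the sketch below aggregates per-task records into the four kinds of measures named above. The record structure, field names, and scoring conventions are illustrative assumptions, not instruments from the study.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical per-task record; field names are assumptions for illustration.
@dataclass
class TaskRecord:
    correct: bool        # effectiveness: was the answer correct?
    seconds: float       # efficiency: time spent on the task
    steps: int           # efficiency: keystrokes/clicks taken
    satisfaction: int    # post-task Likert rating, 1 (high) .. 5 (low)

def summarize(tasks: list[TaskRecord]) -> dict:
    """Aggregate one subject-site session into effectiveness, efficiency, and satisfaction measures."""
    completed = [t for t in tasks if t.correct]   # wrong or abandoned tasks are excluded from efficiency
    return {
        "effectiveness_pct": 100 * len(completed) / len(tasks),
        "mean_seconds": mean(t.seconds for t in completed) if completed else None,
        "mean_steps": mean(t.steps for t in completed) if completed else None,
        "mean_satisfaction": mean(t.satisfaction for t in tasks),
    }
```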
Testing of the model and instruments

Two academic library Web sites were selected for this study: the Rutgers University Libraries Web site (http://www.libraries.rutgers.edu) and the Queens College Web site (http://qcpages.qc.edu/Library/). These two sites provide rich resources for their local holdings and connections to electronic resources and digital collections, and are one kind of digital library.

Francisco-Revilla et al. (2001) report that digital libraries are increasingly being defined as ones that collect pointers to WWW-based resources rather than hold the resources themselves. Greenstein (2000) shares this view and says that the digital library is known less for the extent and nature of the collections it owns than for the networked information space it defines through its online services. Paepcke et al. (1996) also state that a digital library provides a single point of access to a wide range of autonomously distributed sources.


Figure 2. A Model of Usability Evaluation for Digital Library

These views can be seen as treating a library's Web site as a portal-type digital library. Covi and Kling (1996, 672) specifically include a library's Web site as one kind of digital library:

Digital libraries include personal, distributed, and centralized collections such as online public access catalogs and bibliographic databases, distributed document databases, scholarly and professional discussion lists and electronic journals, other online databases, forums, and bulletin boards.

The experiments of this study were divided into two stages: stage one was conducted in February/March of 2004 and stage two in September/October of 2004. The Rutgers University Libraries revamped its Web site in September 2004, which in part prompted the interest in repeating the usability testing. The other reason for re-running the usability testing was to confirm the interlocking relationship among effectiveness, efficiency, and satisfaction that was discovered in stage one.


Table 3. Effectiveness of Rutgers Subjects Searched Rutgers Site and Queens Subjects Searched Queens Site

                       Rutgers Web Site                        Queens Web Site
                       Stage 1 (NR1=5)   Stage 2 (NR2=15)      Stage 1 (NQ1=6)   Stage 2 (NQ2=15)
Question 1 Find a book 80% 100% 83% 100%
Question 2 Find a journal 100% 80% 83% 93%
Question 3 Use database 80% 60% 83% 93%
Question 4 Use database 100% 60% 83% 80%
Question 5 Use database 100% 67% 83% 80%
Question 6 Find encyclopedia 40% 53% 33% 40%
Question 7 Find e-book 80% 33% 50% 67%
Question 8 Find information 100% 93% 100% 87%
Question 9 Find information 100% 100% 100% 100%
Average 87% 72% 78% 82%

There were a total of forty-one subjects in the study. The first stage had a total of eleven subjects (five from Rutgers and six from Queens). The second stage had a total of thirty subjects (fifteen from Rutgers and fifteen from Queens). All were graduate and undergraduate students. A convenience sampling method was used for recruitment; convenience sampling is common in usability testing.

A list of nine questions was designed to test the effectiveness, efficiency, and user satisfaction of the system. These nine questions were designed to be representative of typical uses of a library's Web site (see Appendix A). Question 1 is to locate a book. Question 2 is to locate a journal. Questions 3 to 5 are to use databases. Question 6 is to use an e-encyclopedia. Question 7 is to locate an e-book. Questions 8 and 9 are to seek instructions on library services.

There was a pre-test questionnaire that collected demographic data, including gender, age, status (undergraduate, master's, or doctoral student), major, years at the institution, original nationality if coming from a foreign country, and frequency of using the site. After completing a task, the subject was asked to rank satisfaction and to write down comments. There was also a post-test questionnaire that specifically examines satisfaction in the areas of ease of use, organization of information, clear labeling, visual appearance, contents, and error correction (see Appendix B).

Each subject was asked to search two sites, with the sequence alternated. When searching the other institution's site, questions 3–5 were eliminated because of proxy server authorization and to limit the session to a reasonable timeframe. The purpose of searching the other institution's site was to measure learnability.

Results

Effectiveness

Effectiveness in this study is measured by how many answers are correct. Table 3 indicates the effectiveness scores of Rutgers subjects searching the Rutgers site and Queens subjects searching the Queens site, respectively. Questions 6 (find an encyclopedia article) and 7 (find an e-book) gave subjects more difficulty on both the Rutgers and Queens sites. This is consistent across stages one and two.

Efficiency

Efficiency in this study is measured by 1) how much time it takes to complete a task correctly and 2) how many keystrokes/clicks (or steps/movements) are needed. If the answer is wrong, the time spent and the keystrokes/clicks are not counted in the calculation. If a subject decides to give up on a question, the time spent and the steps on that particular question are also not counted. Table 4 indicates the time used by Rutgers subjects to search the Rutgers site and Queens subjects to search the Queens site in stages one and two, respectively.

As indicated in Table 4, question 9 (find information to set up remote access) is the easiest on both the Rutgers and Queens sites. This is symbolic of how important it is that library users be able to remotely access a library's electronic resources easily. Although the instruction is readily available and easy to locate, it is equally important that the instruction is easy to follow.


Table 4. Time Used by Rutgers Subjects to Search Rutgers Site and Queens Subjects to Search Queens Site

                 Rutgers Web Site                        Queens Web Site
                 Stage 1 (NR1=5)   Stage 2 (NR2=15)      Stage 1 (NQ1=6)   Stage 2 (NQ2=15)
Question 1 3 min. 17 sec. 2 min. 12 sec. 1 min. 29 sec. 1 min. 28 sec.
Question 2 1 min. 9 sec. 1 min. 39 sec. 1 min. 5 sec. 1 min. 26 sec.
Question 3 6 min. 11 sec. 3 min. 5 sec. 2 min. 28 sec. 2 min. 38 sec.
Question 4 3 min. 18 sec. 1 min. 33 sec. 1 min. 12 sec. 2 min.
Question 5 6 min. 33 sec. 2 min. 28 sec. 4 min. 39 sec. 2 min. 41 sec.
Question 6 4 min. 12 sec. 2 min. 50 sec. 2 min. 20 sec. 4 min. 33 sec.
Question 7 6 min. 17 sec. 5 min. 25 sec. 1 min. 36 sec. 2 min. 11 sec.
Question 8 2 min. 4 sec. 46 sec. 1 min. 8 sec. 1 min. 9 sec.
Question 9 1 min. 23 sec. 29 sec. 1 min. 10 sec. 46 sec.
Average 3 min. 49 sec. 2 min. 15 sec. 1 min. 54 sec. 2 min. 1 sec.

Table 5. Number of Steps by Rutgers Subjects to Search Rutgers Site and Queens Subjects to Search Queens Site

                          Rutgers Web Site                        Queens Web Site
                          Stage 1 (NR1=5)   Stage 2 (NR2=15)      Stage 1 (NQ1=6)   Stage 2 (NQ2=15)
Question 1 Find a book 9 9 7 7
Question 2 Find a journal 5 7 6 6
Question 3 Use database 20 10 8 10
Question 4 Use database 9 6 7 9
Question 5 Use database 19 8 17 11
Question 6 Find encyclopedia 15 12 11 18
Question 7 Find e-book 27 20 7 9
Question 8 Find information 6 3 2 3
Question 9 Find information 4 2 2 2
Average 13 9 7 8

This study does not evaluate whether the instruction is clear and easy to follow.

In addition to using time as an indicator to assess efficiency, the number of movements/steps is also used in this study. This is counted by the number of keystrokes or clicks. A string of characters typed when performing a title or author search is counted as one step. Each press of the Enter key or each click of the mouse is also counted as one step. Table 5 indicates the numbers of steps taken by Rutgers and Queens subjects searching their own sites.
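The step-counting rule described above can be expressed as a small function. The event-log vocabulary ("type", "enter", "click") is an assumed encoding for illustration, not part of the study's instruments.

```python
# Count steps from a session log: a whole typed string is one step,
# and each Enter press or mouse click is one step.
def count_steps(events: list[str]) -> int:
    steps = 0
    typing = False
    for event in events:
        if event == "type":
            if not typing:      # start of a typed string counts once
                steps += 1
                typing = True
        else:                   # "enter" or "click"
            steps += 1
            typing = False
    return steps

# e.g. type a title, press Enter, click a result, click the record link -> 4 steps
print(count_steps(["type", "type", "enter", "click", "click"]))  # -> 4
```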
As indicated in Table 5, question 7 (find an e-book) took subjects more steps to answer on the Rutgers site. Three participants suggested that Rutgers should make its e-book collections more noticeable on its Web page. It is hard to dig out the e-book collections in the clutter of the Rutgers site.

The use of electronic databases to find articles is not an easy task for participants. Four participants commented that it is necessary to know the databases in order to select which one to use from the long list. Two participants said that they had difficulty in combining terms for question 3 (find an article about nursing homes and mental illness). Two participants suggested metasearching, which allows users to enter search strings to search all available databases so that they do not need to learn various interfaces and commands.

Relationships among effectiveness, efficiency, and satisfaction

One of the research goals of this study is to examine the relationships among effectiveness, efficiency, and satisfaction. This section reports the results of stages one and two respectively.

Effectiveness and satisfaction

Effectiveness in this study is measured by whether the subjects answer the questions correctly. Two one-way analyses of variance were conducted to evaluate the relationships between correctness of answers and satisfaction for stages one and two respectively.


Table 6. Differences of Satisfaction Ranking between Correct or Incorrect Answers (1=easy to use/high satisfaction, 5=difficult to use/low satisfaction)

                        Satisfaction (Stage 1, N=11)    Satisfaction (Stage 2, N=30)
Correctness of Answer   M       SD                      M       SD
Wrong Answer            4.00    1.17                    3.93    1.33
Correct Answer          2.20    1.16                    2.09    1.20

The dependent variable was satisfaction (1=easy to use/high satisfaction to 5=difficult to use/low satisfaction) and the independent variable was correctness of answers (0=wrong answer and 1=correct answer).

The ANOVA for stage one was significant, F (1, 163) = 57.57, p < .001. The strength of the relationship between satisfaction and correctness of answers, as assessed by η², was strong, with the correctness-of-answer factor accounting for 26% of the variance of the dependent variable. Post-hoc tests were not performed because there were fewer than three factors.

The ANOVA for stage two was also significant, F (1, 445) = 185.48, p < .001. Because the p-value was less than .05, the null hypothesis that there were no differences between the groups was rejected. The η² of .29 indicates a strong relationship between satisfaction and correctness of answers. Post-hoc tests were not performed because there were fewer than three factors.

Based on the findings of the two ANOVA analyses that there were significant differences in the means of satisfaction between right and wrong answers, an interpretation may be made that subjects felt less satisfied with the system when they failed to perform the task correctly. Table 6 indicates the means of satisfaction ranking with regard to correctness of answers in both stages. The rankings are similar in both stages.
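The following sketch illustrates the kind of analysis reported here: a one-way ANOVA of per-task satisfaction ratings grouped by answer correctness, with η² computed as the between-group share of total variance. It assumes SciPy and uses placeholder data rather than the study's scores.

```python
import numpy as np
from scipy import stats

# Placeholder satisfaction ratings, grouped by answer correctness (1 = high, 5 = low).
sat_wrong = np.array([4, 5, 3, 4, 5], dtype=float)    # tasks answered incorrectly
sat_correct = np.array([2, 1, 3, 2, 2], dtype=float)  # tasks answered correctly

# One-way ANOVA across the two groups.
f_stat, p_value = stats.f_oneway(sat_wrong, sat_correct)

# Eta squared = SS_between / SS_total (proportion of variance explained by correctness).
all_scores = np.concatenate([sat_wrong, sat_correct])
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in (sat_wrong, sat_correct))
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {f_stat:.2f}, p = {p_value:.4f}, eta^2 = {eta_squared:.2f}")
```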
Efficiency and satisfaction

Correlation coefficients were computed among time spent on answering questions, number of steps, and satisfaction. Using the Bonferroni approach to control for Type I error across the three correlations, a p-value of less than .017 was required for significance.

The correlation between time spent on answering questions and the number of steps for stage one was significant, r (134) = .78, p < .001; for stage two it was also significant, r (445) = .77, p < .001. This means the longer the time a subject spent on answering a question, the more steps were involved. The effect sizes were large for both stages.

The correlation between the number of steps and satisfaction for stage one was significant, r (134) = .51, p < .001; for stage two it was also significant, r (445) = .57, p < .001. This means the more steps taken to answer a question, the lower the satisfaction. The effect sizes for both stages were large.

The correlation between the time spent and satisfaction for stage one was significant, r (134) = .46, p < .001; for stage two it was also significant, r (445) = .61, p < .001. This means the more time spent on answering a question, the lower the satisfaction. The effect size for stage one was medium; for stage two it was large.

Based on those statistical results, a conclusion is drawn that there is a correlation between efficiency and satisfaction. Users feel less satisfied when the task takes more steps to get to the answer.
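A minimal sketch of this correlation analysis, assuming SciPy and placeholder data: Pearson r is computed for each pair of measures and compared against the Bonferroni-adjusted threshold of .05/3 ≈ .017.

```python
import numpy as np
from itertools import combinations
from scipy.stats import pearsonr

# Placeholder per-task measures; real analyses would use one row per search.
measures = {
    "seconds": np.array([95, 180, 60, 240, 130, 75], dtype=float),
    "steps": np.array([7, 14, 5, 20, 9, 6], dtype=float),
    "satisfaction": np.array([2, 4, 1, 5, 3, 2], dtype=float),
}
alpha = 0.05 / 3  # Bonferroni correction for three correlations

for (name_a, a), (name_b, b) in combinations(measures.items(), 2):
    r, p = pearsonr(a, b)
    flag = "significant" if p < alpha else "not significant"
    print(f"r({len(a) - 2}) between {name_a} and {name_b} = {r:.2f}, p = {p:.3f} ({flag})")
```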
Effectiveness and efficiency

Effectiveness and steps

Two ANOVAs were conducted, for stages one and two respectively, to evaluate whether the group means of steps for correct answers and incorrect answers differ significantly from each other. The independent variable, effectiveness, included two levels: correct and incorrect answers. The dependent variable was the number of steps.

The ANOVA for stage one was significant, F (1, 163) = 29.19, p < .001. The strength of the relationship between effectiveness and steps, as assessed by η², was strong, with the effectiveness factor accounting for 15% of the variance of the dependent variable.

The ANOVA for stage two was also significant, F (1, 445) = 82.84, p < .001. The strength of the relationship between effectiveness and steps, as assessed by η², was strong, with the effectiveness factor accounting for 16% of the variance of the dependent variable.


Table 7. Effectiveness and Steps

                        Stage 1 (N=11)      Stage 2 (N=30)
Correctness of Answer   M       SD          M       SD
Wrong Answer            16.97   1.47        13.39   9.29
Correct Answer          8.18    0.71        6.74    5.55

Table 8. Effectiveness and Time

                        Stage 1 (N=11)                      Stage 2 (N=30)
Correctness of Answer   M               SD                  M               SD
Wrong Answer            4 min. 12 sec.  3 min. 53 sec.      2 min. 25 sec.  25 sec.
Correct Answer          2 min. 23 sec.  1 min. 48 sec.      1 min. 36 sec.  12 sec.

Figure 3. Significant Differences in Steps Related to Correctness of Answers

Figure 4. Significant Differences in Time Related to Correctness of Answers

As indicated in Table 7 and Figure 3, incorrect answers involve more steps while correct answers involve fewer steps. When subjects know how to get the answer, it takes them fewer steps; when they do not know how to find the information, they struggle.

Effectiveness and time

Two ANOVAs were conducted to evaluate whether the group means of time for correct answers and incorrect answers differ significantly from each other. The independent variable, the effectiveness factor, included two levels: correct answer and incorrect answer. The dependent variable was the time spent on completing tasks.

The ANOVA for stage one was significant, F (1, 163) = 15.70, p < .001. The strength of the relationship between effectiveness and time, as assessed by η², was medium, with the effectiveness factor accounting for 9% of the variance of the dependent variable.

The ANOVA for stage two was also significant, F (1, 445) = 107.44, p < .001. The strength of the relationship between effectiveness and time, as assessed by η², was strong, with the effectiveness factor accounting for 20% of the variance of the dependent variable.

Table 8 and Figure 4 indicate the means of time for correct and incorrect answers for stages one and two.

Based on the analyses from one-way ANOVAs and correlation coefficients, it is found that there exist interlocking relationships among effectiveness, efficiency, and satisfaction. Figure 2 indicates these relationships.

Learnability

Learnability is in some sense the most fundamental usability attribute (Nielsen 1993). The system should be easy to learn so that the user can rapidly start getting some work done with it. As Dzida et al. (1978) reported, learnability is especially important for novice users.

Learnability in this study examines the learning effort of using a new Web site and measures how easy it is for new visitors to orient themselves and get a good overview of what the site offers. Learnability is inferred from the amount of time required to achieve user performance standards and whether subjects can perform the tasks correctly.


Table 9. Learnability of Both Sites in Stage One

                            Rutgers Web Site              Queens Web Site
                            Correctness   Speed           Correctness   Speed
Rutgers Subjects NR1=5 83% 3 min. 4 sec. 90% 2 min. 20 sec.
Queens Subjects NQ1=6 75% 2 min. 75% 1 min. 42 sec.

Table 10. Learnability of Both Sites in Stage Two

                            Rutgers Web Site              Queens Web Site
                            Correctness   Speed           Correctness   Speed
Rutgers Subjects NR2=15 77% 2 min. 14 sec. 71% 2 min. 14 sec.
Queens Subjects NQ2=15 67% 2 min. 18 sec. 81% 1 min. 56 sec.

The design of this study asked Rutgers subjects to search the Queens site and Queens subjects to search the Rutgers site. It was found that all subjects could begin searching without difficulty and began to perform searches almost immediately. The experimenter counted the time subjects took before their first click and found that they started almost immediately.

Learnability is also examined by how many questions subjects can answer correctly and how much time it takes on the new Web site. This is to determine whether they can perform tasks at a proficient level. As indicated in Table 9, it seems that both Rutgers and Queens subjects were able to complete tasks correctly and faster on the Queens site. However, this finding is not confirmed in stage two. Table 10 indicates that subjects feel more comfortable using their home institution's site than the new site.
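A small sketch of how the learnability indicators described above could be summarized for the cross-site sessions; the record format and all values are hypothetical.

```python
from statistics import mean

# Hypothetical cross-site sessions (subjects searching the other institution's site):
# (seconds to first click, percent of questions correct, mean seconds per task)
cross_site_sessions = [
    (4, 83, 184),
    (6, 75, 120),
    (3, 67, 138),
]

print("mean time to first click:", round(mean(s[0] for s in cross_site_sessions), 1), "sec")
print("mean correctness:", round(mean(s[1] for s in cross_site_sessions), 1), "%")
print("mean time per task:", round(mean(s[2] for s in cross_site_sessions), 1), "sec")
```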
Satisfaction

Satisfaction is a multi-dimensional construct. This study applies a Likert scale to measure its direction and intensity (1=easy to use/high satisfaction, 5=difficult to use/low satisfaction). Factors of satisfaction are carefully identified as shown in the model (see Figure 2). In addition to the Likert scales used after each task and in the post-test questionnaire, participants' comments were solicited. There were also open-ended questions. In addition to overall satisfaction and reaction, satisfaction was further examined in the areas of ease of use, organization of information, terminology, labeling, visual appearance, content, error correction, best features, and worst features. The use of a Likert scale is known as an economical way of measuring user satisfaction (Pearson 1977).

Overall satisfaction

The scores for calculating overall satisfaction are gathered from the Likert scales completed after each task. As indicated in Table 11, the overall satisfaction ratings by all subjects are very close between the two sites in both stages one and two. The satisfaction ratings for question 1 (locate a book) on the Rutgers site are worse than the corresponding ratings on the Queens site in both stages. Thirteen subjects (32 percent) commented that a simple search by title on the Rutgers catalog resulted in too many hits when it was supposed to be a concise search. The Rutgers catalog provides two options for searching: "contains" and "begins with." A title search for Gone with the Wind using "contains" results in 42 hits, while the same search string using "begins with" results in 12 hits. The default is set to "contains" by the system. Most users did not realize the difference and did not change it. One subject said, "I typed Gone with the Wind without the author's name in the search for title and it wasn't the first search result, which I thought it should have been." Another subject said, "I don't like it when a lot of unrelated things show up."

Ease of use

Table 12 indicates the ratings of ease of use by Rutgers and Queens subjects in both stages one and two. The Queens site was rated easier to use than the Rutgers site by both Rutgers and Queens subjects in stage one.


Table 11. Satisfaction Rating by Question (1=easy to use/high satisfaction, 5=difficult to use/low satisfaction)

                          Rutgers site                       Queens site
                          Stage 1 (N=11)   Stage 2 (N=30)    Stage 1 (N=11)   Stage 2 (N=30)
Question 1 Find a book 3.2 2.7 1.9 1.7
Question 2 Find a journal 1.9 2.2 2.3 2.5
Question 3 Use database 3.4 2.7 1.8 2.5
Question 4 Use database 2.2 1.9 1.7 2.4
Question 5 Use database 2.8 2.5 3.2 2.6
Question 6 Find encyclopedia 2.7 3.4 3.7 4.3
Question 7 Find e-book 4.1 3.7 2.9 3.2
Question 8 Find information 2.2 1.9 2.1 2.2
Question 9 Find information 2.0 1.3 1.3 1.6
Overall 2.6 2.5 2.5 2.6

However, in stage two, subjects rated their own institution's site easier to use. It is worth noting that the Rutgers site receives a better rating in stage two, after the revamp of its Web site appearance. It is found that subjects evaluate "ease of use" from the perspectives of "easy to get around," "can follow directions easily," "easy navigation," "clear description," "intuitive," and "user-friendly."

Organization of information

Table 13 indicates the ratings of organization of information. The Queens site received better ratings from both Rutgers and Queens subjects in stage one. However, the Rutgers site, after the redesign of its Web site, receives better ratings in stage two from both Rutgers and Queens subjects. It is found that subjects evaluate "organization" from the perspectives of "simple," "straightforward," "logical," "easy to look up things," and "placing common tasks upfront."

Terminology

Table 14 indicates how subjects rate the terminology used on both sites and whether categories are clearly labeled. The Rutgers site receives better ratings than the Queens site in both stages one and two and from both Rutgers and Queens subjects. It is found that subjects evaluate "terminology" from the perspectives of "simple," "straightforward," "understandable," "generic," "label sections clearly," "no jargon," "clear descriptions/explanations," and "from user's perspective." It is also found that thirteen subjects (31 percent) expressed various degrees of problems with the terminology used on both the Rutgers and Queens sites. The concerns are centered on:

• the library site assumes that users have common sense of library terms
• users do not know the difference between index and database
• the terms are library-oriented, such as articles, encyclopedia, serial, periodicals, and index
• better descriptions/explanations of library jargon are needed

Attractiveness

Table 15 indicates how subjects rate the degree of visual attractiveness of both sites. It is interesting to find that subjects give their home institution's site a better rating, in both stages one and two. It is found that subjects evaluate "attractiveness" from the perspectives of "appropriate graphics," "readability," "appropriate color," "not too complicated," and "appropriate size of font."

Mistake recovery

Table 16 indicates how subjects rate both sites' features on the ease of recovering from mistakes. The Rutgers site receives a better rating from both Rutgers and Queens subjects in stage one. However, in stage two, subjects rated their own site easier to recover from mistakes.


Table 12. Ease of Use (1=easy to use, 5=difficult to use)

Ease of use          Rutgers site            Queens site
                     Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     2.6       2.5           2.0       3.1
Queens subjects      3.0       2.7           2.0       2.4

Table 13. Organization of Information (1=clear, 5=unclear)

Organization of      Rutgers site            Queens site
Information          Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     3.0       1.9           2.2       3.1
Queens subjects      2.6       2.1           1.6       2.6

Table 14. Terminology (1=clear, 5=unclear)

Terminology          Rutgers site            Queens site
                     Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     3.0       2.2           3.4       2.9
Queens subjects      2.4       1.7           2.4       2.1

Table 15. Attractiveness (1=attractive, 5=unattractive)

Attractiveness       Rutgers site            Queens site
                     Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     2.6       2.1           3.2       2.6
Queens subjects      3.0       2.8           2.0       2.3

Table 16. Mistake Recovery (1=easy, 5=difficult)

Mistake Recovery     Rutgers site            Queens site
                     Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     2.2       1.9           2.6       3.1
Queens subjects      2.0       2.4           2.7       1.9

Table 17. Overall Reaction (1=satisfied, 5=unsatisfied)

Overall Reaction     Rutgers site            Queens site
                     Stage 1   Stage 2       Stage 1   Stage 2
Rutgers subjects     2.4       2.1           2.2       3.2
Queens subjects      3.2       2.2           1.8       2.4

Table 18. User Lostness (N2=30)

                     Rutgers Site   Queens Site
Rutgers Subjects     6              11
Queens Subjects      8              6
Total                14             17

Table 19. Navigation (N2=30)

                     Rutgers site   Queens site
Rutgers subjects     11             7
Queens subjects      13             12
Total                24             19

Table 20. Click Cost (N2=30)

                     Rutgers site   Queens site
Rutgers subjects     10             11
Queens subjects      12             11
Total                22             22

It is found that subjects evaluate "mistake recovery" from the perspective of "easy navigation." It is helpful to have a navigation bar and a Back button. It is also important to be able to start over. A "Help" section is also suggested.

Overall reaction

The scores for calculating overall reaction are from the Likert scales in the post-test questionnaires. This gives participants another opportunity to rate the test sites and provide an overall reaction after examining specific areas such as ease of use, organization of information, terminology, attractiveness, and mistake recovery. Table 17 indicates the overall reaction to both the Rutgers and Queens sites. It turns out that subjects are more pleased with the Queens site in stage one, but are more pleased with the Rutgers site in stage two, after Rutgers revamped its site.

Specific comments on improvements for both the Rutgers and Queens sites, their best features, worst features, and desired features were also received.

Lostness

The user lostness issue was examined in stage two of this study. It was found that fourteen subjects (46 percent) felt lost on the Rutgers site while seventeen subjects (57 percent) felt lost on the Queens site (see Table 18). An examination of Rutgers and Queens subjects separately indicates that subjects are lost more frequently on a new site than on their own institution's site.


It is found that subjects felt lost for reasons of site design, navigation, tasks, lack of confidence, and mistake recovery.

Navigation

Navigation disorientation is found by Brinck, Gergle, and Wood (2002) to be among the biggest frustrations for Web users. The navigation issue was examined in stage two of this study. Twenty-four subjects (71 percent) said the Rutgers site is easy to navigate while nineteen subjects (56 percent) said the Queens site is easy to navigate. Table 19 indicates a further breakdown by Rutgers and Queens subjects.

Subjects' comments regarding navigation indicate that links should be stable, self-explanatory, consistent, and easy to recover from mistakes. Several participants commented that the Queens site's drop-down menu is over-sensitive and the links are "ghosting" (i.e., they disappear when the mouse moves).

Click Cost

"Click cost" is defined by McGillis and Toms (2001) as meaning that "users are very reluctant to click unless they are fairly certain they will discover what they are looking for." The click cost issue was first examined in stage two of this study. Forty-four responses (73 percent) indicated that subjects expect the click(s) will eventually lead them to the correct answer. Table 20 is a breakdown by Rutgers and Queens subjects.

As one subject indicated, "a click was supposed to imply what you could expect." Another subject said, "I expect, but it doesn't always." One said, "I just hoped for the best." Users of library Web sites come to the site to look up information. They want to be able to get to the answer easily and rapidly. Each click should get them closer to the information.

Demographic factors on performance

This study examines the relationships between demographic factors, such as gender, age, status, major, ethnic background, years at the institution, and frequency of using the library Web site, and effectiveness. A series of ANOVAs and Pearson Product-Moment Correlation Coefficients were conducted. It is found that there do not exist statistically significant relationships between those demographic factors and effectiveness.

Discussion

Number of subjects

How many subjects are needed in a study of this kind? This study recruited forty-one subjects, including twenty from Rutgers University and twenty-one from Queens College. Each subject was asked to perform a total of fifteen tasks (nine on their institution's site and six on the other site). Are these enough both to evaluate usability and to examine the relationships among effectiveness, efficiency, and satisfaction?

Nielsen and Molich (1990) found that not quite half of all major usability problems were detected with three participants. Nielsen (1993) then found that using six participants allowed the evaluator to identify the majority of important problems. Virzi (1992) found that 80 percent of the usability problems were detected with five participants and 90 percent were detected with ten participants. The same findings were confirmed by Rubin (1994). Dumas and Redish (1993) recommend six to twelve participants for a typical usability test. They said that additional participants are less and less likely to reveal new information. Nielsen (2000) strongly advocates using five users in formal usability testing. He states, "Elaborate usability tests are a waste of resources. The best results come from testing no more than five users and running as many small tests as you can afford." This study includes forty-one subjects on the grounds that it is not only a usability test but also a research project to study the relationships among attributes. It needs more subjects to ensure sufficient statistical power. Each subject performed fifteen tasks. The unit of analysis is the search. Therefore there are a total of 615 sets of scores of effectiveness, efficiency, and satisfaction. These should provide enough statistical power to analyze the relationships among effectiveness, efficiency, and satisfaction.

Another reason for using a large number of subjects is to help eliminate the investigator's judgment call as to whether a problem discovered in the usability test is user-specific or a true usability problem. A small sample size makes this difficult to determine.


Interlocking relationships among effectiveness, efficiency, and satisfaction

The study was divided into two phases: phase one included eleven subjects and was conducted in February/March of 2004; phase two had thirty subjects and was conducted in September/October 2004, after Rutgers revamped its Web page. Statistical analyses were performed separately. The first phase has a total of 165 sets of scores; the second phase has a total of 450 sets of scores. These should provide sufficient statistical power to analyze the relationships among effectiveness, efficiency, and satisfaction. It is found in both phases that these relationships exist. Because the relationships are confirmed in both stages respectively, the investigator of this study does not perform a combined analysis using the whole 615 sets of scores.

The study of Frøkjær, Hertzum, and Hornbæk (2000) found effectiveness and efficiency to be only weakly correlated. The study of Walker et al. (1998) found that user satisfaction is not determined by efficiency. This study found the strength of the relationship between effectiveness and steps was strong (η² = .15 in stage one; η² = .16 in stage two), and the strength of the relationship between effectiveness and time was medium (η² = .09) in stage one and strong (η² = .20) in stage two. This study also found the strength of the relationship between effectiveness and satisfaction was strong in both stages (η² = .26 in stage one; η² = .29 in stage two). However, this study also recognizes that effectiveness, efficiency, and satisfaction are three separate criteria. Each has its specific emphasis and should be measured separately.

Conclusion

The goals of this study were to develop a model for evaluating the usability of academic digital libraries; use the model for the development of measures, instruments, and methods for evaluating academic digital libraries; test the measures, instruments, and methods by applying them in two academic digital libraries; generalize the model, instruments, and methods for use in academic libraries; and study the relationships among effectiveness, efficiency, and satisfaction. Those research objectives were achieved. A model and a suite of instruments were designed and successfully tested. The relationships among effectiveness, efficiency, and satisfaction were found. In addition, the user lostness and click cost issues were studied, and users' criteria on usability attributes such as "ease of use," "organization of information," "terminology," "attractiveness," and "mistake recovery" were found. Users' comments on improvements were especially helpful in understanding what digital library users care about the most.

A digital library is an information system over a network that is organized and well managed and supports the creation, use, and searching of digital objects. A digital library should be looked at as a tool that supports the user's information task. Users are looking for an information system that is easy and intuitive to use. This study applies a user-centered formal usability testing technique to measure usability from the perspectives of effectiveness, efficiency, satisfaction, and learnability, which considers both performance elements and satisfaction.

The evaluation model proposed in this study considers both quantifying elements, such as time, accuracy rate, and steps to complete tasks, and a subjective criterion, satisfaction, which is further examined in the areas of ease of use, organization of information, labeling, visual appearance, content, and error correction. Although the model is tested on two academic libraries in this study, it may be generalized to other digital libraries or information systems.

The contributions of this study are: 1) an evaluation model for digital libraries; 2) benchmarks of performance; 3) the finding of interlocking relationships among effectiveness, efficiency, and satisfaction; 4) operational criteria and a strategy to measure effectiveness, efficiency, satisfaction, and learnability; 5) users' criteria regarding ease of use, organization of information, labeling, visual appearance, content, and error correction; 6) causes of user lostness; 7) what contributes to easy navigation; 8) the finding of click cost; 9) the finding that demographic factors such as gender, age, status, major, and ethnic background do not have statistical significance on performance; 10) a review of how usability has been and should be defined in the context of the digital library; and 11) what usability evaluation methods have been applied in academic digital libraries.


References

Adams, A., and A. Blandford. 2002. Acceptability of medical digital libraries. Health Informatics Journal 8 (2): 58–66.
Allen, Maryellen. 2002. A case study of the usability testing of the University of South Florida's virtual library interface design. Online Information Review 26 (1): 40–53.
Arms, William Y. 2000. Digital libraries. Cambridge, Mass.: MIT Press.
Askin, A. Y. 1998. Effectiveness of usability evaluation methods as a function of users' learning stages. Master's thesis, Purdue University.
Augustine, Susan, and Courtney Greene. 2002. Discovering how students search a library Web site: A usability case study. College & Research Libraries 63 (4): 354–65.
Belkin, Nicholas J., and Alina Vickery. 1985. Interaction in information systems: A review of research from document retrieval to knowledge-based systems. Wetherby, West Yorkshire, Great Britain: British Library.
Benjes, Candice, and Janis F. Brown. 2001. Test, revise, retest: Usability testing and library Web sites. Internet Reference Services Quarterly 5 (4): 37–54.
Bennett, J. L. 1972. The user interface in interactive systems. Annual Review of Information Science and Technology 7: 159–96.
Bennett, J. L. 1979. The commercial impact of usability in interactive systems. In Man-Computer Communication, Infotech State-of-the-Art, Vol. 2, ed. B. Shackel, 1–17. Maidenhead: Infotech International.
Bishop, Ann Peterson. 2001. Logins and bailouts: Measuring access, use, and success in digital libraries. The Journal of Electronic Publishing 4 (2). http://www.press.umich.edu/jep/04-02/bishop.html (accessed January 23, 2002).
Blandford, Ann, and George Buchanan. 2002a. Usability for digital libraries. Proceedings of the Second ACM/IEEE-CS Joint Conference on Digital Libraries, 424. New York: ACM Press.
Blandford, Ann, and George Buchanan. 2002b. Workshop report: Usability of digital libraries @ JCDL'02. http://www.uclic.ucl.ac.uk/annb/DLUsability/SIGIR.pdf (accessed June 11, 2003).
Blandford, Ann, and George Buchanan. 2003. Usability of digital libraries: A source of creative tensions with technical developments. TCDL Bulletin. http://www.ieee-tcdl.org/Bulletin/summer2003/blandford/blandford.html (accessed October 30, 2004).
Blandford, Ann, George Buchanan, and Matt Jones. 2004. Usability of digital libraries. International Journal on Digital Libraries 4 (2): 69–70.
Blandford, Ann, Suzette Keith, Iain Connell, and Helen Edwards. 2004. Analytical usability evaluation for digital libraries: A case study. Proceedings of the Fourth ACM/IEEE Joint Conference on Digital Libraries, 27–36. New York: ACM Press.
Blandford, Ann, Hanna Stelmaszewska, and Nick Bryan-Kinns. 2001. Use of multiple digital libraries: A case study. Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, 179–188. New York: ACM Press.
Booth, Paul. 1989. An introduction to human-computer interaction. London: Lawrence Erlbaum Associates.
Borgman, Christine L., Anne J. Gilliland-Swetland, Gregory H. Leazer, Richard Mayer, David Gwynn, Rich Gazan, and Patricia Mautone. 2000. Evaluating digital libraries for teaching and learning in undergraduate education: A case study of the Alexandria Digital Earth ProtoType (ADEPT). Library Trends 49 (2): 228–50.
Brinck, Tom, Darren Gergle, and Scott D. Wood. 2002. Designing Web sites that work: Usability for the Web. San Francisco: Morgan Kaufmann.
Campbell, Nicole. 2001. Usability assessment of library-related Web sites: Methods and case studies. Chicago: LITA, American Library Association.
Chisman, Janet, Karen Diller, and Sharon Walbridge. 1999. Usability testing: A case study. College & Research Libraries 60 (6): 552–69.
Clairmont, Michelle, Ruth Dickstein, and Vicki Mills. 1999. Testing of usability in the design of a new information gateway. http://www.library.arizona.edu/library/teams/access9798 (accessed February 14, 2003).
Clark, Jason A. 2004. A usability study of the Belgian-American Research Collection: Measuring the functionality of a digital library. OCLC Systems & Services: International Digital Library Perspectives 20 (3): 115–27.
Clausen, Helge. 1999. Evaluation of library Web sites: The Danish case. The Electronic Library 17 (2): 83–87.
Covi, Lisa, and Rob Kling. 1996. Organizational dimensions of effective digital library use: Closed rational and open natural systems models. Journal of the American Society for Information Science 47 (9): 672–89.
Davis, F. 1989. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly 13: 319–339.
Davis, F. D., R. P. Bagozzi, and P. R. Warshaw. 1989. User acceptance of computer technology: A comparison of two theoretical models. Management Science 35: 982–1003.
Dickstein, Ruth, and Vicki Mills. 2000. Usability testing at the University of Arizona library: How to let the users in on the design. Information Technology and Libraries 19 (3): 144–51.
Dorward, Jim, Derek Reinke, and Mimi Recker. 2002. An evaluation model for a digital library services tool. Proceedings of the Second ACM/IEEE-CS Joint Conference on Digital Libraries, 322–323. New York: ACM Press.
Dumas, Joseph S., and Janice C. Redish. 1993. A practical guide to usability testing. Norwood, NJ: Ablex Publishing Co.
Hammill, Sarah J. 2003. Usability testing at Florida International University Libraries: What we learned. Electronic Journal of Academic and Special Librarianship 4 (1).
Hartson, H. Rex, Priya Shivakumar, and Manuel A.
Dzida, W., S. Herda, and W. D. Itzfeldt. 1978. User- Pérez-Quinones. 2004. Usability inspection of digital
perceived quality of interactive systems. IEEE Trans- libraries: A case study. International Journal on Digital
actions on Software Engineering 4 (4): 270–76. Libraries 4 (2): 108–23.
Eason, K. D. 1981. A task-tool analysis of the manager- Hennig, Nicole. 1999. Web site usability test. http://
computer interaction. In Man-Computer Interac- macfadden.mit.edu:9500/webgroup/usability/
tion, ed. B. Shackel. Amsterdam: Sijthoff and Noord- results/index.html (accessed February 26, 2003).
hoff. Hessel, Heather, and Tom Burton-West. 2003. Tools for
Eberts, R. E. 1985. Human-computer interaction. In Hu- improving WebVoyage usability: Search log analysis and
man Factors Psychology, ed. P. A. Hancock, 249–304. usability testing. http://library.pub.getty.edu:8100/
Amsterdam: Elsevier. vugm03.html (accessed February 2, 2005).
France, Robert K., Lucy Terry Nowell, Edward A. Fox, Hix, Deborah, and H. Rex Hartson. 1993. Developing
Rani A. Saad, and Jianxin Zhao. 1999. Use and us- user interfaces: Ensuring usability through product &
ability in a digital library search system. http:// process. New York: John Wiley.
www.dlib.vt.edu/Papers/Use_usability.PDF (ac- International Standards Organization. 1994. Ergonomic
cessed March 15, 2003). requirements for office work with visual display terminals.
Francisco-Revilla, Luis, Frank Shipman, Richard Fruta, Part 11: Guidance on usability (ISO DIS 9241-11).
Unmil Karadkar, and Avital Arora. 2001. Managing London: International Standards Organization.
change on the Web. Proceedings of the First ACM/ Kantner, Laurie, and Stephanie Rosenbaum. 1997. Us-
IEEE-CS Joint Conference on Digital Libraries, 67–76. ability studies of www sites: Heuristic evaluation
New York: ACM Press. vs. laboratory testing. Proceedings of the 15th Annual
Frøkjær, Erik, Morten Hertzum, and Kasper Hornbæk. International Conference on Computer Documentation,
2000. Measuring usability: Are effectiveness, ef- 153–160. New York: ACM Press.
ficiency, and satisfaction really correlated? Proceed- Karoulis, Aaroulis, and Andreas Pombortsis. 2003.
ings of the CHI2000 Conference on Human Factors in Heuristic evaluation of Web-based ODL programs.
Computing Systems, 345–52. New York: ACM Press. In Usability evaluation of online learning programs, ed.
Fu, Limin Paul. 1999. Usability evaluation of Web page Claude Ghaoui, 88–109. Hershey, PA: Information
design. PhD diss., Purdue University. Science Publishing.
Furtado, Elizabeth, João José Vasco Furtado, Fernando Keith, Suzette, Ann Blandford, Bob Fields, and Yin Leng
Lincoln Mattos, and Jean Vanderdonckt. 2003. Im- Theng. 2003. An investigation into the application of
proving usability of an online learning system by claims analysis to evaluate usability of a digital library
means of multimedia, collaboration and adaptation interface. Paper presented at the usability workshop
resources. In Usability evaluation of online learning of JCDL 2002. http://www.uclic.ucl.ac.uk/annb/
programs, ed. Claude Ghaoui, 69–86. Hershey, PA: DLUsability/Keith15.pdf (accessed June 13, 2003).
Information Science Publishing. Kengeri, Rekha, Cheryl D. Seals, Hope D. Harley, Hi-
Gluck, Myke. 1997. A descriptive study of the us- mabindu P. Reddy, and Edward A. Fox. 1999. Us-
ability of geospatial metadata. Annual Review of ability study of digital libraries: ACM, IEEE-CS,
OCLC Research. http://www.oclc.org/research/ NCSTRL, NDLTD. International Journal on Digital
publications/arr/1997/gluck/gluck_frameset.htm Libraries 2: 157–69.
(accessed February 11, 2002). Kim, Kyunghye. 2002. A model of digital library in-
Gould, J. D. 1988. How to design usable systems. In formation seeking process (DLISP model) as a frame
Handbook of Human Computer Interaction, ed. Martin for classifying usability problems. PhD diss., Rutgers
Helander, 757–89. New York: Elsevier. University.
Greenstein, Daniel. 2000. Digital libraries and their Krueger, Janice, Ron L. Ray, and Lorrie Knight. 2004.
challenges. Library Trends 49 (2): 290–303. Applying Web usability techniques to assess student
Grudin, J. 1992. Utility and usability: Research issues awareness of library Web resources. Journal of Aca-
and development contexts. Interacting with Computers demic Librarianship 30 (4): 285–93.
4 (2): 209–17. Kurosu, Masaaki, and Kaori Kashimura. 1995. Apparent
Guillemette, Ronald A. 1995. The evaluation of usability usability vs. inherent usability: Experimental analy-
in interactive information systems. In Human factors sis on the determinants of the apparent usability.
in information systems: Emerging theoretical bases, ed. Conference on Human Factors and Computing Systems,
Jane M. Carey, 207–221. Norwood, NJ: Ablex. 292–293. New York: ACM Press.

113
Judy Jeng

Lamb, R. 1995. Using online resources: Reaching for the Rubin, Jeffrey. 1994. Handbook of usability testing: How to
*.*s. In Digital Libraries’95, ed. F. M. Shipman, R. Fu- plan, design, and conduct effective tests. New York: John
ruta, and D. M. Levy, 137–46. Austin, TX: Depart- Wiley & Sons.
ment of Computer Science, Texas A&M University. Rushinek, Avi, and Sara F. Rushinek. 1986. What makes
Lan, Su-Hua. 2001. A study of usability evaluation of users happy? Communications of the ACM 29 (7):
information architecture of the University Library 594–98.
Web site: A case study of National Taiwan University Shackel, Brian. 1981. The concept of usability. Proc-
Library Web site. Bulletin of the Library Association of eedings of IBM Software and Information Usability Sym-
China 67: 139–54. posium, Poughkeepsie, NY, September 15–18, 1–30; and
Landauer, Thomas K. 1995. The trouble with computers: in J. L. Bennett, D. Case, J. Sandelin, and M. Smith
Usefulness, usability and productivity. Cambridge, eds. 1984. Visual Display Terminals: Usability Issues
MA: MIT Press. and Health Concerns, 45–88. Englewood Cliffs, NJ:
McGillis, Louise, and Elaine G. Toms. 2001. Usability of Prentice-Hall.
the academic library Web site: Implications for de- Shackel, Brian. 1986. Ergonomics in design for usability.
sign. College & Research Libraries 62 (4): 355–67. In People & computers: Designing for usability. Pro-
Miller, R. B. (1971, April 12). Human ease of use cri- ceedings of the second conference of the BCS HCI
teria and their tradeoffs. IBM Report TR 00.2185. specialist group, ed. M. D. Harrison and A. F. Monk.
Poughkeepsie, NY: IBM Corporation. Cambridge: Cambridge University Press.
Neumann, Laura J., and Ann Peterson Bishop. 1998. Shackel, Brian. 1991. Usability – Context, framework,
From usability to use: Measuring success of testbeds definition, design and evaluation. In Human Factors
in the real world. http://forseti.grainger.uiuc.edu/ for Informatics Usability, ed. Brian Shackel and Si-
dlisoc/socsci_site/dpc-paper-98.html (accessed Au- mon J. Richardson, 21–37. New York: Cambridge
gust 24, 2002). University Press.
Nielsen, Jacob. 1993. Usability engineering. Cambridge, Snyder, Carolyn. 2003. Paper prototyping: The fast and
MA: Academic Press. easy way to design and refine user interfaces. Boston:
Nielsen, Jacob. 2000. Why you only need to test with 5 users. Morgan Kaufmann.
http://www.useit.com/alertbox/20000319.html Spool, Jared. M., T. Scanlon, C. Snyder, W. Schroeder,
(accessed May 1, 2003). and T. DeAngelo. 1999. Web site usability: A designer’s
Nielsen, Jakob, and Robert L. Mack, eds. 1994. Usability guide. San Francisco, CA: Morgan Kaufmann.
inspection methods. New York: Wiley. Sumner, Tamara, Michael Khoo, Mimi Recker, and Mary
Nielsen, Jacob, and Rolf Molich. 1990. Heuristic evalua- Marlino. 2003. Understanding educator perceptions
tion of user interfaces. Proceedings of the SIGCHI of ”quality” in digital libraries. Proceedings of the 3rd
Conference on Human Factors in Computing Systems: ACM/IEEE-CS Joint Conference on Digital Libraries,
Empowering People, 249–56. New York: ACM Press. 269–79. New York: ACM Press.
Oulanov, Alexai, and Edmund F. Y. Pajarillo. 2002. Theng, Yin Leng, Norliza Mohd-Nasir, and Harold
CUNY+ Web: Usability study of the Web-based GUI Thimbleby. 2000a. Purpose and usability of digital
version of the bibliographic database of the City Uni- libraries. Proceedings of the Fifth ACM Conference on
versity of New York (CUNY). The Electronic Library Digital Libraries, 238–39. New York: ACM Press.
20 (6): 481–87. Theng, Yin Leng, Norliza Mohd-Nasir, and Harold
Paepcke, Andreas, S. B. Cousins, Héctor García-Molina, Thimbleby. 2000b. A usability tool for web evaluation
S. W. Hassan, S. P. Ketchpel, M. Roscheisen, and Terry applied to digital library design. Poster presented
Winograd. 1996. Using distributed objects for digital at the WWW9 Conference, Amsterdam. http://
library interoperability. Computer 29 (5): 61–68. citeseer.nj.nec.com/theng00usability.html (accessed
Pearrow, Mark. 2000. Web site usability handbook. February 24, 2003).
Rockland, MA: Charles River Media. Thomas, Rita Leigh. 1998. Elements of performance
Pearson, Sammy Wray. 1977. Measurement of computer and satisfaction as indicators of the usability of
user satisfaction. PhD diss., Arizona State University. digital spatial interfaces for information-seeking:
Popp, Mary Pagliero. 2001. Testing library Web sites: Implications for ISLA. PhD diss., University of
ARL libraries weigh in. Paper presented at the As- Southern California.
sociation of College and Research Libraries, 10th Na- Tractinsky, Noam. 1997. Aesthetics and apparent us-
tional Conference, Denver, Colorado, March 15–18, ability: Empirically assessing cultural and methodo-
2001. http://www.ala.org/ala/acrl/acrlevents/ logical issues. Proceedings of the SIGCHI Conference on
popp.pdf (accessed January 11, 2005). Human Factors in Computing Systems, 115–122. New
Rosson, M. B., and J. M. Carroll. 2002. Usability engineer- York: ACM Press.
ing: Scenariio-based development of human-computer Turner, Steven. 2002. The HEP test for grading Web site
interaction. San Francisco: Morgan Kaufmann. usability. Computers in Libraries 22 (10): 37–39.

114
Usability Assessment of Academic Digital Libraries

Virzi, R. 1992. Refining the test phase of usability Walker, Marilyn A., Jeanne Fromer, Giuseppe Di Fab-
evaluation: How many subjects is enough? Human briziio, Craig Mestel, and Don Hindle. 1998. What
Factors 34, 457–468. can I say? Evaluating a spoken language interface
Walbridge, Sharon. 2000. Usability testing and libraries: to email. Proceedings of CHI ’98, 582–589. New York:
The WSU experience. Alki: The Washington Library ACM Press.
Association Journal 16 (3): 23–24.

Editorial history:
paper received 30 May 2005;
accepted 4 August 2005.
Appendix A. Usability Testing Questions

The goal of this test is to evaluate the usability of the library's Web site. I will ask you a series of questions and would like you to think out loud while you look for the answer. Some questions are easy and some are more difficult. Do not worry if you can't find the answer every time. Please remember that we are testing the effectiveness of the site design; this is not a test of you. The whole test should take less than an hour. Thank you.