
SUMI: the Software Usability Measurement Inventory


Jurek Kirakowski and Mary Corbett, Human Factors Research Group, University College, Cork, Ireland

SUMI is a solution to the recurring problem of measuring users' perception of the
usability of software. It provides a valid and reliable method for the comparison of
competing products and of differing versions of the same product, as well as providing
diagnostic information for future developments. It consists of a 50-item questionnaire
devised in accordance with psychometric practice. It is intended to be administered to a
sample of users who have had some experience of using the software to be evaluated.
The concept of usability as assessed by SUMI draws on the definition in ISO 9241, and
relates to the European Directive on Health and Safety Standards for Workers with VDU
Equipment.

In order to use SUMI effectively we recommend a minimum of ten users. In special
circumstances where SUMI is being used for diagnostic purposes, a smaller sample
size may be adequate. SUMI needs a working version of the software before usability can
be measured: software is rarely created in a vacuum, and the real (or total) software
life-cycle extends long before and after the sometimes rather frenetic months when code
is being designed, assembled, and tested. It is currently administered in paper-and-
pencil format, and comes supplied with either a manual scoring method (the Basic
Educational SUMI) or a diskette for computer scoring and report generation (the Full
Professional version).

One of the most important aspects of SUMI has been the development of the
standardisation database, which now consists of usability profiles of over 200 different
kinds of applications, such as word processors, spreadsheets, CAD and graphics
packages, travel reservation systems, and on-screen help systems. Basically, any kind of
application can be evaluated using SUMI so long as it has user input through a
keyboard or pointing device, display on a screen, and some input and output between
secondary memory and peripheral devices. The standardisation of any questionnaire
requires extensive data collection activities. The development of SUMI was assisted by
the CEC ESPRIT programme through the project Metrics for Usability Standards in
Computing (MUSiC, project number 5429). Since its creation, the questionnaire has been
extensively tested at over 30 industrial partner sites in various parts of Europe.

If you are evaluating a product or series of products using SUMI, you may either do a
product-against-product comparison, or you may compare each product against the
standardisation database, to see how the product being rated compares against an
average state-of-the-art market profile. The Full Professional version gives you access
to this database, which is being continually updated.
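
As a rough sketch of the product-against-database style of comparison, the short Python
fragment below tests whether a sample of users' Global scores differs from the
market-average mean of 50. The scores and the use of a one-sample t test are
illustrative assumptions; in a real evaluation such figures would come from the scoring
program supplied with the Full Professional version.

    # Illustrative only: invented Global scores compared with the scale mean of 50.
    import math
    import statistics

    NORM_MEAN = 50.0  # standardised mean of the market-average profile

    def t_against_norm(scores):
        """One-sample t statistic for a product's Global scores against the norm."""
        n = len(scores)
        mean = statistics.mean(scores)
        sd = statistics.stdev(scores)          # sample standard deviation
        return mean, (mean - NORM_MEAN) / (sd / math.sqrt(n))

    # Twelve hypothetical users (the recommended minimum sample is ten).
    scores = [44, 52, 39, 47, 55, 41, 48, 50, 43, 46, 51, 40]
    mean, t = t_against_norm(scores)
    print(f"sample mean = {mean:.1f}, t = {t:.2f}, df = {len(scores) - 1}")

A strongly negative value would suggest that the product rates below the
state-of-the-art average on the Global scale.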

SUMI gives three hierarchical layers of output. Outputs from the first two layers are
standardised on a scale whose mean is 50 and standard deviation 10. First there is a
global usability reading, which gives a single figure-of-merit. While this is useful as a
summary, it is not in itself terribly informative. The second layer has five sub-scales:
affect, efficiency, learnability, helpfulness, and control. The subscales relate to the
users' perceptions of the qualities of the software they are interacting with, and each
of the subscales has a specific meaning, given in the SUMI manual. The third layer is
what is known as Item Consensual Analysis, available only in the Full Professional
version. This gives a comparison of the response pattern of each questionnaire item
between your obtained sample and what is predicted from the general standardisation
database. It therefore highlights very quickly which aspects of your software stand out
as in need of special attention and which are strong features.
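
To make the 50/10 scale concrete, the fragment below shows only the standardisation
step: a raw scale total is re-expressed so that the standardisation sample has a mean of
50 and a standard deviation of 10. The reference mean and standard deviation used here
are invented; SUMI's item keys and norm values belong to the commercial scoring
materials.

    # Standardisation step only: raw scale total -> score with mean 50, SD 10.
    REF_MEAN = 30.0   # invented mean raw Learnability total in the norm sample
    REF_SD = 6.0      # invented standard deviation in the norm sample

    def standardise(raw_total, ref_mean=REF_MEAN, ref_sd=REF_SD):
        """Convert a raw scale total to the 50/10 standardised metric."""
        z = (raw_total - ref_mean) / ref_sd    # position relative to the norm sample
        return 50.0 + 10.0 * z

    print(standardise(24.0))   # one SD below the norm sample -> 40.0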

Organisations which purchase the Basic Educational SUMI receive a pack of 50
questionnaires in their chosen language, a handbook, and scoring stencils. The Full
Professional version contains in addition a computerised scoring program which
contains the standardisation database. The SUMI questionnaire has been translated
and standardised in the following languages: French, German, Dutch, Spanish and
Italian. In the ESPRIT MUSiC project, a number of large evaluations in which SUMI
played a role were carried out at partner sites in the Netherlands, Italy, and Spain.
These have been reported in technical working papers within the MUSiC consortium,
and will become available as part of the overall MUSiC starter toolkit around November
1993.

Although SUMI may be used as a stand-alone tool for usability assessment, it is most
effective when used as part of the MUSiC toolset. This toolset offers such valuable
functionalities as a context of use checklist, an evaluation design manager (as an
interactive software product), and tools for the evaluation of the performance and
subjective effort aspects of usability. To date, the feedback received from organisations
using the MUSiC toolset confirms that informative usability evaluation combines
measurements from at least two of the MUSiC tools, and that attention paid to the
context of use is vital when attempting to make wider claims about the usability of a
product on the basis of a laboratory-based or quasi-naturalistic investigation.

Case study - a SUMI evaluation


The description presented below is incomplete in order to protect the identity of our
client. It is presented here to give an indication of the type of information which a
SUMI evaluation may provide.

Company ABC required an evaluation of an in-house development. The primary
objective of the study was to determine user reaction prior to public exploitation. The
product was an integrated office systems development and had been in use internally
for one month by the staff. Staff members who had between 120 and 140 hours of
experience were selected for the study. A summary of the results is presented in
Figure 1.

[Figure 1. SUMI scale scores for the evaluated product (vertical axis roughly 20 to 70)]

The software received an average rating for overall usability (Global) from the users.
However, the Learnability and Helpfulness scales were poorly rated. This would indicate
that these two aspects need to be considered very carefully if the product is to be
commercialised.

The Usability Profile presents the results for each individual for each sub-scale
separately. From this output it became clear that the results were strongly influenced by
the extreme views of 3 of the users who participated in the evaluation.
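
That kind of inspection can be mimicked with a simple screen of each respondent's
profile. The fragment below uses invented scores and an arbitrary ten-point rule of
thumb, not the SUMI scoring program's own output.

    # Flag respondents whose mean sub-scale score lies far from the rest of the sample.
    import statistics

    # Each row: one user's five sub-scale scores (affect, efficiency, learnability,
    # helpfulness, control) on the 50/10 standardised metric. All values are invented.
    profiles = [
        [52, 49, 41, 44, 50],
        [55, 51, 43, 46, 53],
        [23, 25, 20, 22, 24],   # an extreme respondent
        [50, 48, 45, 43, 49],
        [48, 47, 40, 41, 46],
    ]

    user_means = [statistics.mean(p) for p in profiles]
    grand_mean = statistics.mean(user_means)

    # Treat a gap of more than 10 points (one nominal standard deviation of the
    # standardised metric) as worth inspecting before interpreting the averages.
    for i, m in enumerate(user_means, start=1):
        if abs(m - grand_mean) > 10:
            print(f"user {i}: mean {m:.1f} vs sample mean {grand_mean:.1f}")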

The Item Consensual Analysis indicated that there were a large number of statements
on the inventory which received extremely low support. These items confirmed the
Usability Profile, and specific areas of difficulty were identified.
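
Item Consensual Analysis compares the observed response pattern for each item with the
pattern expected from the standardisation database. One way to picture that comparison
is a chi-square goodness-of-fit calculation, as in the invented example below; the
statistic and the expected proportions are illustrative assumptions, not the published
SUMI procedure.

    # Compare observed answer counts for one item with a hypothetical norm pattern.
    def chi_square(observed, expected_proportions):
        """Chi-square goodness-of-fit of observed counts against expected proportions."""
        total = sum(observed)
        expected = [p * total for p in expected_proportions]
        return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Agree / Undecided / Disagree counts for one item from twelve users (invented).
    observed = [2, 3, 7]
    expected_proportions = [0.55, 0.25, 0.20]   # hypothetical norm pattern

    stat = chi_square(observed, expected_proportions)
    print(f"chi-square = {stat:.2f} on {len(observed) - 1} degrees of freedom")

A large value for an item marks it out for closer inspection, much as the Item
Consensual Analysis report highlights items whose response pattern departs from the
database prediction.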

Specific recommendations made to the organisation were to re-examine the logic of task
structuring, to examine mechanisms to reduce the complexity of the interface, and to
improve the quality and level of on-line help offered.

Note
For more information on SUMI and the MUSiC project contact Mary Corbett, telephone
+353 21 276871 ext. 2412, fax +353 21 270439.
