Multivariate Workload Evaluation Combining Physiological and Subjective Measures

International Journal of Psychophysiology 40 (2001) 233–238



Shinji Miyake*
Department of Environmental Management II, School of Health Sciences, University of Occupational and Environmental
Health, Japan 1-1 Iseigaoka, Yahatanishiku, Kitakyushu 807-8555, Japan

Received 6 January 2000; received in revised form 6 June 2000; accepted 15 June 2000


This paper proposes a way to integrate different parameters into one index and reports results obtained with the newly
developed index. The multivariate workload evaluation index, which integrates physiological parameters and one
subjective parameter through Principal Components Analysis, was proposed to characterize task-specific responses
and individual differences in response patterns to mental tasks. Three different types of mental tasks were performed
by 12 male participants. Heart rate variability, finger plethysmogram amplitude, and perspiration were used as
physiological parameters. Three of the six subscales of the NASA-Task Load Index (mental demand, temporal demand
and effort) were used as subjective scores. These parameters were standardized within each participant
and then combined. With this method, workload could be assessed from two different aspects, physiological
and subjective, simultaneously. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Principal components analysis; HRV; Plethysmogram; Perspiration; NASA-TLX; Mental workload; Subjective

* Tel.: +81-93-691-7151; fax: +81-93-691-2694.
E-mail address: (S. Miyake).

1. Introduction

It is an important matter in ergonomics to develop an assessment technique for mental workload. The International Organization for Standardization (ISO) is attempting to standardize a workload measurement method in which several physiological indices are assigned to several effects (fatigue, monotony, satiation and vigilance) that are induced by mental workload (ISO, 1998). However, if these physiological parameters can be integrated into one synthesized index with variably weighted coefficients, it may not be necessary to change measures for different mental workload effects. On the other hand, the response sensitivity to a mental task is different in each person. The physiological responses induced by the same task may also differ from person to person. This is the individual difference problem (Turner, 1994). Furthermore, the physiological response pattern is different from task to task. For example, the response induced by the mental arithmetic task is different from the response induced by the mirror drawing task (Saab and Schneiderman, 1993; Miyake, 1997). Thus, we must consider such individual differences and task-specific response patterns when workload is investigated. One approach to solving these problems is to record and analyze several physiological (and subjective) responses with different attributes and integrate them in a way that can reflect individual differences in physiological sensitivity and task-specific responses. The purpose of this study was to investigate a new method of mental workload assessment in which multiple physiological parameters and subjective indexes are integrated into one index through multivariate analysis.

2. Method

Three different kinds of mental tasks were used: a six-piece wooden puzzle (P), a two-dimensional compensatory tracking task with a first-order control (T), and a numerical logical task (L). The T task and the L task had three levels of difficulty, high (H), medium (M) and low (L). Therefore, they were abbreviated as TH, TM and TL for the tracking task and LH, LM and LL for the logical task. In the P task, participants were instructed to make a simple silhouette pattern (cross) using all six wooden pieces (Cross Puzzle II, D.1 Products) in 8 min. This task requires pattern recognition ability. The T task required participants to keep an airplane icon target inside a central circular gunsight area using a joystick controller (Flight Stick Pro, CH Products). The task difficulty level was controlled by the target speed (average speeds for H, M and L were 8.0, 3.4 and 1.4 pixel/unit time, respectively) and the width of the target moving area (350 × 330, 167 × 166 and 86 × 83 pixels for H, M and L, respectively). The tracking task was similar to a simple reaction time task; however, some precognition (prediction) may have been necessary. The logical task was nearly identical to the Mine Sweeper game in Microsoft Windows(R). In this task, the participants were told to guess whether there was a mine in a grid cell and to click the ‘safe’ cells. Unlike in the Windows game, the task did not end even if the participants hit a mine. The more difficult the level, the more mines there were. This logical task may have required short-term memory and logical inference. The task duration was 4 min for each difficulty level in the T and L tasks. The tasks were ordered from more difficult to easier and the same fixed order was applied for all participants, i.e. P, TH, LH, TM, LM, TL and LL (Miyake, 1996). Using this approach, the task difficulty factor nullified the training effect and emphasized the differences in difficulty levels among tasks. The computer tasks (T and L) were programmed in QuickBASIC 4.5 (Microsoft) and run on an MS-DOS operated 486 computer with a 15-inch (640 × 400 pixels) CRT display.

Participants were 12 male university students ranging in age from 18.9 to 25.9 years with an average age of 22.7 years. All participants gave their informed consent before the experiments and were given the same amount of payment regardless of their task performances.

Three different physiological measures were acquired during all experimental blocks, including before (PRE) and after (POST) task rest periods (5 min): (1) the ln LF/HF (natural logarithm of the LF to HF ratio¹) as a cardiovascular parameter; (2) the photoelectric plethysmogram amplitude measured on the left index finger as a peripheral blood vessel activity parameter; and (3) the amount of perspiration from a sweat rate meter attached to the left thumb as a non-vascular autonomic nervous system parameter. The participants were instructed to synchronize their respiration pace with a computer-generated tone during the rest and task periods to reduce the effect of irregular respiration on the HRV power spectral components (Grossman et al., 1991; Hayano et al., 1994). The respiration signal was recorded by a strain gauge around the chest to monitor the respiration regularity and to identify the respiratory sinus arrhythmia component in the HRV power spectrum.

¹ LF is the low-frequency component of HRV of approximately 0.10 Hz that primarily reflects baroreceptor-mediated regulation of blood pressure. HF is the relatively high-frequency component of approximately 0.25 Hz that corresponds to the frequency of respiration in HRV spectral components.

The HRV spectral analysis and other physiological data analyses were done as follows. The first 30 s of each block were not included in these analyses, to reject the transient, relatively large drift in signals that is frequently observed just after the task starts. A 200-s ECG recorded from the CM5 lead (Ellestad, 1986) was sampled at 1 kHz in each block. The near-DC components (< 0.05 Hz) in the R-R interval were removed by a digital filter, and the equidistant R-R interval data were obtained by resampling (2 Hz) a spline-interpolated trendgram. The 10th-order AR spectral analysis was applied (Miyake et al., 1994). The LF and HF components were extracted by the spectral decomposition method, and the ln LF/HF was selected for the Principal Components Analysis (PCA) procedure. A 200-s plethysmogram recorded by a photoelectric plethysmograph (Nihon-Koden MLV-2301) was sampled at 1 kHz, and 10-ms interval data were obtained by resampling every tenth point (coarse graining). Baseline fluctuation was removed by a 100-point moving average (equivalent to a 0.443-Hz high-pass filter), and the root mean squared (RMS) value was calculated as an average beat component amplitude. A small capsule with a highly accurate static capacity moisture sensor and a temperature sensor was attached to the skin surface of the left thumb, and the perspiration signal was measured by the direct capsule method (Suzuken Kenz-Perspiro OSS-100). The perspiration signal was sampled at 100 Hz, and the average amplitude of this signal was obtained as an index for the amount of perspiration.

The subjective workload score was obtained by using the NASA Task Load Index (TLX) (Hart and Staveland, 1988), which contains six subscales: Mental Demand (MD), Physical Demand (PD), Temporal Demand (TD), Own Performance (OP), Effort (EF) and Frustration level (FR). The NASA-TLX rating window automatically appeared on the computer screen when each task duration had expired. Then the participants used the mouse cursor to rate their subjective workload.

Two participants were rejected from subsequent analyses. A clear HF component was not detected in one participant because his respiration rate was so slow in one block that the HF component was contaminated by the LF component. The other participant showed highly irregular respiration in one block even though he was instructed to control his breath as described above. Therefore, no clear HF component was found in his HRV power spectrum.

All parameters were standardized within participants. Physiological workload evaluation (PWE) scores were calculated by means of the PCA of the three physiological parameters mentioned above. The Weighted Workload (WWL) score and the average score of the MD, TD and EF subscales of NASA-TLX, which was labeled as the TLX-MTE (Mental, Temporal and Effort) score, were calculated. This TLX-MTE score was designed according to the Subjective Workload Assessment Technique (Reid and Nygren, 1988), which contains only three dimensions, i.e. Time Load, Mental Effort Load and Psychological Stress Load. The Time Load seems to be identical to TD, and Mental Effort Load is similar to MD and EF. The Psychological Stress Load may be equivalent to the FR in NASA-TLX. However, FR was not included in the TLX-MTE score. FR was excluded because the purpose of this new subjective scale was to reflect subjective feelings during the task, and it was assumed that the frustration level (FR) rated after the task might be affected greatly by the task result, i.e. success or failure.
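The within-participant standardization and PCA step described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the software used in the study: the function name `first_pc_scores`, the random demonstration data, the use of the sample standard deviation (`ddof=1`) and the sign convention for the component are all hypothetical choices made here.

```python
import numpy as np


def first_pc_scores(data):
    """Return (scores, weights) of the first principal component.

    data: 2-D array for ONE participant, rows = experimental blocks,
          columns = parameters (e.g. ln LF/HF, plethysmogram amplitude,
          perspiration).
    """
    x = np.asarray(data, dtype=float)
    # Standardize each parameter within the participant (mean 0, SD 1).
    # Assumption: sample SD (ddof=1); the paper does not state which SD was used.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # PCA of standardized data = eigen-decomposition of the correlation matrix.
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    weights = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
    # The sign of an eigenvector is arbitrary. Assumption: flip it so that
    # the weights sum to a non-negative value.
    if weights.sum() < 0:
        weights = -weights
    raw = z @ weights
    # Rescale so the scores have mean 0 and SD 1 across blocks.
    scores = (raw - raw.mean()) / raw.std(ddof=1)
    return scores, weights


# Hypothetical data: 9 blocks (PRE, P, TH, LH, TM, LM, TL, LL, POST) x 3 parameters.
rng = np.random.default_rng(0)
demo_blocks = rng.normal(size=(9, 3))
pwe_scores, pwe_weights = first_pc_scores(demo_blocks)
```

Because the function is run separately on each participant's data, the weights differ from person to person. Passing three physiological columns gives a PWE-type score; adding a fourth column holding the TLX-MTE score would give a score of the form of Eq. (1).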

The one subjective score (TLX-MTE) and the three physiological parameters were analyzed by PCA, and the first principal component scores were obtained as multivariate workload evaluation (MWE) scores for each participant. The MWE score for the j-th task for the k-th participant was:

MWE_{jk} = W_{1k} P_{1jk} + W_{2k} P_{2jk} + W_{3k} P_{3jk} + W_{4k} S_{1jk}    (1)

where W_{ik} was the principal component coefficient, P_{1jk} was ln LF/HF, P_{2jk} was plethysmogram amplitude, P_{3jk} was perspiration and S_{1jk} was the TLX-MTE score. The mean and the standard deviation of the MWE score for all tasks for each participant were 0 and 1, respectively. Repeated measures ANOVAs (GLM procedure, SPSS) were carried out, and the Greenhouse–Geisser correction for inhomogeneity of variance was applied where appropriate. Significant main effects were followed up with Student–Newman–Keuls tests and the significance level was set to P < 0.05.

3. Results

There was a significant effect of ‘task’ on NASA-TLX WWL scores calculated from all six subscales (F6,54 = 8.44, ε = 0.582, P < 0.001, Fig. 1), on PWE scores (F8,72 = 3.32, ε = 0.358, P < 0.05, Fig. 2) and on MWE scores (F6,54 = 8.184, ε = 0.537, P < 0.001, Fig. 3). The task effect on TLX-MTE scores failed to reach significance (F6,54, ε = 0.629, P = 0.052). Fig. 1 indicates that the WWL scores were significantly lower in TL than in the other tasks and significantly higher in P than in LL.

Fig. 1. Subjective workload by means of the NASA-TLX WWL and TLX-MTE scores. The scores were standardized in each participant and averaged among participants (n = 10).

Fig. 2. Physiological workload evaluation (PWE) scores composed of ln LF/HF, finger plethysmogram amplitude and perspiration amount.

A significant positive correlation between PWE scores and TLX-MTE scores was found in three participants (P < 0.05 in two participants and P < 0.01 in one participant). The average Zr (Fisher’s z-transformation) in the whole sample of participants was 0.4096 and not significantly different from zero. Two participants showed a significant correlation (P < 0.05) between PWE scores and WWL scores. The average Zr was 0.2182 and not significantly different from zero. Thus, a slightly higher correlation was found between PWE scores and TLX-MTE scores than between PWE scores and WWL scores. Furthermore, the PWE and TLX-MTE scores showed significant differences between P and TM, although there was no significant difference between them in regard to the WWL scores. The MWE score in the P task was significantly higher than those in the other tasks, as shown in Fig. 3. There was also a significant difference in this score between TM and LH.
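The average Zr values reported above are based on Fisher's z-transformation of the per-participant correlation coefficients. A minimal sketch of that averaging step is given below. The function name `mean_fisher_z` and the example correlations are hypothetical, and the paper does not state which test was used to compare the average with zero, so no test is shown.

```python
import math


def mean_fisher_z(correlations):
    """Average Pearson correlations on Fisher's z scale.

    Returns (mean_z, mean_r): the mean of z = atanh(r) over participants
    and its back-transformed value tanh(mean_z).
    """
    z_values = [math.atanh(r) for r in correlations]
    mean_z = sum(z_values) / len(z_values)
    return mean_z, math.tanh(mean_z)


# Hypothetical per-participant correlations (NOT data from the study).
example_r = [0.5, 0.0, -0.5]
example_mean_z, example_mean_r = mean_fisher_z(example_r)
```

Averaging on the z scale rather than averaging the raw r values is the usual reason for the transformation, since the sampling distribution of z is closer to normal than that of r.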

Fig. 3. Multivariate workload evaluation (MWE) scores composed of ln LF/HF, finger tip plethysmogram amplitude, perspiration amount and the NASA-TLX MTE scores by means of principal components analysis (PCA).

4. Conclusions

The original NASA-TLX WWL scores showed good correlation with the difficulty level in the tracking task. The WWL scores in P, TH and LH were almost the same. Thus, the WWL score did not differentiate between these three tasks. On the contrary, the TLX-MTE scores, which were the average of the MD, TD and EF subscales, were relatively low for the tracking tasks and showed no correlation with task difficulty level. However, the pattern of the TLX-MTE scores across all tasks was similar to the pattern of the PWE scores, which were composed of three physiological parameters: HRV, plethysmogram and perspiration. Therefore, the TLX-MTE scores, but not the WWL scores, were integrated together with the three physiological parameters in the multivariate evaluation [cf. Eq. (1)]. These results suggested that the PD, OP and FR subscales in the NASA-TLX did not covary much with the physiological responses recorded during task performance.

Thus, the results of this experiment indicate one of the reasons for the discrepancy between physiological parameters and the subjective workload evaluations by WWL. Two sample cases are discussed here. If a participant were to complete a very complex and delicate task such as a scale model ship assembly at the end of the task period, perhaps he or she would have a feeling of accomplishment. So, his/her subjective workload score concerning the own performance (OP) scale in the NASA-TLX might be low. On the contrary, if at the very end of the task the participant dropped the model ship and it broke into pieces, he/she may feel very depressed and frustrated. So, his/her workload may be very high. However, the physiological responses recorded during the task period would have been identical because, in both cases, the participants performed their task in quite the same manner except for the accident. The accident that occurred at the end of the task could not affect the responses during the task. Thus, even if the task is the same, the subjective workload scores rated after the task may be affected greatly by the task results, while the physiological responses recorded during the task are not. Feelings of achievement or one’s performance are important in evaluating workload. However, the correlation between such feelings and the physiological responses during the task may be low, as described above.

The MWE scores were relative parameters within the participants because they were calculated from standardized scores in each participant. This means that the weight coefficient, W_{ik} of Eq. (1), which was used to calculate the MWE scores, was different from participant to participant. This individually based multivariate workload evaluation method seems to be useful for workload research (Furuta et al., 1997) because the sensitivities of physiological parameters to a given workload are different in each participant. This procedure, in which the weights are decided in each individual according to his responses, is very similar to the calculation procedure for the WWL score with the NASA-TLX. Of course, the NASA-TLX does not use the PCA procedure. However, before calculating the weighted average (WWL) of the six subscale scores, the weight values for those subscales were obtained by means of the paired comparisons of the subscales for each subject. Thus, the weight values for the WWL score reflected individual differences in workload evaluation. When we obtain several different kinds of physiological parameters and try to combine them into one single value using different weight coefficients among individuals, the PCA method is a useful approach. Furthermore, this method can also integrate subjective measures. The MWE score calculated in this study was composed of three physiological parameters and one subjective parameter. However, it should be noted that the important point here is the method of calculation (PCA on standardized parameters) and not the score itself. That is, we can integrate any physiological and subjective parameters by this method and, of course, the MWE method can assess workload objectively (physiologically) and subjectively at the same time. The MWE method proposed in this study seems to be useful for the evaluation of work stress (ISO, 1991) during performance tasks.

The tasks employed in this study were a puzzle and PC-like games and may be far from work in the field. However, highly controlled laboratory studies have to be performed first before one may go into the field with a new method. Furthermore, it is necessary to employ simple ‘laboratory’ tasks in which the task attributes or the resource demands of the tasks are apparently different from each other. In either case, further experimental investigation may be necessary to examine the validity and the reliability of this MWE method.

Acknowledgements

This work was supported by the Ministry of Education, Sciences, Sports and Culture under the Grant-in-Aid for Scientific Research (C) No. 06670393. The author would like to thank Wolfram Boucsein, Jay Miller and the anonymous reviewers for their very helpful comments and suggestions on earlier versions of this paper.

References

Ellestad, M.H., 1986. Stress Testing: Principles and Practice, 3rd edn. FA Davis Co., Philadelphia, pp. 129–135.
Furuta, T., Miyakawa, T., Kubota, R., Ikeda, K., Miyake, S., Osaki, H., 1997. Experiment on human factors-development of method for workload evaluation. Proc. 1997 Fall Meeting At. Energy Soc. Japan, 320 (in Japanese).
Grossman, P., Karemaker, J., Wieling, W., 1991. Prediction of tonic parasympathetic cardiac control using respiratory sinus arrhythmia: the need for respiratory control. Psychophysiology 28, 201–216.
Hart, S.G., Staveland, L., 1988. Development of NASA task load index (TLX): results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (Eds.), Human Mental Workload, Elsevier Science Publishers B.V., Amsterdam, pp. 139–183.
Hayano, J., Mukai, S., Sakakibara, M., Okada, A., Takata, K., Fujinami, T., 1994. Effects of respiratory interval on vagal modulation of heart rate. Am. J. Physiol. Heart Circ. Physiol. 267, H33–H40.
ISO 10075, 1991. Ergonomics principles related to mental work-load: general terms and definitions. ISO, Geneva.
ISO 10075, 1998. Ergonomics principles related to mental work-load, Part 3: measurement and assessment of mental work-load [unpublished internal Working Draft (WD)].
Miyake, S., Akatsu, J., Sato, N., Kumashiro, M., 1994. Heart rate variability as a mental workload index: a methodological proposal for autoregressive power spectral analysis. Proc. 12th Triennial Congr IEA 6, 417.
Miyake, S., 1996. Psychophysiological responses induced by different mental tasks: comparison between perspiration, heart rate variability and T-wave amplitude. Psychophysiol. Ergonomics 1, 45–47.
Miyake, S., 1997. Factors influencing mental workload indexes. J. UOEH 19, 313–325.
Reid, G.B., Nygren, T.E., 1988. The subjective workload assessment technique: a scaling procedure for measuring mental workload. In: Hancock, P.A., Meshkati, N. (Eds.), Human Mental Workload, Elsevier Science Publisher B.V., Amsterdam, pp. 185–218.
Saab, P.G., Schneiderman, N., 1993. Biobehavioral stressors, laboratory investigations, and the risk of hypertension. In: Blascovich, J., Katkin, E.S. (Eds.), Cardiovascular Reactivity to Psychological Stress and Disease. American Psychological Association, Washington DC, pp. 49–82.
Turner, J.R., 1994. Cardiovascular Reactivity and Stress. Plenum, New York, pp. 51–53, pp. 71–89.

