Inter-Rater Reliability of The Original and Modified Barthel Index, and A Comparison With The Functional Independence Measure


Australian Occupational Therapy Journal (1996) 43, 22-29

Inter-rater reliability of the original and modified Barthel Index, and a comparison with the Functional Independence Measure

Janet Fricke and Carolyn A. Unsworth


School of Occupational Therapy, Faculty of Health Sciences, La Trobe University, Victoria, Australia

The Barthel Index (BI), the Modified Barthel Index (MBI) and the Functional Independence Measure (FIM) are
all widely used by occupational therapists as assessment tools for clinical decision-making and outcome
measurement. All of these tools have demonstrated validity and the BI and the FIM have demonstrated inter-rater
reliability. The MBI has been modified to increase sensitivity; however, there have been no publications on the
inter-rater reliability of this tool following the changes. The purpose of this research was to examine the inter-rater
reliability of two versions of the Barthel Index, and draw some comparisons between this assessment tool and the
FIM. Twenty-five patients with neurological and orthopaedic conditions were assessed by three occupational
therapists using the three tools. The methods of analysis selected were percentage agreement and the intraclass
correlation coefficient. The results indicated that both the original and modified versions of the Barthel Index
possess good inter-rater reliability. As all three tools have demonstrated adequate reliability and validity, it is
suggested that clinicians select the most sensitive tool that best meets their clinical needs, and use this assessment
tool in its standardized format.

KEY WORDS activities of daily living, functional assessment, Functional Independence Measure, modified Barthel Index, original Barthel Index.

Janet Fricke BAppSc(OccTher); Lecturer. Carolyn A. Unsworth PhD, OTR; Senior Lecturer.
Correspondence: Janet Fricke, School of Occupational Therapy, Faculty of Health Sciences, La Trobe University, Locked Bag 12, Carlton South, Vic. 3053, Australia. Email: J.Fricke@latrobe.edu.au
Accepted for publication January 1997.

INTRODUCTION

The selection of functional assessment tools for use in rehabilitation and geriatric care facilities is a widely debated issue. Such tools are being used not only to inform clinical discussions such as patient treatment and discharge plans but to monitor programme outcomes. Choosing a tool is a time-consuming process because no functional assessment measure will perfectly meet the circumstances of every client or hospital, and the advantages and limitations of each tool must be carefully considered. It is essential that the selected tool has demonstrated validity, reliability and sensitivity (Eakin, 1989; Fricke, 1993; Law, 1987).

In 1995 the Department of Health and Community Services, Australia, recommended that a non-standardized version of the Barthel Index (BI) should be adopted for rehabilitation programmes. This version of the Barthel Index appeared to be ‘cobbled together’ from the original Barthel Index (OBI; Mahoney & Barthel, 1965) and versions of the BI by Granger, Albrecht and Hamilton (1979), and Fortinsky, Granger and Seltzer (1981). Thus previously established reliability was no longer valid.

The occupational therapists at North-west Hospital’s Greenvale campus (a 92-bed aged care and rehabilitation hospital) were concerned that the proposed tool was not a standardized version of the BI (Mahoney & Barthel, 1965), and that the three-point scale it used lacked sensitivity to change in patient status. North-west Hospital is a multi-campus hospital specializing in aged care. It was proposed that the more sensitive Modified Barthel Index (MBI; Shah, Vanclay & Cooper, 1989) should be used in preference to the recommended tool. However, no literature could be found on the inter-rater reliability of the MBI. In addition, the North-west Hospital group currently uses (and will continue to use) the Functional Independence Measure (FIM℠; Granger, Cotter, Hamilton, Fiedler & Hens, 1990).

Review of Barthel Index literature

The BI, in its various versions, is one of the most frequently cited activities of daily living (ADL) assessments. In recent occupational therapy literature the strengths and weaknesses of the BI have become topical issues (Eakin, 1993; Murdock, 1992a, 1992b; Shah & Cooper, 1993a, 1993b). Despite concerns over possible limitations associated with some versions, the BI is frequently used in Australia and has been widely adopted for clinical decision-making and as a measure of outcome for general use with elderly patients. This tool is recommended for use by the Aged Care Assessment Teams in Victoria (Butler, Fricke, & Humphries, 1993).

The OBI was developed by Dorothea Barthel to monitor orthopaedic patients’ progress in selfcare and mobility skill during inpatient rehabilitation (Mahoney & Barthel, 1965). The activities that were evaluated included feeding, wheelchair/bed transfer, personal toilet (personal care), toilet transfer, bathing, walking, stair climbing, dressing, and bladder and bowel control. The OBI uses a 3-point scale where raters assign scores of 0, 5, 10 or 15 to items on the assessment (Appendix 1). The scores are often summed to give a maximum score of 100. The assessment incorporates a weighting system giving extra emphasis to mobility and continence: the two areas that appear to be most important for accommodation placements. Gresham, Phillips and Labi (1980) suggest that the OBI ‘reflects the pragmatism and long clinical experience of its designers in its frankly preferential weighting of mobility and continence above other variables’ (p. 358).

Studies of various versions of the BI have stated that it is a valid and relatively reliable tool but not very sensitive to change (Collin, Wade, Davies & Horne, 1988; Granger, Dewis, Peters, Sherwood & Barrett, 1979; Gresham, Phillips & Labi, 1980; Kidd et al., 1995; Wade & Collin, 1988). The measurement of clinical change is crucial to the assessment of outcome, benefit and cost effectiveness, thus requiring a sensitive tool (Kidd et al., 1995; Wellwood, Dennis & Warlow, 1995). Shah et al. (1989) revised the OBI to increase its sensitivity to change or small improvements, stating that ‘greater sensitivity of the BI is required in scoring those individuals who require assistance of some nature to perform the tasks’ (p. 704). The new tool, the MBI (Shah et al., 1989), was used in a research project on initial and discharge assessments of 258 patients who had experienced their first stroke. While the activities being assessed were not changed, the scoring system was modified to be more sensitive by including coding for help required: from 1 (unable) to 5 (independent) (Appendix 2). The score for each category reflected the original weighting. The MBI included detailed operational definitions, with research demonstrating it to have good internal consistency (Shah et al., 1989).

The lack of sensitivity of the original BI suggests that the changes to the MBI were worthwhile. However, these changes, while making the MBI more sensitive, may have altered the inter-rater reliability of the tool. A major problem, therefore, was the lack of research demonstrating the inter-rater reliability of the MBI (Di Fabio, 1990; Eakin, 1993). Kidd et al. (1995) asserted that for a measurement tool to be effective it requires ‘reliability: that the measurement is repeatable and reproducible when measured by single and different observers’ (p. 12). Inter-rater reliability is imperative in clinical fields where therapists make judgements on patients’ functional status. These judgements determine treatment protocols and accommodation placement, in addition to being used as a measure of therapy outcome. It is therefore essential that acceptable reliability of the MBI scoring is regularly attained.

The study reported in this article was based on methodologies from several other related studies about inter-rater reliability of functional assessment scales. Collin et al. (1988) undertook a reliability study of the original BI that investigated four different methods of

obtaining the score with 25 patients. In one section of the study, scores taken by a trained nurse and an occupational therapist, who tested the patient within 72 hours of admission, were compared. Kidd et al. (1995) also used 25 patients in their study on the reliability of the FIM and the BI. Cole (1989) studied the inter-rater reliability of the Crichton Geriatric Behavioural Rating Scale, by having two raters interview each of the carers of 47 subjects with dementia. The method and analyses (namely percentage agreement and intraclass correlation coefficient, ICC) selected for Cole’s study were also successfully adopted in research examining the inter-rater reliability of the personal care section of the FIM (Fricke, Unsworth & Worrell, 1993).

Study aims

The aims of the current study, and rationale to support these, were as follows.
(1) To examine the spread of scores for the OBI, MBI and brief FIM using frequency data. The purpose of this aim was to provide a description of scores that are generated when therapists rate the same patient using three different assessment tools.
(2) To determine the individual item reliability of the OBI and MBI. When examining the inter-rater reliability of a tool, it is also important to consider the reliability of individual scale items. This is particularly important when considering that more sensitive scales (with greater range of score options) often consequently have reduced reliability.
(3) To determine the inter-rater reliability of the OBI and MBI. No published studies reporting the inter-rater reliability of the MBI have been identified, hence research of this nature is required. In addition, a further inter-rater reliability study of the OBI served as a comparison point in the present study, and may offer further supportive information about the psychometric properties of this tool.
(4) To examine patterns of score shift between the OBI and MBI. The purpose of this aim was to examine if the three therapists were consistent in any scoring shift when scoring patients on both the OBI and the MBI.
(5) To correlate the MBI, OBI and brief FIM to look at the relationship between scores on these three assessments. Given that the OBI is considered a gold standard, it is important to determine if, and how well, the MBI and brief FIM correlate with this tool. The brief FIM was included in this analysis because of its current use and perceived value by the occupational therapists at North-west Hospital.

METHODS

Twenty-five consecutively admitted patients who were referred to the occupational therapy programme were included in this study. The age range of patients was 52-87 years, with 50% being over 77 years, and a mean age of 75.4 years. Seventeen of the patients were diagnosed with a lower limb orthopaedic condition and eight were neurological patients or patients who had had a stroke. The sex distribution was 12 males and 13 females. Informed consent was obtained from each of the 25 patients. Each of the patients was identified by number to maintain confidentiality. Three occupational therapists, who had more than 12 months clinical experience, conducted initial assessments of the patients and obtained MBI, OBI and FIM scores.

Instruments

The assessment tools used in this study were the OBI (Mahoney & Barthel, 1965), the MBI (Shah et al., 1989), and a brief FIM (Granger et al., 1990). These tools have all been recommended for multidiagnostic groups (Granger et al., 1990; Mahoney & Barthel, 1965; Shah et al., 1989). The brief FIM used in this study consisted of summing the totals from the FIM subsections of ‘selfcare’, ‘sphincter control’, ‘mobility’ and ‘locomotion’. The developers of FIM suggest that this is a legitimate practice, and the FIM score is rated out of 91. The use of the brief FIM suited the purposes of this study, given that the OBI and MBI are compared with the FIM and yet do not contain items on ‘communication’ or ‘social cognition’.

Procedure

The three occupational therapists who were members of the rehabilitation team were trained for approximately 2 hours in the use of the MBI and the OBI by the investigators. Training consisted of reviewing the operational definitions and specific guidelines related to each assessment, and practice using each tool with a written case study. Each rater was issued with operational definitions published in the original articles (Mahoney & Barthel, 1965; Shah et al., 1989) and with specific guidelines for the two versions of the BI modified from Collin and Wade (1988; Appendix 3). They were all credentialled FIM administrators. Two of the occupational therapists observed an ADL assessment being undertaken by the third occupational therapist. Some data, such as continence, was obtained by questioning the patient or nursing staff. The FIM score from the rehabilitation team was recorded for each patient. The two occupational therapists observing and the therapist administering the assessments scored the activities undertaken using the MBI and the OBI. The therapist administering the assessment changed according to who had been assigned to assess the patient’s showering ability. Scoring was completed independently by each rater. Score sheets were colour coded to indicate which therapist was scoring. At the conclusion of the data collection phase, the three occupational therapists completed a questionnaire to obtain information on their perception of the advantages and disadvantages of the three assessments, including operational definitions, scale sensitivity, ease of use and clinical utility.

RESULTS

Spread of scores for the OBI, MBI and brief FIM

Frequency distributions of scores generated by the three therapists using the OBI and MBI for 25 patients were constructed. A frequency distribution was also generated for the brief FIM. The results indicated that, while FIM and MBI assessments produce a range of scores, scores on the OBI are clustered and have a more limited range.

Individual item reliability of the OBI and MBI

Given the above finding, that the MBI produces a better spread of scores, we then examined whether the MBI and the OBI possess adequate item and scale inter-rater reliability. These are important questions because, often, greater scale sensitivity is achieved with a corresponding decrease in inter-rater reliability. Initially, the individual reliability of each of the items on the OBI and MBI was established. Following convention, this was achieved using the kappa coefficient. The results are summarized in Table 1. The reliability of four items could not be assessed due to lack of variability in the scoring, or missing scores (e.g. only wheelchair or ambulation is scored). All other items were found to possess moderate to good reliability, with the kappa scores ranging from 0.52 on the MBI for dressing to 0.88 for feeding.

Table 1. Summary of averaged kappa scores for original Barthel Index (OBI) and the modified Barthel Index (MBI) items for three raters

Item                                    OBI                  MBI
Personal hygiene/toilet                 0.61                 0.68
Bathing self                            not calculable       0.66
Feeding                                 0.85                 0.88
Getting on/off toilet                   0.67                 0.61
Stair climbing                          insufficient data    insufficient data
Dressing                                0.57                 0.52
Bowel control                           0.81                 0.67
Bladder control                         0.75                 0.60
Ambulation                              0.67                 0.68
Wheelchair (only if ambulation is 0)    insufficient data    insufficient data
Wheelchair or chair to bed transfers    0.66                 0.53

The inter-rater reliability of the OBI and MBI scales

Inter-rater reliability of each therapist’s scores using the OBI was calculated using the ICC (type 2,1; Shrout & Fleiss, 1979). Scores generated from both the OBI and the MBI were treated parametrically as argued in Unsworth, Thomas and Greenwood (1995). The ICC for the OBI was found to be 0.957. Inter-rater reliability of each therapist’s scores using the MBI was then calculated using the same method. The ICC for MBI was also found to be very good at ICC (type 2,1) = 0.979.
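
As a rough illustration of the two reliability statistics reported above, the sketch below computes Cohen's kappa averaged over rater pairs for a single item, and an ICC (type 2,1) for total scores using the mean-square formula given by Shrout and Fleiss (1979). The function names and the decision to average kappa over rater pairs are assumptions made for illustration: the article does not state how the kappa scores in Table 1 were averaged, and the study's raw data are not reproduced here.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical scores on the same patients.

    Returns None when chance agreement equals 1 (both raters used a single,
    identical category), in which case kappa is not calculable.
    """
    n = len(a)
    categories = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1:
        return None
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings_by_rater):
    """Average Cohen's kappa over every pair of raters (an assumed scheme)."""
    kappas = [cohen_kappa(a, b) for a, b in combinations(ratings_by_rater, 2)]
    if any(k is None for k in kappas):
        return None
    return sum(kappas) / len(kappas)


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    `scores` is a list of rows, one row per patient, each row holding one
    total score per rater.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    bms = ss_rows / (n - 1)  # between-patients mean square
    jms = ss_cols / (k - 1)  # between-raters mean square
    ems = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
```

The `None` return in `cohen_kappa` corresponds to the situation described above in which an item shows no variability in scoring, so that a kappa score cannot be calculated.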

Patterns of score shift between the OBI and the MBI

The data were also examined to determine if the therapists were consistent in any scoring shift when moving from the OBI to the MBI, for the 25 patients. For example, for the item ‘feeding’, a patient who needs assistance scores a 5 (with help) on the OBI. However, on the MBI the therapists could score a patient who needs assistance as a 2 (attempts task but unsafe), a 5 (moderate help required), or an 8 (minimal help required). This led the researchers to question if patient 1 was scored as a 5 on the OBI by all therapists, did they consistently rate the patient as a 2, 5 or 8 on the MBI? The findings for ‘feeding’ indicate that the raters used both 5 and 8 for patients who had been rated as 5 on the OBI, and did so consistently: of 13 patients originally rated as 5 by all three therapists, eight were subsequently given a rating of 8 on the MBI and five patients were consistently left with an MBI rating of 5. Thus, the raters appeared to be making use of the expanded scale, and did so consistently. Visual inspection of the data for all 10 items revealed that this was the case for six items. The six items were: feeding, transfers, grooming, bathing, toileting and ambulation. It was not possible to see this trend in the scoring for dressing, continence of bowels, continence of bladder or stair climbing, due to lack of variability in the data or lack of data (e.g. very few patients used a wheelchair).

Correlations between the MBI, OBI and FIM subsections

The Pearson product-moment correlation coefficient (Pearson’s r) was used to examine the relationships between the scores on the three tools given by each therapist. For this analysis, data from only four of the six FIM subsections was used. The FIM subsections ‘social cognition’ and ‘communication’ were removed from the analyses because the OBI and the MBI do not contain these items. Results shown in Table 2 indicate that results on the three assessments are highly correlated, with coefficients ranging from 0.86 to 0.96.

Table 2. Correlations between the Original Barthel Index (OBI), the Modified Barthel Index (MBI) and the brief Functional Independence Measure (FIM)

Correlation    Therapist A    Therapist B    Therapist C
MBI and OBI    0.90           0.96           0.93
MBI and FIM    0.91           0.89           0.90
OBI and FIM    0.86           0.87           0.90

DISCUSSION

Spread of scores for the OBI, the MBI and the brief FIM

As already suggested, the frequency distributions for the OBI, the MBI and the brief FIM indicated that both the MBI and the FIM can give a more ‘sensitive’ or precise indication of a patient’s status, due to the greater range of possible total scores. It becomes clear from this simple analysis that we can tell more about the patient’s ADL performance using the totals from the brief FIM and the MBI, than using the OBI.

Individual item reliability of the OBI and the MBI

All items were found to possess moderate to good reliability. The items that were found to be at or below 0.60 for the MBI were ‘dressing’, ‘wheelchair or chair to bed transfers’ and ‘bladder control’. The only item on the OBI with reliability of 0.60 or less was ‘dressing’. It is possible that the operational definitions for these items require revision to reduce any ambiguity. In other versions of the BI (Fortinsky et al., 1981; Granger, Albrecht & Hamilton, 1979) and in the FIM, ‘dressing’ has been divided into upper body dressing and lower body dressing. This could overcome the reliability problem and also provide a more sensitive score.

The inter-rater reliability of the OBI and the MBI

The ICC figures for both the OBI and the MBI were very high, suggesting that the tools possess excellent inter-rater reliability. It was expected that the ICC for the MBI might be lower than that for the OBI, simply as a function of the greater choice of scores available. However, this was not the case. The three therapists who participated in this study were diligent in their attention to detail on both

scales. It is also possible that having worked together for some time, these therapists have calibrated their style of rating. While the OBI has long since established inter-rater reliability, further studies of the MBI are required using multiple facilities and therapists.

Patterns of score shift between the OBI and the MBI

The consistent patterns of score shift when rating patients on the OBI and the MBI for several of the scale items provide further evidence of the reliability and increased sensitivity of the MBI. Although data plots have not been provided for visual inspection, these trends are of significance. They indicated that, in the main, scores from the OBI seemed to be ‘reliably’ or ‘consistently’ divided into two or three of the MBI categories.

Correlations between the MBI, the OBI and the FIM subsections

Results for the three assessments were found to be highly correlated for each of the three therapists. Although an expected result, this is reassuring as it indicates the stability of the tools, and the fact that these three assessments all seem to be measuring a patient’s level of functional independence.

Clinicians’ comments

The three therapists involved in the study were also asked several questions relating to their perceptions of the tools. They found it helpful to have all three scales ranked from most independent to most dependent. We suggest that the MBI operational definitions should be reversed to become 5-1, or independent to dependent. The raters should provide the patient with every possible opportunity to complete a task, rather than assist the patient too soon. While the therapists reported that the OBI was much easier to score than the MBI, they felt that the OBI did not always reflect the incremental nature of patient improvement made in rehabilitation. They also found it frustrating that patients were not rewarded with some score points on the OBI for partially completing tasks (e.g. bathing and grooming/personal toilet). There were two other areas of confusion with the operational definitions of the MBI. The first occurred with the term ‘on and off toilet’, which was misleading as it also included perineal hygiene. ‘Cleaning self’ is also included in ‘bowels’ and ‘on and off toilet’. Secondly, in ‘on and off toilet’ in the MBI, ‘washing hands’ is mentioned in level 3 but nowhere else. Overall, the therapists felt that the FIM was the easiest tool to use, and score, because of the availability of clear guidelines and the mandatory need for training. Although they felt that training to use any functional assessment tool was required to ensure reliable data, the clinicians felt that training was particularly important with the FIM.

Study limitations

A limitation of this study was the modest sample size. While studies of a similar nature that evaluate inter-rater reliability also use samples of approximately 25 patients (Cole, 1989; Collin et al., 1988; Kidd et al., 1995), future studies using larger patient numbers should be conducted. A further limitation of this study is the possibility that raters may have been influenced by observing each other administer the assessment, and thus altered their own practice as a consequence. All inter-rater reliability studies potentially suffer this limitation. In addition, it is assumed that the results of this study can be generalized to other clinical facilities. However, further research is required to confirm the external validity of this study’s findings.

CONCLUSION

This study has indicated that the OBI and the MBI both possess high inter-rater reliability when clinicians spend a short training period familiarizing themselves with the scoring protocols. The findings of this and other research presented in the introduction suggest that since the OBI, the MBI and the FIM all appear to possess sound psychometric properties, choice of tool remains a matter of personal preference. However, as the FIM and the MBI are more sensitive tools and have acceptable inter-rater reliability, they may be of more value in showing change over time and demonstrating more precise scores for ADL tasks. Factors that will continue to guide this choice include the range of items that are covered in the tool, clinical utility and time taken to administer the assessment. Clinicians must be aware of the importance of using a standardized version of the OBI, the MBI or the FIM

(as opposed to a composite version), and should always indicate which version is being used.

ACKNOWLEDGEMENTS

The authors wish to thank the occupational therapists, Colin Steel, Sally Dyte and Monique Baxter at the Greenvale Campus of North-west Hospital, for participating in the research. Their enthusiasm and dedication to the project ensured its success. We also wish to thank the School’s Senior Research Assistant, Diane Worrell, for skilled data management.

REFERENCES

Butler, A., Fricke, J., & Humphries, S. (1993). Standard assessment instruments suitable for use by ACATs and HACC Agencies. Lincoln Papers in Gerontology, 21, 1-74.
Cole, M. (1989). Inter-rater reliability of the Crichton Behaviour Rating Scale. Age and Ageing, 18, 57-60.
Collin, C., Wade, D. T., Davies, S., & Horne, V. (1988). The Barthel ADL Index: A reliability study. International Disability Studies, 10 (2), 61-63.
Di Fabio, R. P. (1990). Reliability and validity of functional assessment in patients with stroke. Journal of Neurological Rehabilitation, 4, 145-152.
Eakin, P. (1989). Problems with assessments of activities of daily living. British Journal of Occupational Therapy, 52, 50-53.
Eakin, P. (1993). The Barthel Index: Confidence limits. British Journal of Occupational Therapy, 56, 184-185.
Fortinsky, R., Granger, C., & Seltzer, G. (1981). The use of functional assessment in understanding home care needs. Medical Care, 19, 489-497.
Fricke, J. (1993). Measuring outcomes in rehabilitation: A review. British Journal of Occupational Therapy, 56, 217-221.
Fricke, J., Unsworth, C., & Worrell, D. (1993). The reliability of the Functional Independence Measure with occupational therapists. The Australian Occupational Therapy Journal, 40, 7-15.
Granger, C., Albrecht, G., & Hamilton, B. (1979). Outcome of comprehensive medical rehabilitation: Measurement by PULSES profile and the Barthel Index. Archives of Physical Medicine and Rehabilitation, 60, 145-154.
Granger, C. V., Cotter, A. C., Hamilton, B. B., Fiedler, R. C., & Hens, M. M. (1990). Functional assessment scales: A study of persons with multiple sclerosis. Archives of Physical Medicine and Rehabilitation, 71, 870-875.
Granger, C., Dewis, L., Peters, N., Sherwood, C., & Barrett, J. (1979). Stroke rehabilitation: Analysis of repeated Barthel Index measures. Archives of Physical Medicine and Rehabilitation, 60, 14-17.
Gresham, G., Phillips, T., & Labi, M. (1980). ADL status in stroke: Relative merits of three standard indexes. Archives of Physical Medicine and Rehabilitation, 61, 355-358.
Kidd, D., Stewart, G., Baldry, J., Johnson, J., Rossiter, D., Petruckevitch, A., & Thompson, A. (1995). The Functional Independence Measure: A comparative validity and reliability study. Disability and Rehabilitation, 17, 10-14.
Law, M. (1987). Measurement in occupational therapy: Scientific criteria for evaluation. Canadian Journal of Occupational Therapy, 54, 133-138.
Mahoney, F., & Barthel, D. (1965). Functional evaluation: The Barthel Index. Maryland State Medical Journal, 14, 61-65.
Murdock, C. (1992a). A critical evaluation of the Barthel Index part 1. British Journal of Occupational Therapy, 55, 109-111.
Murdock, C. (1992b). A critical evaluation of the Barthel Index part 2. British Journal of Occupational Therapy, 55, 153-156.
Shah, S., & Cooper, B. (1993a). Commentary on ‘a critical evaluation of the Barthel Index’. British Journal of Occupational Therapy, 56, 70-72.
Shah, S., & Cooper, B. (1993b). Issues in the choice of activities of daily living assessment. The Australian Occupational Therapy Journal, 40, 77-82.
Shah, S., Vanclay, F., & Cooper, B. (1989). Improving the sensitivity of the Barthel Index for stroke rehabilitation. Journal of Clinical Epidemiology, 42, 703-709.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
Unsworth, C. A., Thomas, S. A., & Greenwood, K. (1995). Rehabilitation team decisions concerning discharge housing for stroke patients. Archives of Physical Medicine and Rehabilitation, 76, 331-340.
Wade, D. T., & Collin, C. (1988). The Barthel ADL Index: A standard measure of physical disability? International Disability Studies, 10, 64-67.
Wellwood, I. D., Dennis, M. S., & Warlow, C. P. (1995). A comparison of the Barthel Index and the OPCS Disability Instrument used to measure the outcome of stroke. Age and Ageing, 24, 54-57.

Appendix 1. Scoring system for the original Barthel Index (Mahoney & Barthel, 1965)

Activity                                          Dependent    With help    Independent
Feeding                                           0            5            10
Bathing self                                      0            0            5
Dressing                                          0            5            10
Grooming/personal toilet                          0            0            5
Getting on and off toilet                         0            5            10
Controlling bowels                                0            5            10
Controlling bladder                               0            5            10
Moving from wheelchair to bed and return          0            5-10         15
Walking on level surface                          0            10           15
(or propelling wheelchair if unable to walk*)     0*           0*           5*
Ascend and descend stairs                         0            5            10
Total score                                                                 100

* Score only if unable to walk.

Appendix 2. Scoring system for the modified Barthel Index (Shah et al., 1989)

Column headings: 1 = Unable to perform task; 2 = Attempts task but unsafe; 3 = Moderate help required; 4 = Minimal help required; 5 = Fully independent.

Items/activity                          1     2     3     4     5
Feeding                                 0     2     5     8     10
Bathing self                            0     1     3     4     5
Dressing                                0     2     5     8     10
Grooming/personal hygiene               0     1     3     4     5
Toilet                                  0     2     5     8     10
Bowel control                           0     2     5     8     10
Bladder control                         0     2     5     8     10
Chair/bed transfers                     0     3     8     12    15
Ambulation                              0     3     8     12    15
Wheelchair (only if ambulation is 0)    0     1     3     4     5
Stair climbing                          0     2     5     8     10
Total score                                                     100
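
To make the weighting in Appendix 2 concrete, the following sketch converts a set of MBI level ratings (1 to 5) into weighted item points and a total out of 100. The point values are copied from Appendix 2; the dictionary layout, the item keys and the function name are illustrative assumptions and are not part of the published tool.

```python
# Points awarded at levels 1-5 for each MBI item, copied from Appendix 2.
MBI_POINTS = {
    "feeding": (0, 2, 5, 8, 10),
    "bathing": (0, 1, 3, 4, 5),
    "dressing": (0, 2, 5, 8, 10),
    "grooming": (0, 1, 3, 4, 5),
    "toilet": (0, 2, 5, 8, 10),
    "bowel_control": (0, 2, 5, 8, 10),
    "bladder_control": (0, 2, 5, 8, 10),
    "transfers": (0, 3, 8, 12, 15),
    "ambulation": (0, 3, 8, 12, 15),
    "wheelchair": (0, 1, 3, 4, 5),
    "stair_climbing": (0, 2, 5, 8, 10),
}


def mbi_total(levels):
    """Sum weighted MBI points for a dict mapping item name to level (1-5).

    The wheelchair item is counted only when ambulation scores 0 points,
    as stated in Appendix 2.
    """
    points = {}
    for item, level in levels.items():
        if level not in range(1, 6):
            raise ValueError(f"level for {item!r} must be between 1 and 5")
        points[item] = MBI_POINTS[item][level - 1]
    if points.get("ambulation", 0) != 0:
        points.pop("wheelchair", None)
    return sum(points.values())
```

For example, a level 4 rating (minimal help required) on feeding contributes 8 points under this table, whereas the OBI in Appendix 1 records 5 points for any patient who feeds with help.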

Appendix 3. Rules for using the modified and original Barthel Index for this research
It is recommended that the guidelines modified from Collin and Wade (1988) be used for any version of the Barthel adopted. They follow the general
thrust of the original article by Mahoney and Barthel (1965) and are well set out, logical and easy to interpret. They can be summarized as:
Use as a record of what a patient does, NOT as a record of what a patient could do.
The main aim is to establish degree of independence from any help, physical or verbal, however minor and for whatever reason.
The need for supervision renders the patient NOT independent.
Usually the performance over the preceding 24-72 h is important, but occasionally longer periods will be relevant.
Unconscious patients should score ‘0’ throughout, even if not yet incontinent.
Middle categories imply that patient supplies over 50% of the effort.
Use of devices to be independent is allowed.
If patient is using a wheelchair on admission, but is likely to be discharged with a walking aid, assess both methods of ambulation at admission.
Set up for feeding and dressing is not included in the rating; a patient can still score 5 if clothes are laid out or food placed on a tray or table.
Antiembolic stockings are not assessed as part of dressing.
Ambulation is assessed by observation and/or physiotherapy consultation or notes. Walking device is to be within reach of patient.
Stairs: the ability to go up and down three steps.
Bowel and bladder continence can be assessed from the nursing notes and/or observation.
If patient refuses to undertake an activity, score 0 and indicate refusal on score sheet.
In bladder assessment, an external device refers to condom drainage.
