
Current Concepts

Analysis of Evidence-Based Medicine for Shoulder Instability

Kevin D. Plancher, M.D., and Sheryl L. Lipnick, D.O.

Abstract: Clinical research has become a major influencing factor in the determination of treatment
choice in our society. Outcome data have been requested by third-party payers, patients, and
administrators alike. Currently, there are over 10 different scoring systems that have been used to
evaluate the efficacy of treatment for shoulder instability. Some of these scoring systems are based
on the specific condition of shoulder instability; however, other systems are broadly based to
incorporate a spectrum of shoulder conditions. This review summarizes the process of proper
development and testing of the scoring systems, discusses their role in clinical research with respect
to shoulder instability, and explains the dichotomy of postoperative recurrence of instability and high
shoulder scores. The Shoulder Rating Questionnaire (SRQ), Melbourne Instability Shoulder Score
(MISS), Western Ontario Shoulder Instability Index (WOSI), Oxford Instability Score (OIS), and
Simple Shoulder Test were shown to be reliable for patients with instability. The SRQ, MISS, WOSI,
OIS, and American Shoulder and Elbow Surgeons score have all been shown to be largely
responsive. There are 2 shoulder scoring systems, the WOSI and the MISS, that we recommend be
used to evaluate shoulder instability. The SRQ and OIS were found to be less responsive for patients
with instability compared with patients with other shoulder dysfunctions. Other scoring systems lack
inter-rater reliability, validity, and/or responsiveness for patients in the instability population. The
optimal scoring system for patients with upper extremity problems other than those with shoulder
instability has yet to be determined; however, the American Shoulder and Elbow Surgeons score may
be considered, because this instrument has been proven to be valid, reliable, and responsive.

Key Words: Shoulder; Scoring systems; Outcome measurements; Validity; Melbourne Instability Shoulder Score (MISS); Western Ontario Shoulder Instability Index (WOSI).

An increasing number of outcome measurement systems have been designed to report on the effectiveness of treatment for shoulder pathologies. These scoring systems need to be valid, reliable, and responsive, with high intraobserver agreement. More emphasis on outcome data has been requested by not only patients but also administrators and third-party payers. Quality-of-life measures have become the standard means of assessing the results of health care interventions.1 There is a need in the orthopaedic community for well-designed outcome studies that measure the improvement of health status and quality of life of patients. Using the appropriate instrument is essential if outcome measures are to be valid and clinically meaningful. Recently, the patient’s perceptions have assumed greater importance in outcome measurements. Failure to account for these factors has been a major limitation in previous shoulder scoring systems.

When one is evaluating a shoulder scoring system, the different forms of treatment must be compared along with the results. Adequate outcome data to support the treatment strategies must be evaluated.

From the Orthopaedic Foundation for Active Lifestyles (K.D.P., S.L.L.), Cos Cob, Connecticut; Plancher Orthopaedics & Sports Medicine (K.D.P., S.L.L.), New York, New York; and Albert Einstein College of Medicine (K.D.P.), New York, New York, U.S.A.
The authors report no conflict of interest.
Address correspondence and reprint requests to Kevin D. Plancher, M.D., Orthopaedic Foundation for Active Lifestyles, 31 River Rd, Cos Cob, CT 06807, U.S.A. E-mail: kplancher@plancherortho.com
© 2009 by the Arthroscopy Association of North America
0749-8063/09/2508-0928$36.00/0
doi:10.1016/j.arthro.2009.03.017

Arthroscopy: The Journal of Arthroscopic and Related Surgery, Vol 25, No 8 (August), 2009: pp 897-908
898 PLANCHER AND LIPNICK

The scoring system should account for the confounding factors that may affect the outcomes.

An article was recently published on measuring arthroscopic outcome.2 The authors discussed how measures of symptoms, activities, and function may be specific to region/joint, disease, or injury. General outcome measures are usually less responsive to change in the patient’s status compared with more specific outcome measures, which focus primarily on the condition or population of interest to increase responsiveness. Region-specific measures have the advantage of being appropriate for a wide variety of injuries, whereas more specific measures may be more relevant to patients and clinicians. The authors further stated that a measure must be validated for a specific purpose such as the evaluation of a population of subjects with a specific condition. Not surprisingly, there has been a recent trend in the orthopaedic literature to evaluate region-specific scoring systems for acceptable psychometric parameters (validity, responsiveness, and reliability) for injury-specific pathologies.3,4

A review of the numerous knee injury–measuring outcomes currently in use was recently published in The Journal of the American Academy of Orthopaedic Surgeons.5 The author noted that there has been a shift from clinician-based outcomes tools to the development and validation of patient-reported outcomes measures. He concluded that a general health outcomes measure should be used in conjunction with one or more disease- or condition-specific rating scales.

Nagi6 developed a disablement scheme in 1965 to further describe functionality and the consequences of pathology. The term “disablement” refers to the consequences that a disease, injury, or congenital abnormality may have on human functionality at many different levels. Nagi’s disablement scheme characterizes the transition of active pathology to impairment, which is an anatomic or physiologic abnormality or loss. Impairment then leads to functional limitation and, finally, to disability.

The orthopaedic literature has traditionally focused on measures of impairment, which include range of motion, strength, stability, and pain. In the past clinicians determined the various outcome priorities without accounting for patient preferences. A standard, universally accepted shoulder scoring system for assessing the functional state of a normal, diseased, or postoperative shoulder does not exist. This is one of the most important factors preventing progress in clinical orthopaedics. As Kirkley and Griffin7 stated in their article on disease-specific quality-of-life measurement tools, “Nowhere are there more insensitive, unreliable, unvalidated measurement tools than in the orthopaedic literature.”

According to Gerber,8 the more commonly used systems have major deficiencies when applied to instability populations. One common deficiency is an insufficient number of items in the evaluation. Some response options are not mutually exclusive. In addition, some scoring systems rely on only part of the tool to be used to generate a score. Other grading systems use poorly defined techniques for performing examinations. Different authors weigh items arbitrarily. Lastly, there is inadequate consultation with patients in the item-generation process, leading to diminished comprehensiveness.

Leggin and Iannotti9 evaluated shoulder outcome measurements in 1999, concluding that the goal of any treatment intervention should be to improve the patient’s pain, function, and satisfaction. They urged for the creation of a patient self-assessment tool that measures outcome for the general population of shoulder patients in conjunction with a scale designed for the athletic population. In addition, the level of improvement in impairments such as range-of-motion and strength impairments should be compared with the patient’s assessment of outcome.

The solution to these problems involves a multifactorial approach. Knowledge of strengths and weaknesses of the traditionally used systems may serve as a guide for future improvements. There needs to be more awareness of appropriate methods by which these instruments should be developed and tested. Traditional systems need to be thoroughly evaluated to recognize deficiencies. Clinicians should be made aware of the newer systems supported by psychometric testing data.

INSTRUMENT DEVELOPMENT AND TESTING

A proper quality-of-life tool requires a formal development process and extensive instrument testing. The approach to this methodology is described by Guyatt and colleagues10 and is divided into 2 phases with a total of 9 steps. The first phase, the development process, involves 4 steps. The first step specifies measurement goals. The next step is item generation. This is both expert and patient based. The third step involves item reduction. The fourth step is questionnaire formatting. This often involves the visual analog
SHOULDER INSTABILITY 899

scale (VAS) and Likert scale. The first phase is outlined below in detail.

Development

Specifying Measurement Goals: Specific measurement goals must first be identified. The population must be adequately described, and the appropriate sampling of this population must be considered. Usually, this involves an evaluative instrument that detects important changes over time in health status. Sampling instruments can also be discriminative or predictive. Discriminative instruments differentiate between patients with different levels of health at one time, whereas predictive instruments predict the patient’s future health status. For the purpose of this article, the instrument should evaluate the quality of life for patients with symptomatic instability of the shoulder. Patients with problems that will prevent reliable completion of interviews or questionnaires, such as language barriers or psychiatric disorders, must be excluded. Patients who have other disorders that may contribute to shoulder dysfunction or other major illnesses that would influence their quality of life should also be excluded.

Item Generation: Item generation is a critical step in the overall development process of a measurement tool. Domains that represent overall health include physical symptoms, sports and recreation, work, lifestyle, and emotions. Item generation involves 3 steps: (1) a thorough review of the literature, including modifications of similar disease-specific quality-of-life outcome measures; (2) interviews with colleagues and experienced clinicians to identify issues involved with treating patients with the condition of interest (in this case, shoulder instability); and (3) interviews with patients with the condition of interest.

Item Reduction: To make a tool practical and reasonable in length, items must be eliminated. Focus groups can be very helpful in the item-reduction phase to identify questions that contain the most importance and frequency. Patients can identify items that have the greatest relevance to their experience based on the importance to their overall health. Items can also be compared with one another to eliminate those that measure a similar concept.

Question Formatting: Response systems need to be able to detect minimal but clinically significant changes within each subject. Two commonly used systems are the simple VAS and the Likert scale. The VAS consists of a 10-cm line on which patients can mark their response for each item. Patients respond by placing a mark on the line estimating the amount of that item they are experiencing. The Likert scale has 7 options for which patients indicate agreement with each statement. For example, terms such as excellent, very good, good, moderate, poor, very poor, and extremely poor are used. Ideally, the questionnaire should be at an eighth-grade reading level or less. The questions should be as short as possible and worded in a positive manner for ease of interpretation.

Testing

The second phase of the methodology used to formulate a proper quality-of-life tool, as originally described by Guyatt and colleagues,10 is discussed in the following sections. This phase involves rigorous instrument testing and includes 5 components: (1) pretesting, (2) reliability, (3) responsiveness, (4) validation, and (5) interpretability.

Pretesting: The pretesting phase identifies any problems that might arise with the set of items in the questionnaire. The clarity of the wording is fully evaluated. Subjects are given the questionnaire and asked to give their interpretation of each item, as well as to identify unclear questions or the need for the addition of any questions that were omitted.

Reliability: Reliability is the extent to which a system is consistent and free from error. This involves both testing and retesting. Repeat administration of a measurement tool is performed in stable subjects and should produce the same results. The instrument should be measured initially and then again after an interval long enough that the subject will have forgotten the previous responses but short enough that the responses will not have changed. The ideal time for retesting has not yet been determined. Reliability is important for discriminative tools, and intrarater and inter-rater reliability should be established. The intraclass correlation coefficient (ICC) is a statistical test of agreement between repeated measures and evaluates the ratio of between-subject variance to total variance. Values range from −1 to +1, with 0 indicating random correlation. Although the degree of reliability is not concrete, acceptable test-retest reliability is generally considered to have an ICC of 0.70 or greater.11 Internal consistency (IC), as measured by the Cronbach α statistic, is another measure of reliability. This measures how well individual items “hang” together. The acceptable IC as measured by the Cronbach α is greater than 0.60, where an α of 1.0 represents perfect IC.12 Other authors, however, have suggested that acceptable scores should be greater than 0.70.13
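
The two reliability statistics described above can be illustrated with a short sketch. The Python below is not taken from any of the cited studies. It assumes a one-way random-effects form of the ICC (one of several published forms) and the standard Cronbach α formula. The function names `icc_oneway` and `cronbach_alpha` and the sample scores are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def icc_oneway(scores):
    """One-way random-effects ICC for test-retest data (an assumed ICC form).

    scores: one list per subject, each holding the same number (k >= 2)
    of repeated administrations of the instrument.
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = mean(x for row in scores for x in row)
    subject_means = [mean(row) for row in scores]
    # Mean squares from a one-way analysis of variance.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def cronbach_alpha(responses):
    """Cronbach alpha for internal consistency.

    responses: one list per respondent, each holding one value per item.
    """
    k = len(responses[0])
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical test-retest scores for 5 stable subjects (2 administrations each).
retest = [[80, 82], [60, 58], [90, 91], [40, 45], [70, 69]]
reliable = icc_oneway(retest) >= 0.70  # threshold quoted in the text
```

A value at or above 0.70 would meet the test-retest threshold quoted above. The α result can be compared in the same way against the 0.60 or 0.70 thresholds reported by the different authors.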

Responsiveness: Responsiveness refers to the ability of a tool to measure change over time. There are a variety of methods that have been proposed to measure the responsiveness of a tool, including the responsiveness index, standardized response mean (SRM), relative efficiency, and effect size. The SRM is calculated by the following formula: (Mean postoperative scale − Mean preoperative scale)/SD of change in scale. The effect size is calculated by the following formula: (Mean postoperative scale − Mean preoperative scale)/SD of preoperative scale. Small effects are considered greater than or equal to 0.20, moderate effects are considered greater than or equal to 0.50, and large effects are considered greater than or equal to 0.80.14 The most frequently used formula, the SRM, is the method that best correlates with the power of a test and is most relevant to designing clinical trials.15 A highly responsive scale has the very important benefit of requiring fewer subjects in clinical trials to show a statistically significant difference between treatment groups.

Validation: Validity discusses the extent to which an instrument measures what it is intended to measure. There are various types of validity. Criterion validity is a way to show that the results of the new scale correlate to an external “gold standard.” Content validity is shown when the scale measures all important aspects of the condition to be examined. It addresses floor and ceiling effects. Construct validity refers to whether there is a correlation with other measures as could be predicted if an instrument is measuring what it is supposed to measure. Construct validation is the major category for health-related quality-of-life instruments. An extra component to validity is the sensitivity to change or responsiveness.

Currently, there is no shoulder scoring system available for instability or other conditions that has been established as the gold standard. Thus the validation process requires the use of multiple types of validity for verification. The scoring systems that will be reviewed in this article have been compared with other shoulder scoring systems, the Short Form (SF)–12, and the SF-36. Although some of these systems may have been validated for a combination of various types of shoulder dysfunction, many have not been validated specifically for the subgroup of instability. Although the SF-36 is a known standard for health status measurement, it was not originally developed to be used specifically in relation to outcomes of surgical treatment.

Interpretability: Interpretability relies on placing the magnitude of a change seen with a measurement tool in a context for physicians. The minimal important difference (MID) is the smallest difference in score that patients perceive as beneficial and that would mandate a change in management. The MID can help to judge the magnitude of benefit from different treatments, assist in calculating the sample size for clinical trials, and help to determine the proportion of patients who will benefit from a given treatment.

SCORING SYSTEMS

When creating a shoulder scoring system, patient input and feedback should be emphasized. The population for whom the tool is being designed should be adequately defined. The purpose of the instrument should be discriminative, evaluative, or predictive. Discriminative systems will differentiate between patients with different levels of a condition, whereas evaluative systems will determine the effectiveness of the treatment. Predictive systems will classify individuals against an external criterion. Self-administered scoring systems show less observer bias and are easier to use.

Overall, common guidelines and criteria for treatment need to be developed. Identification of effective treatments for similar conditions needs to be facilitated and comparative prospective studies encouraged. There is a great need for a validated measuring system. This system should be simple, effective, and strongly weighted toward functional outcome. It should satisfy the development testing criteria and be self (patient)–administered.

Various scoring systems for functional assessment of the shoulder have been reported.16,17 Unfortunately, these scores concentrated on range of motion, pain, functional limitation, or strength while the primary complaint in the patient with shoulder instability, and sometimes the only complaint, is apprehension or avoidance of activity. Because most instruments have not focused on apprehension, dichotomy has occurred in many recent studies between postoperative recurrence of instability and reporting of high shoulder scores.

The scoring systems that address instability, in no particular order, include the Rowe/modified Rowe score; the American Shoulder and Elbow Surgeons (ASES) score; the Shoulder Rating Questionnaire (SRQ); the Melbourne Instability Shoulder Score (MISS); the Disabilities of the Arm, Shoulder and Hand (DASH) score; the Western Ontario Shoulder Instability Index (WOSI); the Oxford Instability Score (OIS); the Constant-Murley (CM) score; the Athletic

TABLE 1. Characteristics of Shoulder Scoring Systems

Scoring System    No. of Questions  Total Score  Pain  Stability  Function  ROM  Strength  Satisfaction  S/O
ASES self-report  11                100          50    0          50        0    0         0             S
ASORS             6                 100          10    10         60        10   10        0             S/O
CM                7                 100          15    0          20        40   25        0             S/O
DASH              30                100          13    0          80        0    7         0             S
MISS              24                100          15    33         52        0    0         0             S
OIS               12                60           10    10         40        0    0         0             S
Rowe              3                 100          0     50         30        20   0         0             S/O
SRQ               21                100          40    0          60*       0    0         Separate      S
SST               12                12           2     0          7         3    0         0             S
UCLA              5                 35           10    0          10        5    5         5             S/O
WOSI              21                21           4     2          11        2    2         0             S

Abbreviations: ROM, range of motion; S, subjective; O, objective.
*Includes 15 points for global assessment.
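
The domain weights in Table 1 can be compared directly by expressing each as a share of the total score. The sketch below is only an illustration: the rows are transcribed from Table 1, the SRQ satisfaction domain is entered as 0 because it is scored separately, and the helper names `weights_are_consistent` and `stability_share` are hypothetical.

```python
# Rows transcribed from Table 1:
# (system, total score, pain, stability, function, ROM, strength, satisfaction)
TABLE_1 = [
    ("ASES self-report", 100, 50, 0, 50, 0, 0, 0),
    ("ASORS", 100, 10, 10, 60, 10, 10, 0),
    ("CM", 100, 15, 0, 20, 40, 25, 0),
    ("DASH", 100, 13, 0, 80, 0, 7, 0),
    ("MISS", 100, 15, 33, 52, 0, 0, 0),
    ("OIS", 60, 10, 10, 40, 0, 0, 0),
    ("Rowe", 100, 0, 50, 30, 20, 0, 0),
    ("SRQ", 100, 40, 0, 60, 0, 0, 0),  # satisfaction is scored separately
    ("SST", 12, 2, 0, 7, 3, 0, 0),
    ("UCLA", 35, 10, 0, 10, 5, 5, 5),
    ("WOSI", 21, 4, 2, 11, 2, 2, 0),
]


def weights_are_consistent(row):
    """Check that the domain weights of one row add up to its total score."""
    _name, total, *domains = row
    return sum(domains) == total


def stability_share(row):
    """Fraction of the total score that the row assigns to stability."""
    _name, total, _pain, stability, *_rest = row
    return stability / total


# Systems ordered from the largest to the smallest stability share.
ranked = sorted(TABLE_1, key=stability_share, reverse=True)
for row in ranked:
    print(f"{row[0]:<17} {stability_share(row):.0%}")
```

Ordering the systems this way shows at a glance which instruments give stability no weight at all, which bears on the dichotomy between recurrent instability and high shoulder scores.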

Shoulder Outcome Rating Scale (ASORS); the University of California, Los Angeles (UCLA) score; and the Simple Shoulder Test (SST). Of these, the WOSI and the MISS are rigorously designed and evaluated but not widely used. The more commonly used systems are also reviewed but have major deficiencies when applied to instability populations. The reader therefore should be cautious when making conclusions that could alter clinical decisions, because some scoring systems used may not be evaluating subjects adequately. The various characteristics of the different systems are summarized in Table 1.

Rowe Score

The Rowe score was initially described in 1978 as a method to assess the outcome of treatment for anterior shoulder instability after Bankart repair.18 There are a total of 100 points: 50 points for stability, 20 points for range of motion, and 30 points for function. There are, however, a number of problems associated with the scoring system. No published reports exist on development or testing. The test includes double-barrel questions, where subjects must consider more than one question at the same time. The assignment of weighting is not supported. The measurement for apprehension is not clearly defined. The type of range of motion, whether active versus passive, is not specified. In addition, loss of external rotation, which can be crucial for the overhead athlete, is allocated only a small percentage of the overall score. The scoring system combines 2 items of subjective evaluation with 1 physical examination, leading to a total score composed of separate attributes. Lastly, there is no specific evaluation of pain.

Modified Rowe Score

Jobe et al.19 made additions to the Rowe score, which resulted in the modified Rowe scale. The new scoring system took into consideration the ability to throw, the return to prior level of competition, and a pain scale. There are still a total of 100 points, with 10 points for pain, 30 points for stability, 10 points for range of motion, and 50 points for function. Similar problems exist for both the Rowe and modified Rowe scales. There are no published reports on development or testing of the modified Rowe scale. There are double-barrel questions, weighting is not supported, the measurement for apprehension is not clearly defined, and the scoring system combines 2 items of subjective evaluation with 1 physical examination. The main disadvantage of the new scoring system is the increased value given to athletic activities. This change decreases the sensitivity to the effect of treatment for nonathletic patients.

ASES Score

The ASES score was developed by the ASES Research Committee and published by Richards et al.20 in 1994. There is a self-reporting section with a 100-point score index. There is 1 question with respect to pain, which is marked on a 10-cm VAS and multiplied by 5 and is worth a total of 50 points. The pain score, unfortunately, is not specific to any particular activity. The section on activities of daily living is composed of 10 questions and is also worth 50 points. However, there are only 4 Likert categories, giving the tool a high likelihood of poor responsiveness. The system has been found to be reliable, valid, and responsive by

subsequent testing but not by the original developers.21 There is no rationale for weighting items, no “satisfaction” category, and no description on how items were selected. The physical assessment portion of the test, which is not scored, evaluates the following clinical components: range of motion, manual motor testing, tenderness to palpation, crepitus, impingement, apprehension, relocation test, general laxity, and voluntary instability.

Michener et al.21 examined the psychometric properties of the ASES patient self-reporting section in the Journal of Shoulder and Elbow Surgery in 2002. They made calculations based on data from 63 patients with various shoulder dysfunctions who underwent physical therapy for operative and nonoperative diagnoses. They concluded that the ASES system is a reliable, valid, and responsive outcome tool, which can be used as an evaluative and discriminative tool. The concern is the amount of responsiveness that the ASES system has for instability patients.

Kocher et al.12 reviewed the ASES scoring system in The Journal of Bone and Joint Surgery, American Edition, in 2005 and found that there is reliability, validity, and responsiveness of the ASES subjective shoulder scale in a heterogeneous population of patients with instability, rotator cuff disease, and glenohumeral osteoarthritis. There were a total of 455 patients with instability, 474 patients with rotator cuff disease, and 137 patients with glenohumeral osteoarthritis. They concluded that the ASES scale shows overall acceptable psychometric performance for outcomes assessment in all subgroups. They cautioned, however, that the ASES may not be an optimal outcome measure. Although the reliability is acceptable, it may not be precise enough to be used on an individual basis. The minimal difference in the ASES scale that represents a clinically important difference in functional status was not established. The limitations involve the use of a large, prospectively maintained computer database, which showed the heterogeneity among patients and the lack of uniformity among the surgical techniques that were used.

Shoulder Rating Questionnaire

In 1997 L’Insalata et al.13 published the SRQ, which is a self-administered evaluation of symptoms and function of the shoulder. The method for item generation is unknown. There are 6 separately scored domains: global assessment, pain, daily activities, recreational and athletic activities, work, and satisfaction. The satisfaction domain is composed of a single question that is scored and presented separately. An additional, nongraded domain allows the patient to select 2 areas in which he or she believes improvement is most important. Test-retest reliability was completed in 40 patients at a mean of 3 days. However, the diversity of the population chosen as well as the short time to retesting may have erroneously inflated the degree of correlation. The SRQ is applicable to a broad range of disorders related to the shoulder but may not be specific enough for use within the instability population.

Melbourne Instability Shoulder Score

The MISS is a self-administered tool that was developed in 2004.11 There are a total of 100 points, and some questions are not part of the scoring system. The test is valid and reliable and has been directly compared with the SRQ of L’Insalata et al.13 There is increased responsiveness compared with the SRQ, giving the MISS greater range to show change. Poor agreement exists between the MISS and the SRQ, and it is more difficult to obtain a high score on the MISS system. The authors believe that the MISS offers a greater spread of questions relating to functional tasks and instability than the WOSI, DASH, or ASES instruments. They believe that these other scoring systems may not truly reflect the full functional status of patients and may underestimate the patients’ clinical impairment. This, however, is only the authors’ opinion and has not been proven. The DASH and ASES instruments do not have as many specific questions relating to instability as the MISS and would most likely be unable to fully discover the clinical impairment of patients in that population. The WOSI, on the other hand, asks questions similar to those of the MISS but in a somewhat broader manner. For instance, the MISS, using a Likert scale, asks patients to mark their ability to carry heavy objects by their side, whereas the WOSI, using a VAS, asks patients to grade how much difficulty they experience lifting heavy objects below shoulder level. We suspect that both of these scoring systems have a role in evaluating patients with shoulder instability. The scoring system that best estimates the patient’s full functional status remains to be determined.

One of the sections asks for patients to give a percentage rating score, the percentage that they would give the affected shoulder compared with what they consider a “normal” shoulder. Overall, the MISS has a good correlation to the patient’s percentage rating score, unlike the SRQ. The system is sensitive

to clinical change in nonathletic and athletic populations. The MID is 5. The main disadvantage of the MISS system is that it is not applicable to the general shoulder population.

DASH Score

The DASH score was developed by the American Academy of Orthopaedic Surgeons along with the Institute for Work & Health (Toronto, Ontario, Canada). This is a general shoulder score that can be used for patients with any condition of any joint of the upper extremity. The methodology for item generation and reduction has been published,22 and a user’s manual is available.23 There are several advantages of the DASH instrument. It can be completed before a diagnosis is made, there is a detailed user’s manual, and the system covers a broad scope of pathologies. On the other hand, there are many disadvantages of the scoring system; the item-generation phase did not include interviews with patients, the format only uses 5-category Likert-type responses, there is redundancy of some questions, and the test is less responsive than condition-specific instruments, making it less effective in the field of research. We cannot recommend using this scoring system for patients with instability.

Western Ontario Shoulder Instability Index

Kirkley et al.15 developed the WOSI in 1998, the first in a series of disease-specific quality-of-life measure tools. The methodology is well described, using a VAS and applying equal weighting to questions based on uniformly high impact scores. There are a total of 21 items, with 4 domains. Physical symptoms account for 10 items; sport/recreation/work function, 4; lifestyle function, 4; and emotional function, 3. The WOSI was originally designed to be used as the primary outcome measure in the evaluation of clinical trial treatments for patients with shoulder instability. To make an instrument that could potentially be used to evaluate both nonoperative and operative treatments, the patients included in the study received no treatment or any combination of physical therapy and operative treatment.

In the original article published in The American Journal of Sports Medicine, Kirkley et al.15 compared the WOSI with other scoring systems. They administered the measurement tool to a randomly selected group of 47 patients undergoing treatment for anterior shoulder instability and found the WOSI to be more responsive than the Rowe, DASH, CM, ASES, UCLA, and SF-12 scoring systems. The WOSI correlated highly with the DASH scale, as predicted because many of the items are similar. The DASH scale, however, includes items relating to other joints of the upper extremity and conditions other than shoulder instability. There is a correlation with the UCLA system, reflecting the emphasis that patients place on pain, even with instability. The WOSI lacked a correlation with the CM scoring system. Range of motion has been shown to correlate poorly with shoulder function, and the CM system emphasizes range of motion.

Another study by Kirkley et al.,24 published in Arthroscopy in 1999, evaluated first-time anterior shoulder dislocators. The validated WOSI was used to show statistically better results in the surgically treated group compared with the non–surgically treated group. The authors also noted improvement in disease-specific quality of life by early surgical stabilization. This study shows appropriate clinical use of the WOSI to aid in the determination of functional improvement.

Oxford Instability Score/Oxford Shoulder Score

The Oxford Shoulder Score was developed by Dawson et al.25 Two different questionnaires exist; one was constructed in 1999 for instability patients, now called the OIS, and the other was constructed for shoulder operations other than instability, a few years earlier, in 1996.26 Both contain a total of 12 items scored from 1 to 5. There is no statement on how the items were selected or eliminated. Construct validity has been shown compared with other outcome tools. The OIS is more sensitive than the CM system but not as sensitive as the Rowe system. The reliability and sensitivity of the CM score relative to the OIS were significantly reduced over long-term follow-up. There is a possibility that the OIS may be used as an appropriate testing tool for patients with instability. However, with only 12 questions and less sensitivity than the Rowe system, we believe that other scoring systems would be better suited as testing instruments.

CM Score

The CM scoring system was developed in 1987 and is the most widely used instrument in the European literature.27 There are a total of 100 points, with 35 being assigned to pain and function and the remaining 65 for range of motion and strength. There is no category for patient satisfaction and only 1 pain scale,

making an inadequate assessment. The initial publication does not include any rationale for item selection, relative weighting, or data on validity or responsiveness. However, the method for administering the tool is well described, and reliability has been evaluated, but only on a limited basis.

Both Conboy et al.28 and Dawson et al.25 evaluated the CM scoring system in The Journal of Bone and Joint Surgery, British Edition, in 1996 and 1999, respectively. Conboy et al. believed that the CM scoring system is easy to use and has a low systematic error but is not sufficiently reliable for use in the clinical follow-up of patients. In addition, it is imprecise in repeated measurements and not sensitive for instability. Dawson et al. agreed that the CM system is relatively insensitive to changes in clinical status in patients with instability and therefore not an appropriate tool to measure outcome in patients with shoulder instability. They further concluded that the CM score places undue weight on concerns such as pain and power, which are frequently not relevant to patients with instability.

Athletic Shoulder Outcome Rating Scale

Tibone and Bradley developed the ASORS, which combines subjective and objective evaluations.29 The subjective evaluation is allocated 90 points, which includes pain, strength/endurance, stability, intensity, and performance. The other 10 points are allocated for objective testing. The ASORS may be more sensitive for high-level athletes, but the expectations and goals for shoulders in athletes are greater than those in the general population. There are multiple disadvantages to the scoring system; the scale is only applicable to competitive athletes, it does not define the position for testing range of motion, the responses have greater than 1 variable (similar to the UCLA system), it has a combined scoring system, and it requires proper psychometric testing. This scoring system may be adequate for competitive athletes, but further testing needs to be performed. The scoring system has yet to be found reliable, responsive, or valid.

UCLA Score

The UCLA scoring system was developed by Amstutz et al.30 in 1981. This system was intended to study patients who underwent a total shoulder arthroplasty for degenerative joint disease. The original version had a total of 30 points divided between pain, function, and muscle power/motion. There was no category for satisfaction. The newer version, introduced by Ellman et al.31 in 1986, has 5 separate areas scored for a total of 35 points. Pain is allocated 10 points; function, 10 points; active forward flexion, 5 points; strength in forward flexion, 5 points; and satisfaction, 5 points. An excellent score is 34 to 35 points, a good score is 29 to 33 points, and a poor score is less than 29 points. The following year, Ellman32 changed the scoring system so that an excellent score is 34 to 35 points; a good score, 28 to 33 points; a fair score, 21 to 27 points; and a poor score, 0 to 20 points.

The test is easy to administer, but there are no data on methodology of development or testing, and the position of forward flexion is not standardized. There are double-barrel questions with multiple variables and descriptors per item, and weighting is not supported. The older version of the UCLA scoring system can only be used after intervention and combines 2 items of subjective evaluation with 2 items of physical examination. The newer version should also be used only after intervention and combines 3 items of subjective evaluation with 2 items of physical examination. There are only 5 total questions on the newer version, with no specific questions on instability. The scoring system is minimally responsive for patients with instability and should not be used for this population.

Burkhart et al.33 published an article in Arthroscopy in 1994 in which they evaluated 14 patients who underwent partial repair of a massive rotator cuff tear. The mean score was 27.6 points, which is only between good and fair according to the criteria of Ellman32 from 1987 and poor according to the original criteria of Ellman et al.31 Although 6 of 14 patients had a fair or poor score, 13 of 14 patients answered “satisfied and better” on the rating scale, leading to doubt regarding the consistency of the subjective findings of the patient with the scoring system.

In The American Journal of Sports Medicine in 1996, Romeo et al.34 evaluated 52 patients with 53 shoulder stabilization procedures. The majority (85%) of the patients had an excellent result with the UCLA scoring system, but only 38% of the patients had an excellent result with the modified Rowe system. The authors believed that the UCLA scoring system was the most “forgiving” of the systems evaluated. For example, a patient can have no pain and have full use of the arm for activities of daily living yet lose 50% of his or her external rotation and still obtain an excellent result, with 28 of 30 points (scoring done before the addition of the satisfaction category).
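
The two sets of grading bands quoted in this section for the 35-point UCLA score (Ellman et al., 1986, and Ellman, 1987) can be written out as a short sketch to show how the same total can be graded differently. The function names are hypothetical and chosen only for illustration; the cutoffs are the ones stated in the text, and whole-number totals are assumed.

```python
# Illustrative sketch only: function names are hypothetical, and the cutoffs
# are the ones quoted in this section for the 35-point UCLA score.

def _check_total(score: int) -> None:
    """Reject totals outside the 0-35 range of the newer UCLA version."""
    if not 0 <= score <= 35:
        raise ValueError("UCLA total must be between 0 and 35 points")


def grade_ellman_1986(score: int) -> str:
    """1986 bands: excellent 34-35, good 29-33, poor below 29."""
    _check_total(score)
    if score >= 34:
        return "excellent"
    if score >= 29:
        return "good"
    return "poor"


def grade_ellman_1987(score: int) -> str:
    """1987 bands: excellent 34-35, good 28-33, fair 21-27, poor 0-20."""
    _check_total(score)
    if score >= 34:
        return "excellent"
    if score >= 28:
        return "good"
    if score >= 21:
        return "fair"
    return "poor"


if __name__ == "__main__":
    for total in (34, 28, 27, 20):
        print(total, grade_ellman_1986(total), grade_ellman_1987(total))
```

A total of 28 points is graded poor under the 1986 bands and good under the 1987 bands, and a total of 27 is poor under the first and fair under the second. A non-integer mean such as the 27.6 points reported by Burkhart et al. falls between the 1987 fair and good bands, which is why the text describes it as between good and fair.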

Simple Shoulder Test

The SST, developed in 1992 by Lippitt et al.35 from the University of Washington, has a total of 12 yes/no questions. The scale indirectly assesses pain, range of motion, and strength through the questionnaire. The main goal of the score is to document functional improvement resulting from a specified procedure by a specific surgeon, in response to a given diagnosis. There are no patient satisfaction items, and no formal testing of reliability was done by the authors. The dichotomous responses diminish sensitivity, and the development methodology does not include any patient input. The main advantage to the system is that it does not require a clinician to administer the test or any elaborate equipment.

Godfrey et al.36 evaluated the reliability, validity, and responsiveness of the SST in the Journal of Shoulder and Elbow Surgery in 2006. This article compared psychometric properties by age and injury type. Overall, the SST was found to be reliable, valid, and generally responsive. However, the SST was only moderately responsive in patients aged younger than 40 years regardless of injury type. The SST was also only moderately responsive in patients with instability injuries. The authors cautioned against using the system in a clinical investigation. Kirkley et al.17 published an article in Arthroscopy in 2003 that critically evaluated scoring systems for the functional assessment of the shoulder. They concluded that the SST is unlikely to be sensitive to small but clinically important changes in patient function because of the dichotomous response option. In addition, the test was likely to have poor discriminative function, with a decreased ability to differentiate between patients with varying severities of the same condition.

Additional Studies

Additional studies have been performed to compare and contrast the numerous scoring systems that have been used. Soldatis et al.,37 in 1997, performed a comparison study in healthy college athletes during the midseason. They compiled scores from the ASES, Rowe, UCLA, CM, and SST scoring systems. They added a sport-specific function item, an item for pain in high-demand situations, and items for the ability to weight train and throw. This was named the Baylor/Houston Athletic Shoulder Assessment Questionnaire. They concluded that significant shoulder symptoms exist in athletes during full participation in their respective sports. Pain was the most frequent symptom, occurring in 47% of all shoulders. The UCLA system was found to be the most sensitive, but an ideal scoring system has yet to be developed. The authors realized that a “normal” or “symptom-free” shoulder after injury or surgery may not be realistic.

Uhorchak et al.38 published their results on recurrent shoulder instability after open reconstruction in athletes who participated in collision and contact sports, in The American Journal of Sports Medicine in 2000. The study used both the ASES (before 1994) and Rowe systems for comparison evaluations. The ASES scores were much higher, because stability was only allocated 5 of 100 points. Conversely, a patient received a Rowe score of only 10 of 50 points if he or she had recurrent subluxation. The authors concluded that the ASES score may not be applicable to instability. In fact, the current ASES system does not even have a section for instability. The Rowe score more accurately reflects the occurrence of instability. However, recurrence of instability was not consistent with functional status or patient satisfaction in the population tested. The group with only rare (<3) subluxations had a low Rowe score but high satisfaction. Of note, the population was US military academy cadets, and all patients had to return to collision sports and military training after surgery.

As discussed previously, Romeo et al.34 evaluated and compared 4 commonly used scoring systems for shoulder evaluation: Rowe, modified Rowe, UCLA, and pre-1994 ASES scoring systems. This was a retrospective analysis of 53 consecutive patients undergoing shoulder stabilization procedures. The UCLA scores showed excellent results in 85% of patients, whereas the modified Rowe score showed excellent results in only 38%. The UCLA scoring system correlated poorly with the other systems. In addition, the inter-rater reliability among the 4 systems was poor. This shows that generalized results of an investigation can be biased based on the selection of a scoring system. Scoring systems are frequently used to describe the results of various procedures. Romeo et al.34 concluded that the lack of a widely accepted scoring system limits comparison of management for shoulder conditions.

DISCUSSION

Overall, there were a total of 11 scoring systems that were thoroughly evaluated. Systems were evaluated for proper development, reliability, validity, and responsiveness for a homogeneous subset of patients with known instability (Table 2). Only 3 of the 11 scoring systems, the MISS, the DASH, and the WOSI, appear to have used an adequately described development process. Most of the other systems either did not provide a description of item generation and/or reduction or provided a limited description.

TABLE 2. Development, Reliability, Validity, and Responsiveness for Shoulder Scoring Systems

Scoring System      Proper Development  Reliable for Instability     Valid for Instability  Responsive for Instability
ASES self-report20  No                  No (ICC ≥0.75*; IC, 0.61)    Large                  Moderate/large (SRM, 0.54/0.93)
ASORS29             No                  Unknown                      Small                  Unknown
CM27                No                  Unknown                      Small                  Moderate (SRM, 0.59)
DASH22,23           Yes                 Unknown                      Small                  Moderate (SRM, 0.71)
MISS11              Yes                 Yes (ICC, 0.98)              Moderate               Large (SRM, 1.57)
OIS25,26            No                  Yes (IC, 0.91)               Moderate               Large (ES, 0.8)
Rowe18              No                  No                           Small                  Moderate (SRM, 0.76)
SRQ13               No                  Yes† (ICC ≥0.94, IC ≥0.71)   Moderate               Large (SRM, 1.4)
SST35               No                  Yes (ICC, 0.97)              Moderate               Moderate (SRM, 0.63; ES, 0.61)
UCLA30,31           No                  Unknown                      Small                  Small (SRM, 0.39)
WOSI15              Yes                 Yes (ICC, 0.95)              Large                  Large (SRM, 0.93)

Abbreviations: ES, effect size; SRM, standardized response mean.
*Ten of eleven domains. Testing was performed in patients with instability, rotator cuff disease, and glenohumeral arthritis.
†The article13 states that the instability group was evaluated separately; however, exact numbers were not provided.

Some of the scoring systems, as depicted in Table 1, include both subjective and objective components. It has been suggested that clinical examination variables be collected and reported separately, because these variables tend to have very poor reliability and correlate poorly with patients’ subjective evaluations of their function, even when performed by experienced clinicians.15 The Rowe, CM, ASORS, and UCLA scoring systems all have objective components incorporated into them.

Reliability was determined by both the ICC and IC. IC was acceptable at levels greater than 0.70, with 0.80 indicating good, 0.90 indicating excellent, and 1.0 indicating perfect. As discussed earlier in this review, this cutoff is subject to debate, because some authors classify levels greater than 0.60 as acceptable. However, numerous systems evaluated had an IC level greater than 0.90, and therefore we believed that the higher limit should be used. Some studies were found to be reliable for a broad category of patients with multiple pathologies, not just instability. If there was no specific population of patients with instability noted, the scoring system was unable to be labeled as reliable. The SRQ, MISS, WOSI, OIS, and SST systems were all found to be reliable for instability. The ASES system was found to be unreliable, with an IC of 0.61 in instability patients and an ICC of 0.75 or greater in only 10 of 11 domains in all patients, including those with pathologies other than instability. There has been no reliability testing for the Rowe, DASH, CM, ASORS, or UCLA systems.

Responsiveness was evaluated by both the SRM and the effect size. The ASORS has not yet been evaluated. The UCLA system was found to be minimally responsive, with an SRM of 0.39. The Rowe, DASH, CM, and SST systems were found to be moderately responsive. The ASES system was found to be largely responsive (SRM, 0.93) by Kocher et al.12 However, Kirkley et al.15 found the ASES system to be only moderately responsive (SRM, 0.54), placing it below the Rowe, DASH, and CM systems. Systems that were found to be largely responsive include the SRQ, MISS, WOSI, and OIS.

The validity of the scoring systems was also evaluated taking into account construct, criterion, and content validity. Many authors claim that certain scoring systems are valid; however, there currently is no formal gold standard that can be used for comparison to determine criterion validity. Thus criterion validity cannot solely be used for verification. There is also no specific set of questions to truly predict that what is being measured is actually what is supposed to be measured. Therefore the determination of validity is very difficult and somewhat subjective. For instance, the model used for criterion validity for the SRQ was the Arthritis Impact Measurement Scales 2, which has been validated in patients with osteoarthritis. Although the authors of the SRQ concluded that their findings support validity for patients in each diagnostic group, including instability, they are using an arbitrary scoring system that has not been validated for patients with instability. In addition, the MISS questionnaire used the SRQ as a comparison tool to aid in the validation process, based on the claim that the SRQ is one of the few self-administered shoulder scores that has been shown to be a reliable and valid measure for the assessment of glenohumeral instability. There is clearly some confusion when it comes to the validation of a system when there is no clear gold standard for comparison.

At present, validity cannot be confirmed in patients with instability for the Rowe, DASH, CM, ASORS, or UCLA scoring systems. The SRQ, MISS, OIS, and SST systems all appear to have moderate validity. The ASES and WOSI systems have both been shown to have significant support to conclude high validity for patients with instability.

Even though the ASES system has been shown to be valid, we strongly caution against using it for patients with instability, because it does not appear to optimally determine outcome measurement and was found to be less responsive in patients with instability compared with other pathologies. Like the ASES, the SRQ may not be specific enough in the evaluation of patients with instability.

The SST was found to be only moderately responsive in patients aged younger than 40 years as well as patients presenting with instability. This scoring system will most likely be unable to detect small but clinically important changes and therefore is not an ideal system to use in patients with instability.

Although the OIS has potential, the development process is questionable and the Rowe system was found to have a larger effect size after direct comparison, implying that the OIS is even less responsive than the Rowe in patients with instability. The MISS, a relatively new scoring system, also appears to have potential for use in the evaluation of patients with shoulder instability. The development process is adequate, and the system has been shown to be responsive, reliable, and valid by use of the SRQ for comparison.

The WOSI has been shown to be reliable, valid, and responsive in patients with instability. The WOSI was originally tested in patients undergoing both nonoperative and operative treatment, making the scoring system available for use in both subsets of patients. However, there was no distinction in scoring in the 2 groups, leaving the reader to wonder whether the WOSI could differentiate between the benefits of surgery versus physical therapy. Kirkley et al.24 did follow the original article using the WOSI in a prospective, randomized clinical trial comparing early arthroscopic stabilization versus rehabilitation in first-time anterior shoulder dislocators. They concluded that each of the 4 domains reached or approached statistical significance.

CONCLUSIONS

It is imperative to develop common guidelines and criteria for treatment, which will facilitate identification of effective outcomes for similar conditions. Comparative prospective studies should be encouraged. There is a need for a standardized measuring system that incorporates the following: (1) it must satisfy the needs of those using it, (2) it must satisfy development testing criteria, (3) it must be strongly weighted toward functional outcome, (4) it must be simple and effective, and (5) it should include self (patient)-assessment.

Both the WOSI and the MISS meet these criteria. The SRQ and the OIS both have potential but may or may not be adequately responsive in the instability population. Although a discussion on the optimal scoring system for shoulder pathologies other than shoulder instability is beyond the scope of this article, the ASES scoring system may be considered, because it has been proven to be valid, reliable, and responsive. We recommend using both the WOSI and the MISS to evaluate patients with instability to make clinical decisions. Further testing evaluating these tools in a clinical setting will determine whether these scoring systems can later be used as the gold standard for patients with shoulder instability.

Standardizing the scoring system used in clinical studies performed for the treatment of shoulder instability will lead to more consistent results as well as results that may be able to be compared between studies. With the multitude of scoring systems used in clinical trials, a direct comparison between studies is difficult and what is tested may not even be an accurate measure of shoulder instability outcome measures.

There are still a number of outstanding questions. We have yet to determine whether problem-specific measuring tools are more desirable than a single system. This review is specific to shoulder instability problems and would not necessarily be adequate for the arthritic shoulder or other shoulder dysfunctions. It is possible that the development of population-specific tools for competitive athletes and other groups may be advantageous; however, it may merely add further confusion and difficulty for testing.

REFERENCES

1. Fitzpatrick R, Fletcher A, Gore S, Jones D, Spiegelhalter D, Cox D. Quality of life measures in health care. I: Applications and issues in assessment. BMJ 1992;305:1074-1077.

2. Irrgang J, Lubowitz JH. Measuring arthroscopic outcome. Arthroscopy 2008;24:718-722.
3. Crawford K, Briggs K, Rodkey W, Steadman R. Reliability, validity, and responsiveness of the IKDC score for meniscus injuries of the knee. Arthroscopy 2007;23:839-844.
4. Martin R, Kelly B, Philippon M. Evidence of validity for the Hip Outcome Score. Arthroscopy 2006;22:1304-1311.
5. Wright R. Knee injury outcomes measures. J Am Acad Orthop Surg 2009;17:31-39.
6. Nagi S. Some conceptual issues in disability and rehabilitation. In: Sussman M, ed. Sociology and rehabilitation. Washington, DC: American Sociology Association, 1965;100-113.
7. Kirkley A, Griffin S. Development of disease-specific quality of life measurement tools. ISAKOS scientific committee report. Arthroscopy 2003;19:1121-1128.
8. Gerber C. Integrated scoring systems for the functional assessment of the shoulder. In: Matsen FA, Fu FH, Hawkins RJ, eds. The shoulder: A balance of mobility and stability. Rosemont, IL: American Academy of Orthopaedic Surgery, 1992;531-550.
9. Leggin BG, Iannotti JP. Shoulder outcome measure. In: Iannotti JP, Williams GR, eds. Disorders of the shoulder: Diagnosis and management. Philadelphia: Lippincott Williams & Wilkins, 1999;1023-1040.
10. Juniper EF, Guyatt GH, Jaeschke R. How to develop and validate a new health-related quality of life instrument. In: Spilker B, ed. Quality of life and pharmacoeconomics in clinical trials. Philadelphia: Lippincott-Raven, 1996;49-56.
11. Watson L, Story I, Dalziel R, Hoy G, Shimmin A, Woods D. A new clinical outcome measure of glenohumeral joint instability: The MISS questionnaire. J Shoulder Elbow Surg 2005;14:22-30.
12. Kocher M, Horan M, Briggs K, Richardson T, O’Holleran J, Hawkins R. Reliability, validity, and responsiveness of the American Shoulder and Elbow Surgeons subjective shoulder scale in patients with shoulder instability, rotator cuff disease, and glenohumeral arthritis. J Bone Joint Surg Am 2005;87:2006-2011.
13. L’Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MGE. A self-administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg Am 1997;79:738-748.
14. Kane R. Outcome measures. In: Kane R, ed. Understanding health care outcomes research. Gaithersburg, MD: Aspen Publishers, 1997;17-18.
15. Kirkley A, Griffin S, McLintock H, Ng L. Development and evaluation of a disease-specific quality of life measurement tool for shoulder instability (WOSI). Am J Sports Med 1998;26:764-772.
16. Kirkley A. Scoring systems for the functional assessment of the shoulder. Tech Shoulder Elbow Surg 2002;3:220-233.
17. Kirkley A, Griffin S, Dainty K. Scoring systems for the functional assessment of the shoulder. ISAKOS scientific committee report. Arthroscopy 2003;19:1109-1120.
18. Rowe CR, Patel D, Southmayd WW. The Bankart procedure: A long-term end-result study. J Bone Joint Surg Am 1978;60:1-16.
19. Jobe F, Giangarra C, Kvitne R. Anterior capsulolabral reconstruction of the shoulder in athletes in overhand sports. Am J Sports Med 1991;19:428-434.
20. Richards RR, Arik N, Bigliani LU, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg 1994;3:347-352.
21. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons standardized shoulder assessment form, patient self-report section: Reliability, validity, and responsiveness. J Shoulder Elbow Surg 2002;11:587-594.
22. Hudak P, Amadio P, Bombardier C. Development of an upper extremity outcome measure: The DASH (Disabilities of the Arm, Shoulder and Hand) [corrected]. The Upper Extremity Collaboration Group (UECG). Am J Ind Med 1996;29:602-608.
23. Solway S, Beaton DE, McConnell S, Bombardier C. The DASH outcome measure user’s manual. Toronto, Ontario: Institute for Work & Health, 2002.
24. Kirkley A, Griffin S, Richards C, Miniaci A, Mohtadi N. Prospective randomized clinical trial comparing the effectiveness of immediate arthroscopic stabilization versus immobilization and rehabilitation in first traumatic anterior dislocations of the shoulder. Arthroscopy 1999;15:507-514.
25. Dawson J, Fitzpatrick R, Carr A. The assessment of shoulder instability. The development and validation of a questionnaire. J Bone Joint Surg Br 1999;81:420-426.
26. Dawson J, Fitzpatrick R, Carr A. Questionnaire on the perceptions of patients about shoulder surgery. J Bone Joint Surg Br 1996;78:593-600.
27. Constant CR, Murley AHG. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res 1987:160-164.
28. Conboy V, Morris R, Kiss J, Carr A. An evaluation of the Constant-Murley shoulder assessment. J Bone Joint Surg Br 1996;78:229-232.
29. Tibone JE, Bradley JP. Evaluation of outcomes for athletes’ shoulders. In: Matsen FA, Fu FH, Hawkins RJ, eds. The shoulder: A balance of mobility and stability. Rosemont, IL: American Academy of Orthopaedic Surgery, 1992;519-529.
30. Amstutz HC, Sew Hoy AL, Clarke IC. UCLA anatomic total shoulder arthroplasty. Clin Orthop Relat Res 1981:7-20.
31. Ellman H, Hanker G, Bayer M. Repair of the rotator cuff: End-result study of factors influencing reconstruction. J Bone Joint Surg Am 1986;68:1136-1144.
32. Ellman H. Arthroscopic subacromial decompression: Analysis of one- to three-year results. Arthroscopy 1987;3:173-181.
33. Burkhart SS, Nottage WM, Ogilvie-Harris DJ, Kohn HS, Pachelli A. Partial repair of irreparable rotator cuff tears. Arthroscopy 1994;10:363-370.
34. Romeo AA, Bach BR Jr, O’Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med 1996;24:472-476.
35. Lippitt SB, Harryman DT II, Matsen FA III. A practical tool for evaluating function: The Simple Shoulder Test. In: Matsen FA, Fu FH, Hawkins RJ, eds. The shoulder: A balance of mobility and stability. Rosemont, IL: American Academy of Orthopaedic Surgery, 1992;501-518.
36. Godfrey J, Hamman R, Lowenstein S, Briggs K, Kocher M. Reliability, validity, and responsiveness of the Simple Shoulder Test: Psychometric properties by age and injury type. J Shoulder Elbow Surg 2006;16:260-267.
37. Soldatis J, Moseley JB, Etminan M. Shoulder symptoms in healthy athletes: A comparison of outcome scoring systems. J Shoulder Elbow Surg 1997;6:265-271.
38. Uhorchak JM, Arciero RA, Huggard D, Taylor DC. Recurrent shoulder instability after open reconstruction in athletes involved in collision and contact sports. Am J Sports Med 2000;28:794-799.
