https://doi.org/10.1007/s00405-020-06036-1
REVIEW ARTICLE
Abstract
Purpose Oral potentially malignant disorders (OPMDs) may have varying degrees of oral epithelial dysplasia (OED). Traditional grading schemes separate OED into three tiers (mild, moderate, and severe). Alternatively, a binary grading system has been proposed that stratifies OED into low-risk and high-risk categories based on a quantitative threshold of dysplastic pathologic characteristics. This systematic review evaluates the predictive value of a binary OED grading system and examines agreement between pathologists.
Methods This meta-analysis queried 4 databases (PubMed, Ovid-MEDLINE, Cochrane, and SCOPUS) and included 4 studies evaluating binary OED grading systems. Meta-analysis of proportions and correlations was performed to pool malignant transformation rates (MTR), risk of malignant transformation between OED categories, and measures of interobserver agreement.
Results Pooled analysis of 629 lesions from 4 different studies found a sixfold increase in the odds of malignant transformation in high-risk lesions over low-risk lesions [odds ratio (OR) 6.14, 95% CI 1.18–15.38]. Reported ORs ranged from 2.8 to 22.4. The overall MTR was 26.8%, with the high-risk and low-risk lesions having MTRs of 57.9% (95% CI 0.386–0.723) and 12.7% (95% CI −0.210 to 0.438), respectively. Pooled unweighted interobserver kappa values for the binary grading system and three-tiered system were 0.693 (95% CI 0.640–0.740) and 0.388 (95% CI 0.195–0.552), respectively.
Conclusion Binary grading of OED into low-risk and high-risk categories may effectively determine malignant potential, with improved interobserver agreement over three-tiered grading. Improved grading schemes of OED may help guide management (watchful waiting vs. excision) of these OPMDs.
European Archives of Oto-Rhino-Laryngology
hyperplasia, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ (CIS) [12]. Although squamous hyperplasia is considered benign, the spectrum of OED is separated into mild, moderate, and severe categories. Recognition of CIS is indicative of non-invasive malignant transformation. This system was updated in 2017, with CIS now used synonymously with severe dysplasia and “squamous hyperplasia” no longer being used [13]. The original three tiers of mild, moderate, and severe dysplasia remain.
Malignant transformation rates (MTR) of these grades have been well reported in the literature. In a meta-analysis comprising 24 studies with 1546 oral leukoplakia lesions treated with CO2 laser vaporization, Dong et al. [14] found an MTR of 5.23% in mild, 12.57% in moderate, and 24.98% in severe dysplasia, with an overall MTR of 4.5%. Similarly, in a meta-analysis including 9 studies and 992 patients with a mix of treated and untreated lesions, Mehanna et al. [15] found that severe dysplasia/CIS had an MTR of 24.1%, versus 10.3% for mild/moderate dysplasia.
OED grade, in addition to numerous patient and lesion-specific risk factors, can help guide management of these lesions. There are no clear guidelines regarding treatment or follow-up of OPMD with OED; however, generally speaking, mild dysplasia may be conservatively managed through watchful waiting, while severe OED may prompt excision of the lesion and frequent follow-up for recurrence [16]. Interpretation of the moderate dysplasia category may vary, with some clinicians taking a “watch and wait” approach whereas others elect to excise [16–18]. Multiple studies have shown poor to moderate interobserver agreement when pathologists have used the three-tiered OED grading system [17, 19–22]. As management may be contingent upon accurate risk stratification and consistent grading of these lesions, improvements in the reliability and predictive potential of OED grading may aid clinicians in best treating these lesions.
In 2006, a working group coordinated by the WHO Collaborating Centre for Oral Cancer and Pre-cancer proposed a two-tiered classification method for OED grading to reduce subjectivity and disagreement [23]. Kujan et al. [17] further developed this binary system by establishing a numerical threshold of 4 architectural and 5 cytological dysplastic features to differentiate between low-risk and high-risk lesions. Of note, Kujan et al. used the same architectural and cytological features evaluated with the WHO 2005 criteria. Since then, a number of single-institution studies have been performed to examine the efficacy of this binary grading system [17, 18, 24–28]. This proposed binary system will categorize the majority of mild dysplasia into the low-risk category and the majority of severe dysplasia into the high-risk category. The majority of dysplasia previously graded as moderate is hypothesized to be categorized into the high-risk category, but this will depend on whether histopathological examination finds greater than 4 architectural and 5 cytological dysplastic features.
The aim of this systematic review is to evaluate the predictive value of binary OED grading for malignant transformation and to examine the interobserver consensus between pathologists when using the binary versus the WHO 2005 grading system.

Methods

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [29]. Four databases were queried: PubMed (NLM NIH), Scopus (Elsevier), Cochrane Library (Wiley), and Ovid MEDLINE (Wolters Kluwer). The search strategy was first run in PubMed using subject headings (e.g., MeSH in PubMed) and keywords for the following concepts: oral dysplasia, oral premalignant disorders, World Health Organization, binary, classification, and grading. This PubMed search strategy was modified for the other three databases. The databases were searched from inception through September 21, 2019. A bibliography search of the included articles, as well as citing articles, was performed to identify additional studies to be included. References were uploaded to EndNote (Clarivate Analytics, Philadelphia, PA, USA) and screened for relevance. This systematic review does not involve human subjects and does not require IRB review.

Selection criteria

This systematic review included studies examining diagnostic efficacy and reproducibility of the binary classification of OED. Studies were considered for inclusion if they contained: (1) OED confirmed by pathology; (2) histological assessment of OED via binary grading with or without comparison to the WHO 2005 three-tiered grading system; and (3) rates of malignant transformation or interobserver consensus kappa values between pathologists. Exclusion criteria included: (1) insufficient data; (2) non-English language; (3) classification methods not pertinent to this review; (4) extra-oral lesions with epithelial dysplasia.

Data extraction

Two authors (FY and PR) independently extracted data from each relevant article. Any disagreement between the authors over the eligibility of particular studies was resolved through discussion with a third reviewer (SAN).
Extracted data included: study population demographics, diagnosis of histopathological dysplasia, malignant
transformation rate, follow-up time intervals, odds ratio for malignant transformation, and Cohen’s kappa values for interobserver agreement. Both unweighted and weighted kappa values were recorded. The weighted kappa value penalizes disagreements between closely related ordinal categories less heavily. For example, a disagreement between mild and moderate dysplasia would be weighted less than a disagreement between mild and severe dysplasia. Weighted kappa values are often used when three or more categories are involved. Because the binary classification system has only two categories, only unweighted kappa values were recorded for it. Both weighted and unweighted kappa values were recorded for agreement using the WHO 2005 grading system. The degree of interobserver agreement based on the kappa value is as follows: slight (0.01–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), and almost perfect (0.81–0.99) [30].

Quality review and assessment of risk of bias

The level of evidence for each included article was assessed using the Oxford Centre for Evidence-Based Medicine criteria [31]. The risk of bias was assessed according to the Cochrane Handbook for Systematic Reviews of Interventions version 5.1.0 [32]. Specifically, the ROBINS-I tool was used, as this systematic review evaluated non-randomized studies. Two authors (PR and FY) performed a pilot assessment on three studies to check for consistency of assessment. Both then performed independent risk assessment on the remaining studies. All disagreements were resolved by discussion with a third author (SAN). The risk of bias items included the following: bias due to confounding, selection of participants into the study, classification of interventions, deviations from intended interventions, missing data, measurement of outcomes, and selection of reported results. The risk of bias for each aspect was graded as “low”, “unclear”, or “high” [33].

Statistical analysis

Statistical analyses were performed using MedCalc 18.10.2 (MedCalc Software bvba, Belgium). All analyses were weighted according to the number of patients affected. Meta-analyses of correlations and proportions were performed to pool interobserver agreement kappa values, MTR, and odds ratios. MedCalc uses the Hedges–Olkin method for calculating the weighted summary correlation coefficient and uses a Fisher Z transformation of the correlation coefficients [34]. For all meta-analyses, the heterogeneity statistic (I²) was generated to determine which analytic model was used. If heterogeneity was high (I² > 50%), the random-effects model was used; if heterogeneity was low, a fixed-effects model was considered allowable. A p < 0.05 was considered to indicate a statistically significant difference for all statistical tests.
Potential publication bias was evaluated by visual inspection of the funnel plot (Supplemental Fig. 1). Finally, the Egger test, which statistically examines the asymmetry of the funnel plot, was performed for further assessment of the risk of publication bias [35, 36]. In a funnel plot, the treatment effect is plotted on the horizontal axis and the standard error on the vertical axis [37]. The vertical line represents the summary estimate derived using fixed-effect meta-analysis. Two diagonal lines represent (pseudo) 95% confidence limits (effect ± 1.96 SE) around the summary effect for each standard error on the vertical axis. These show the expected distribution of studies in the absence of heterogeneity or of selection bias. In the absence of heterogeneity, 95% of the studies should lie within the funnel defined by these diagonal lines. Publication bias results in asymmetry of the funnel plot.

Results

Search results

The initial database search provided 334 studies for review, with two additional studies identified from outside sources. Thirty-two duplicate studies were removed, leaving 304 studies for title and abstract screening. Full-text review of the remaining 36 studies identified 4 studies for quantitative analysis. Figure 1 provides a summary of this literature search. All studies were classified as Oxford Level of Evidence Type 2b given their retrospective, single-institution nature [38]. A funnel plot including data points regarding pooled interobserver agreement demonstrated little publication bias, as all studies remained within the funnel (Supplemental Fig. 1). Assessment for risk of bias for each included study is summarized in Supplemental Figs. 2 and 3.

Study characteristics

The four included studies [17, 18, 24, 28] comprised a total of 629 OPMDs in 613 patients, with 284 male and 261 female patients, forming a male:female ratio of 1.09. One study did not report the sex of its cohort [17]. Average age was 55.62 years, ranging from 21 to 94 years. The weighted-average follow-up period was 55.49 months. Table 1 provides an overview of all studies.

Malignant transformation and survival outcomes per binary classification and WHO 2005 classification

The overall weighted-mean time to malignant transformation of high-risk lesions was shorter than that of low-risk
Fig. 1 Overview of search strategy using PRISMA methodology
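The study counts given under Search results can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the two exclusion counts are derived here by subtraction rather than stated in the text.

```python
# Counts as stated in the Search results text of this review.
database_records = 334
other_sources = 2
duplicates_removed = 32
full_text_reviewed = 36
studies_included = 4

# Records left for title/abstract screening after de-duplication.
screened = database_records + other_sources - duplicates_removed

# Derived values (not reported in the text): exclusions at each stage.
excluded_at_screening = screened - full_text_reviewed
excluded_at_full_text = full_text_reviewed - studies_included

print(f"screened: {screened}")
print(f"excluded at title/abstract screening: {excluded_at_screening}")
print(f"excluded at full-text review: {excluded_at_full_text}")
```

The computed number of screened records agrees with the 304 studies reported for title and abstract screening.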
Table 1 Studies examining malignant transformation
Study (year) | Study type | Country | OLE | Subjects (n) | Male (n) | Female (n) | Mean age (years) | SD (years) | Age range (years) | OPMD (n) | Treatment | Site
Diajil (2013) [18] | Retrospective | United Kingdom | 2 | 100 | 68 | 32 | 52.5 | NR | 30–94 | 100 OPMD (76 OL, 16 ELP, 8 EP) | CO2 laser | 46 FOM; 33 tongue; 9 SP; 5 BM; 4 fauces; 2 alveolus; 1 RM
Kujan (2006) [17] | Retrospective | United Kingdom | 2 | 68 | NR | NR | NR | NR | NR | 68 OED | NR | NR
Liu (2012) [28] | Retrospective | China | 2 | 320 | 145 | 175 | 54.1 | 11.6 | 21–83 | 320 OL | Medication^a (261), Surgi- | 121 lateral/ventral T, 93 BM,
BM buccal mucosa, ELP erythroleukoplakia, EP erythroplakia, FOM floor of mouth, OED oral epithelial dysplasia, OL oral leukoplakia, OLE Oxford level of evidence, OPML oral pre-malignant lesion, P palate, RM retromolar, SP soft palate, T tongue
^a Medication includes vitamin A/unspecified herbal medication
Fig. 2 Forest plot of studies to determine the risk of malignant transformation of high-risk lesions over low-risk lesions

[...] sia, and severe grade dysplasia into high-risk dysplasia. The [...] the binary classification system [24, 17]. In general, most [...] 30 of their moderate dysplastic lesions into 14 (47%) low risk and 16 (53%) high risk. Nankivell et al. [24] and Kujan et al. [17] both included data regarding malignant transformation. After reclassification, 14 (88%) high-risk lesions [...] 17]. It was unclear how many reclassified low-risk lesions underwent malignant transformation. Four (36%) reclassified high-risk lesions and 5 (31%) low-risk lesions underwent malignant transformation in Nankivell’s cohort [24].

Discussion

Prognostic value

[Rotated table; the caption ends “and high-risk lesions”. Row and column alignment was lost in extraction; the recoverable headers and cell values follow in extraction order.]
Headers: H-R MT (%); Overall MT (%); OR high grade (95% CI); L-R MT (%); 5 years (%); time (mo)
Values: 23c (7–84); 45.41; 39.6; NR; 64; 74.15; 88.7; 69.6; NR; NR; 59%b; 29%a; NR; NR; 90.5%b; 63%a; NR; NR; 30/91 (33.0%); 10/26 (38.5%); 28/35 (80%); 3/56 (5.4%); 9/53 (17.0%); 2/44 (4.5%); 4.59 (1.36–15.38); 6.14 (1.18–15.38); 4.33 (2.55–7.36); 19/79 (13.5%); 46c (7–95); 61.2; Liu (2012) [28]
DFS defined by free from recurrence of post-excisional OPMD or malignancy
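The odds ratios reported in this review compare malignant transformation in high-risk versus low-risk lesions. As a minimal sketch of how a single-study OR and an approximate 95% confidence interval can be obtained from a 2×2 table, the example below uses the Woolf (log) method. The function name and the counts are hypothetical and are not taken from any included study; the review's pooled estimates were produced in MedCalc and would not be reproduced by this per-study calculation.

```python
import math


def odds_ratio_ci(events_high, total_high, events_low, total_low, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval for a 2x2 table."""
    a = events_high               # high-risk lesions that transformed
    b = total_high - events_high  # high-risk lesions that did not
    c = events_low                # low-risk lesions that transformed
    d = total_low - events_low    # low-risk lesions that did not
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell must be positive for this simple formula")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only: 20 of 30 high-risk lesions and
# 5 of 30 low-risk lesions undergo malignant transformation.
example_or, example_lower, example_upper = odds_ratio_ci(20, 30, 5, 30)
print(f"OR {example_or:.2f} (95% CI {example_lower:.2f}-{example_upper:.2f})")
```

With small cell counts the interval is wide, which is one reason single-institution estimates in this setting vary so much.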
Table 3 Pooled interobserver kappa values based on the classification system
Classification system | Exact agreement ratio | Unweighted kappa | Weighted kappa
WHO 2005 | 0.386 (0.299–0.467) | 0.302 (0.181–0.414) | 0.592 (0.504–0.669)
Binary | 0.752 (0.680–0.810) | 0.568 (0.463–0.657) | NR
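To make the difference between unweighted and weighted kappa concrete, the following sketch computes Cohen's kappa for two raters and maps a kappa value to the agreement bands cited in the Methods [30]. It is an illustration under stated assumptions, not the pooling procedure used in this review: linear disagreement weights are assumed for the weighted version, the example ratings are invented, and the handling of values at or below zero and between band edges is an assumption.

```python
def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters; linear disagreement weights if weighted."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    n = len(rater_a)
    index = {category: i for i, category in enumerate(categories)}

    # Cross-tabulate the two raters' grades.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[index[a]][index[b]] += 1
    row = [sum(counts[i]) for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger for distant grades.
        if weighted and k > 1:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed = sum(weight(i, j) * counts[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(weight(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k)) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - observed / expected


def agreement_band(kappa):
    """Label a kappa value using the bands listed in the Methods."""
    if kappa <= 0:
        return "no agreement beyond chance"  # assumption: not a band in the text
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented ratings for six lesions, for illustration only.
grades = ["mild", "moderate", "severe"]
pathologist_1 = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
pathologist_2 = ["mild", "moderate", "moderate", "severe", "severe", "severe"]

unweighted = cohen_kappa(pathologist_1, pathologist_2, grades)
linear_weighted = cohen_kappa(pathologist_1, pathologist_2, grades, weighted=True)
print(unweighted, agreement_band(unweighted))
print(linear_weighted, agreement_band(linear_weighted))
print(agreement_band(0.568), agreement_band(0.302))  # pooled values from Table 3
```

In this invented example both disagreements are between adjacent grades, so the weighted kappa is higher than the unweighted one. With only two categories the two versions coincide, which is consistent with recording only unweighted kappa for the binary system.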
Fig. 3 Reclassification of mild, moderate, severe dysplasia according to a binary grading method. a Mild dysplasia. b Moderate dysplasia. c Severe dysplasia. d Low-risk dysplasia. e High-risk dysplasia. This case of moderate dysplasia (b) reveals 3 architectural (loss of polarity of basal cells, increased mitotic figures, and dyskeratosis) and 4 cytological (anisonucleosis, anisocytosis, increased nuclear size, and nuclear:cytoplasmic ratio) dysplastic characteristics and would be reclassified as a low-risk lesion (d) (black arrow). Moderate dysplasia can also be reclassified into the high-risk category (e) (dotted arrow). The adjacent patient photographs show examples of clinical leukoplakias on the right ventrolateral tongue that corresponded to various degrees of microscopic dysplasia upon biopsy. All photomicrographs stained with hematoxylin and eosin, ×200 magnification. Architectural characteristics (a), cytologic characteristics (c)
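The binary grading rule described in this review reduces to a count-based threshold, which can be sketched as follows. This is a minimal illustration, not a validated grading tool: the function and constant names are hypothetical, and the boundary is treated as "at least 4 architectural and at least 5 cytological features", which is an assumption because the text describes the cut-off both as a threshold of 4 and 5 and as "greater than" those counts.

```python
# Thresholds as described for the binary system of Kujan et al. [17].
ARCHITECTURAL_THRESHOLD = 4
CYTOLOGICAL_THRESHOLD = 5


def binary_grade(architectural_features, cytological_features):
    """Return 'high-risk' or 'low-risk' from counts of dysplastic features.

    Assumption: a lesion is high-risk only when BOTH counts reach their
    thresholds (inclusive); anything else is low-risk.
    """
    if architectural_features < 0 or cytological_features < 0:
        raise ValueError("feature counts cannot be negative")
    meets_both = (architectural_features >= ARCHITECTURAL_THRESHOLD
                  and cytological_features >= CYTOLOGICAL_THRESHOLD)
    return "high-risk" if meets_both else "low-risk"


# The moderate dysplasia case in Fig. 3: 3 architectural, 4 cytological features.
print(binary_grade(3, 4))
```

For the Fig. 3 case the sketch returns low-risk, matching the reclassification described in the caption.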
These results indicate that the binary classification system reasonably reflects malignant transformation and prognosticates survival.
Of the included studies, only Nankivell et al. [24] and Diajil et al. [18] compared MTRs between different grades of the WHO 2005 classification. Nankivell et al. [24] found an OR of 2.25 for severe dysplasia over moderate, whereas Diajil et al. [18] found severe dysplasia to have an OR of 4.6 over mild dysplasia on univariate analysis. Interestingly, on multivariate analysis, this increased to an OR of 5.99. Kujan et al. [17] also noted a significant association between dysplasia grading, using both binary
and WHO 2005 classification, with malignant transformation. Based on this reported literature, the binary classification system has slightly better predictive potential for malignant transformation than the WHO 2005 grading system.
In the WHO 2005 classification model, moderate dysplasia has variable predictive value. This review shows that reclassifying moderate dysplasia into low-risk or high-risk categories gives variable results. In Kujan’s study [17], 16 “moderate” dysplastic lesions were reclassified as high-risk lesions. Fourteen of these underwent malignant transformation. In this case, the binary system was better able to differentiate the cases of “moderate” dysplasia, which are often ambiguous in malignant potential. This is in comparison to Nankivell et al. [24], where equal proportions of moderate dysplasia reclassified as high-risk or low-risk lesions had transformed into malignancies. In addition to considering the number of dysplastic pathological features, clinical characteristics of the lesion can be taken into account when reclassifying moderate dysplasia into low-risk and high-risk categories. For example, moderate dysplasia may be reclassified as a high-risk lesion if presenting with clinical high-risk features, such as lateral tongue or floor of mouth involvement, or speckled, multi-colored appearance. Also, moderate and severe dysplasia are often grouped together during assessment of malignant transformation, indicating a slight propensity of moderate dysplasia for more aggressive disease.

Comparison to other squamous intraepithelial lesions of head and neck

Previously, dysplasia of oral and laryngeal squamous intraepithelial lesions of the head and neck had been graded similarly based on the degree of epithelial involvement. Recently, the WHO 2017 criteria have simplified grading of laryngeal lesions into a two-tiered system, mainly by unifying the former moderate dysplasia, severe dysplasia, and carcinoma in situ into high-grade dysplasia. This change was based on a series of studies that characterized moderate dysplasia as behaving more similarly to severe rather than mild dysplasia [39–42]. Crissman and Sakr [43] had proposed a similar two-tiered classification of laryngeal dysplasia, splitting into squamous intraepithelial neoplasia (SIN I) (low-grade dysplasia) and SIN II (high-grade dysplasia). However, two-tiered classification systems have not been widely accepted for oral dysplastic lesions. Oral and laryngeal intraepithelial lesions share many commonalities, both being composed of keratinized mucosal epithelium located within the head and neck. Small nuances can still distinguish between the two subsites; one example is that rete ridge elongation is more prominent in OEDs. Although these differences exist, Cho et al. [44] did not find such characteristics to warrant completely different methods of evaluation between oral and laryngeal lesions. For more consistent and reliable evaluation across all head and neck intraepithelial lesions, it may be worth considering application of an adapted 2-tiered classification system over both.

Reproducibility

Several studies have commented on the increased reproducibility of the binary classification system. This may simply be because having fewer categories will naturally lower potential disagreement. However, the structured quantitative method of explicitly listing out dysplastic features may be a more objective measure of classification and contribute to better consensus agreement. The pooled kappa value using the binary system was 0.568, which is classified as moderate agreement [45]. This was compared with the unweighted kappa of the WHO 2005 system at 0.302. When only pathological interpretation by oral pathologists was included, kappa agreement of reads using both classification systems improved. In particular, the unweighted kappa of the binary system was improved over both the weighted and unweighted kappa of WHO 2005. Improved consensus scoring ensures higher reliability of the pathologists’ reads.

Limitations

This systematic review has several limitations. An electronic search of multiple databases was performed; however, there remains a possibility that pertinent studies were not caught by our search strategy. All included studies were retrospective in nature and carry their own risk of selection bias, uncontrolled confounding variables, and heterogeneity in reported outcomes. Studies varied in the types of OPMDs examined and in their treatment. The studies were also conducted across two different countries, which may lead to different interpretations by the pathologists. Also, the pooled data were limited by a small number of included studies and small sample size and may be more prone to bias. Overall, these limitations should be considered when interpreting the findings.

Future directions

The simplicity of a binary classification scheme may give clinicians a better tool in assessing malignant transformation and thus guiding management. Future prospective studies are warranted to optimize grading of OED, with possible tailoring towards type of OPMD. Incorporation of clinical risk features along with OED severity can be used to create an effective assessment of malignant potential of OPMDs.
29. Moher D, Liberati A, Tetzlaff J, Altman DG (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 339:b2535
30. Viera AJ, Garrett JM (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363
31. Howick J, Chalmers I (James Lind Library), Glasziou P, Greenhalgh T, Heneghan C, Liberati A, Moschetti I, Phillips B, Thornton H, Goddard O, Hodgkinson M. The Oxford 2011 levels of evidence. http://www.cebm.net/index.aspx?o1/45653
32. Higgins JPT, Green S (2011) Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0. The Cochrane Collaboration
33. Sterne JA, Hernán MA, Reeves BC et al (2016) ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 355:i4919
34. Hedges LV, Olkin I (1985) Statistical methods for meta-analysis. Academic, London
35. Egger M, Davey Smith G, Schneider M, Minder C (1997) Bias in meta-analysis detected by a simple, graphical test. BMJ 315(7109):629–634
36. Sterne JA, Egger M (2001) Funnel plots for detecting bias in meta-analysis: guidelines on choice of axis. J Clin Epidemiol 54(10):1046–1055
37. Sterne JAC, Sutton AJ, Ioannidis JPA et al (2011) Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ (Clinical research ed) 343:d4002
38. Centre for Evidence-Based Medicine (2009) Oxford centre for evidence-based medicine: levels of evidence. https://www.cebm.net/2009/06/oxford-centre-evidence-based-medicine-levels-evidence-march-2009/. Accessed 6 Feb 2020
39. Hu Y, Liu H (2014) Diagnostic variability of laryngeal premalignant lesions: histological evaluation and carcinoma transformation. Otolaryngol Head Neck Surg 150(3):401–406
40. Karatayli-Ozgursoy S, Pacheco-Lopez P, Hillel AT, Best SR, Bishop JA, Akst LM (2015) Laryngeal dysplasia, demographics, and treatment: a single-institution, 20-year review. JAMA Otolaryngol Head Neck Surg 141(4):313–318
41. Weller MD, Nankivell PC, McConkey C, Paleri V, Mehanna HM (2010) The risk and interval to malignancy of patients with laryngeal dysplasia; a systematic review of case series and meta-analysis. Clin Otolaryngol 35(5):364–372
42. Spielmann PM, Palmer T, McClymont L (2010) 15-Year review of laryngeal and oral dysplasias and progression to invasive carcinoma. Eur Arch Otorhinolaryngol 267(3):423–427
43. Crissman JD, Sakr W (2001) Squamous neoplasia of the upper aerodigestive tract: intraepithelial and invasive squamous cell carcinoma. In: Head and neck surgical pathology
44. Cho KJ, Song JS (2018) Recent changes of classification for squamous intraepithelial lesions of the head and neck. Arch Pathol Lab Med 142(7):829–832
45. McHugh ML (2012) Interrater reliability: the kappa statistic. Biochem Med (Zagreb) 22(3):276–282

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.