
RESEARCH ARTICLE

Seizure Detection in Continuous Inpatient EEG


A Comparison of Human vs Automated Review
Taneeta Mindy Ganguly, MD, Colin A. Ellis, MD, Danni Tu, MS, Russell T. Shinohara, PhD, Kathryn A. Davis, MD, MSCE, Brian Litt, MD, and Jay Pathmanathan, MD, PhD

Correspondence: Dr. Pathmanathan, jay.pathmanathan@pennmedicine.upenn.edu

Neurology® 2022;98:e2224-e2232. doi:10.1212/WNL.0000000000200267

More online: Class of Evidence. Criteria for rating therapeutic and diagnostic studies: NPub.org/coe

Abstract

Background and Objectives
The aim of this work was to test the accuracy of Persyst commercially available automated
seizure detection in critical care EEG by comparing automated seizure detections to human
review in a manually reviewed cohort and on a large scale.

Methods
Automated seizure detections (Persyst versions 12 and 13) were compared to human review in
a pilot cohort of 229 seizures from 85 EEG records and then in an expanded cohort of 7,924
EEG records. Sensitivity, specificity, positive predictive value (PPV), and negative predictive
value (NPV) were calculated for individual seizures (pilot cohort) and for entire records (pilot
and expanded cohorts). We assessed EEG features associated with the accuracy of automated
seizure detections.

Results
In the pilot cohort, accuracy of automated detection for individual seizures was modest
(sensitivity 0.50, PPV 0.60). At the record level (did the recording contain seizures or not?),
sensitivity was higher (pilot cohort 0.78, expanded cohort 0.91), PPV was low (pilot cohort
0.40, expanded cohort 0.08), and NPV was high (pilot cohort 0.88, expanded cohort 0.97).
Different software versions (version 12 vs 13) performed similarly. Sensitivity was higher for
records containing focal-onset seizures compared to generalized-onset seizures (0.93 vs 0.85,
p = 0.012).

Discussion
In critical care continuous EEG recordings, automated detection of individual seizures had rates
of both false negatives and false positives that bring into question its utility as a seizure alarm in
clinical practice. At the level of entire EEG records, the absence of automated detections
accurately predicted EEG records without true seizures. The true value of Persyst automated
seizure detection appears to lie in triaging of low-risk EEGs.

Classification of Evidence
This study provides Class II evidence that an automated seizure detection program cannot
accurately identify EEG records that contain seizures.

From the Department of Neurology (T.M.G., C.A.E., K.A.D., B.L., J.P.), and Penn Statistics in Imaging and Visualization Endeavor (PennSIVE) Center of Excellence (D.T., R.T.S.), Center
for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania; Department of Biostatistics, Epidemiology, & Informatics (D.T., R.T.S.) and
Center for Biomedical Image Computing and Analytics (R.T.S.), University of Pennsylvania, Philadelphia.

Go to Neurology.org/N for full disclosures. Funding information and disclosures deemed relevant by the authors, if any, are provided at the end of the article.

e2224 Copyright © 2022 American Academy of Neurology


Copyright © 2022 American Academy of Neurology. Unauthorized reproduction of this article is prohibited.
Glossary
cEEG = continuous long-term EEG; EMU = epilepsy monitoring unit; GEE = generalized estimating equation;
ICU = intensive care unit; NPV = negative predictive value; PPV = positive predictive value.

Neurology | Volume 98, Number 22 | May 31, 2022 | Neurology.org/N

The use of continuous long-term EEG (cEEG) in critical care settings continues to rise, supported by consensus recommendations and emerging evidence that it improves outcomes.1,2 Manually reviewing large volumes of EEG data is labor intensive, necessitating an alternative method of quickly interpreting cEEG. Nearly all neurophysiologists use quantitative EEG, and up to half do not review all pages of EEG.3 Automated seizure detectors have the potential to improve the efficiency of cEEG review by triaging low-risk EEGs and to allow the timely identification and treatment of seizures.3-6 The use of automated seizure detection systems is becoming increasingly widespread, but data proving their reliability and how best to apply them are limited.

Persyst is the most widely used commercially available automated seizure detection software with Food and Drug Administration clearance. It is used by thousands of neurologists at hundreds of hospitals worldwide, including 48 of the 50 U.S. News & World Report’s top-ranked hospitals.7 While Persyst offers quantitative spectral arrays to reflect EEG patterns, these quantitative visual tools have not been thoroughly validated in adults, have demonstrated significant variability, and require appropriate standardization before being applied routinely for clinical decision-making.3,8-11 However, the software does offer an automated seizure detection tool that asserts a yes or no interpretation of whether a sample of EEG is consistent with a seizure. Overall, there has been little independent assessment of the Persyst algorithms; the largest and most rigorous studies have been performed in affiliation with the company itself.12-14 Patient selection for validation is also a potential issue that could influence reports of the software performance: the accuracy of the Persyst automated seizure detection tool has been studied primarily in epilepsy monitoring units (EMUs)4,15 and ambulatory EEGs in adults.16 Critical care patients present an additional challenge for automated seizure detection due to abnormal background rhythms and unusual ictal patterns that may confound automated algorithms.17 Yet, automated seizure detection is arguably most relevant in the intensive care unit (ICU), where seizures are common, rapid treatment is desired, and manual review is often not immediately available.6,18,19

We studied the performance of Persyst automated seizure detection in a large sample of continuous ICU EEG recordings. We measured the performance of automated detections at the level of individual seizures (whether each seizure was correctly detected) and the accuracy of automated detections at the level of EEG records (whether records were correctly identified as containing seizures, even if individual seizures were not accurately detected). The ability of automated detection software to correctly identify individual seizures is critical for use as a real-time alarm, while record-level detection is relevant for triaging studies for manual EEG review. The study presented here provides a systematic and unbiased analysis of the automated seizure detection performance of Persyst in inpatient long-term continuous EEG monitoring. Our primary research question sought to evaluate the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of this tool at both the individual seizure and record levels.

Methods

Standard Protocol Approvals, Registrations, and Patient Consents
This study was approved by the institutional review board at the University of Pennsylvania with waiver of informed consent.

Data Collection
For this study, we considered only 24-hour cEEG recordings performed on inpatients at the Hospitals of the University of Pennsylvania (the Hospital of the University of Pennsylvania, Presbyterian Medical Center, and Pennsylvania Hospital), excluding patients admitted to the EMU. In all cases, cEEG had been clinically requested by the primary inpatient team to evaluate for seizures. EEG electrodes were placed by trained and registered EEG technologists using the international 10-20 system and with eye leads. EEG data were collected at a minimum sampling rate of 256 Hz using Natus (Natus Inc, Pleasanton, CA) XLtek equipment running Natus Neuroworks version 8.5. Persyst version 12 or 13 (Persyst Inc, Solana Beach, CA), henceforth labeled P12 or P13, was run on each EEG, either at the time of data capture or later (as outlined below). EEGs were interpreted by trained epileptologists credentialed to interpret EEGs at the University of Pennsylvania. EEG reports were generated with customized software that stores EEG interpretations in a searchable SQL database. Searchable fields include demographic data and terms from the American Clinical Neurophysiology Society ICU nomenclature (including background amplitude, organization, symmetry, rhythmic and periodic patterns, and details on seizures). Because these fields are required by our EEG system for report generation, there were no missing data. For this retrospective study, 2 EEG datasets were generated.

We first identified a pilot cohort of cEEG recordings that contained seizures. We selected ICU EEG reports coded as containing seizures, recorded from 2015 to 2019, and randomly selected 23 EEG recordings from 23 different patients. Because the prevalence of seizures among inpatient cEEG
records at tertiary care centers can vary widely (8%–48%)1 and because we did not know the exact prevalence of seizures in our population before the outset of this study, we chose a large study at a center similar to ours that demonstrated that 27% of inpatient records contained seizures to guide our sample size.20 On the basis of this study, we randomly selected 62 additional cEEGs without seizures mentioned in the EEG reports to roughly imitate the prevalence of seizures in the ICU population at large. All 85 EEGs were manually reviewed by 2 expert reviewers (T.M.G., J.P.) blinded to any prior annotations in the EEG records. Seizures were defined by standardized, widely accepted criteria for electrographic seizures.21 The onset and offset times of all seizures were marked independently, and differing marks were adjudicated via joint review by both reviewers. These studies were then run through the Persyst 13 analyzer using default parameters, and seizure markings were extracted from the EEG comments.

In addition to seizures, the following features were extracted from each EEG report and coded as present or absent: (1) low voltage (<20 μV); (2) posterior dominant rhythm; (3) artifact severe enough to obscure identification of epileptiform discharges; (4) status epilepticus; (5) lateralized periodic discharges; (6) focal slowing; (7) severe diffuse slowing; (8) state changes; (9) normal sleep architecture; and (10) poor background organization, defined as any background that was not described in the report as good or fair (this includes backgrounds coded as suppressed, absent organization, discontinuous, and poorly organized). Additional clinical data extracted from the EEG report database included age, sex, neurologic injury/condition, and mental status at the time of recording. If an automated detection fell anywhere within the span of the seizure annotated by the human reviewers or within 10 seconds before or after this window, the detection was marked as concordant with the human read. Any other automated seizure detection was coded as a false detection. For this study group, we excluded Persyst seizure detections that occurred while a patient was disconnected from EEG. We also manually rereviewed the EEG at every Persyst seizure detection to ensure that no seizures were missed by the human reviewers (this never occurred in our dataset).

We next identified an expanded cohort to study algorithm performance at scale. We examined all cEEGs recorded between December 1, 2017, and October 30, 2020, and included all cEEGs that were analyzed by P12 or P13 at the time of recording, a total of 7,924 cEEG studies from 2,854 unique patients. The report database was queried for human-reported seizures and compared to the presence or absence of Persyst automated seizure detections at any point in the record. In this dataset, no effort was made to exclude patient disconnections or excessive artifact as a cause of false automated detection, reflecting real-world practice.

To examine the seizure detections across a dataset of this size, an EEG comment extractor was created in collaboration with Natus Neuroworks that allowed us to directly read the EEG files, including both human- and Persyst-generated comments. We extracted 552,623 comments from this dataset, focusing on detection events that included the terms “RhythmicBurst” or “SeizureDetected.” These detection events were labeled by the authors as automated seizure detections by Persyst. The software used to extract comments was developed with permission from Natus Medical Inc using algorithms from their software developers kit. For this dataset, we queried report information, including background details and presence of focal vs generalized seizures. Precise seizure timing is not a queryable field in the report database.

Statistical Analysis
Statistical analysis was performed in R (version 3.6.0, R Foundation for Statistical Computing, Vienna, Austria). The clinical characteristics of patients with and without seizures were compared by Mann-Whitney U test for continuous variables (age) and χ² or Fisher exact test for categorical variables (sex, neurologic injury, mental status, for a total of 7 tests, Bonferroni-corrected α = 0.007).

First, we analyzed the pilot cohort for the accuracy of automated seizure detections at the level of individual seizures. To account for clustered data with multiple seizures within each EEG, we used generalized estimating equations (GEEs) with exchangeable working correlation structure to estimate the sensitivity and PPV of Persyst-detected seizures compared to human-detected seizures.22 The GEE estimates are different from the empirical sensitivity and PPV in that they can be interpreted as marginal or population-averaged quantities that account for within-cluster correlation. GEE models were implemented with the R package geepack.23 CIs for these estimates were then found with the robust sandwich estimator. True negatives (no human-detected seizure, no Persyst-detected seizure) were not countable events, so specificity and NPV could not be calculated. Next, to determine whether background features of the EEG affect the accuracy of automated seizure detection, we stratified the EEGs according to each of 10 background features and compared the sensitivity of automated detections for records with presence vs absence of each feature using Mann-Whitney U tests (10 tests, Bonferroni-corrected α = 0.005).

Next, we analyzed the accuracy of automated seizure detection in the pilot cohort at the level of the EEG record rather than the individual seizure level. Each EEG record was coded as having ≥1 human-detected seizures (yes/no) and as having ≥1 Persyst-detected seizures (yes/no), regardless of when during the record these detections occurred. Sensitivity, specificity, NPV, and PPV of Persyst-detected seizures compared to human-detected seizures were calculated. Adjustment for clustered data was not performed because each EEG record in this cohort was an independent observation. Exact binomial CIs were calculated with the R package epiR.24

For our expanded cohort of 7,924 EEG records, seizure detection was again analyzed at the level of each entire EEG record. Similar to our methods for pilot cohort record-level review, we
coded records as having human-detected seizures (yes/no) and/or Persyst-detected seizures (yes/no). To account for clustered data with multiple EEG recordings from the same individual, we again used GEEs with exchangeable working correlation structure22 to estimate the sensitivity, specificity, NPV, and PPV of Persyst-detected seizures compared to human-detected seizures. CIs were found with the robust sandwich variance. For subgroup analyses, we used similar GEE models with a subgroup indicator variable to enable comparisons. Subgroup comparisons were (1) records with low background voltage vs records with normal background voltage, (2) records analyzed by different software versions (P12 vs P13), and (3) records that contained focal-onset seizures vs generalized-onset seizures. This last subanalysis of seizure onset was limited to records that contained human-detected seizures, so only sensitivities could be calculated. Statistical testing was performed with the Wald test.

Data Availability
Anonymized data that support the findings of this study are available from the corresponding author on reasonable request.

Results

Characteristics of the pilot and expanded cohorts are shown in Tables 1 and 2, respectively. Patients with seizures vs those without seizures did not differ in their clinical characteristics except for mental status in the expanded cohort, which reached marginal statistical significance but was not significant when corrected for multiple comparisons (α = 0.007).

In the pilot cohort, the 23 records with seizures contained a total of 229 human-detected seizures (median 7, range 1–33 seizures per record). Performance metrics are shown in Table 3 and Figure 1. Persyst correctly detected 111 of 229 individual seizures (adjusted sensitivity 0.50, 95% CI 0.34, 0.66). In these 23 studies, Persyst detected a total of 183 seizures. Of these, 111 of 183 were true seizures according to human readers (adjusted PPV 0.60, 95% CI 0.42, 0.75).

We next tested whether specific EEG features were associated with successful Persyst detection of individual seizures, as shown in Figure 2. Sensitivity was significantly lower in records with low-voltage background compared to records without

Table 1 Patient Characteristics, Pilot Cohort

Patient characteristic | Patients with seizures (n = 23) | Patients without seizures (n = 62) | p Value
Age, median (range), y | 61 (28–96) | 63 (28–90) | 0.45a
Female, n (%) | 11 (48) | 31 (50) | 1.00b
Underlying neurologic injury, n (%) | | | 0.63c
Epilepsy | 3 (13) | 7 (11) |
CNS infection/inflammation | 2 (9) | 3 (5) |
Brain tumor | 3 (13) | 11 (18) |
Postneurosurgery | 0 (0) | 4 (7) |
Hypoxic ischemic encephalopathy | 1 (4) | 9 (15) |
Traumatic brain injury | 4 (17) | 3 (5) |
Toxic/metabolic | 3 (13) | 5 (8) |
Nontraumatic intracerebral hemorrhage | 4 (17) | 10 (16) |
Ischemic stroke | 2 (9) | 7 (11) |
Unexplained/other | 1 (4) | 3 (5) |
Mental status, n (%) | | | 0.84c
Awake | 7 (31) | 21 (34) |
Somnolent or obtunded | 6 (26) | 19 (31) |
Coma, sedated | 4 (17) | 11 (18) |
Coma, unsedated | 4 (17) | 9 (15) |
Unknown | 2 (9) | 2 (3) |

a Mann-Whitney U test.
b The χ² test.
c Fisher exact test.

Table 2 Patient Characteristics, Expanded Cohort

Patient characteristic | Patients with seizures (n = 353) | Patients without seizures (n = 2,501) | p Value
Age, median (range), y | 63 (17–99) | 63 (16–101) | 0.43a
Female, n (%) | 169 (48) | 1,172 (47) | 0.76b
Mental status, n (%) | | | 0.02c
Awake | 169 (45) | 964 (39) |
Somnolent or obtunded | 104 (28) | 773 (31) |
Coma, sedated | 90 (24) | 562 (23) |
Coma, unsedated | 9 (2) | 144 (6) |
Unknown | 2 (1) | 37 (1) |

a Mann-Whitney U test.
b The χ² test.
c Fisher exact test.

low-voltage background (Mann-Whitney U test, p = 0.001). No other EEG feature was significantly associated with successful automated detection of individual seizures.

Because detection of some but not all seizures within an EEG record may be adequate for triaging EEG, we assessed the performance of Persyst in identifying seizures at the record level. That is, does the automated seizure detection identify the presence of a seizure anywhere in the record? In our pilot cohort of 85 EEG records, Persyst detected seizures in 18 of 23 records that contained seizures (sensitivity 0.78, 95% CI 0.56, 0.93). Persyst also detected seizures in 27 of 62 records that did not contain seizures (false alarm rate 44%, 95% CI 31%, 57%).

On the basis of our finding that Persyst performed particularly poorly at the individual seizure level for EEGs with low-voltage backgrounds, we performed a post hoc analysis in which we removed the low-voltage EEGs and then repeated the record-level analysis. In the remaining 18 EEGs containing seizures and without low-voltage background, Persyst correctly detected seizures in all 18 of 18 at the record level, corresponding to both a sensitivity and an NPV of 100%.

We expanded these analyses to a dataset of 7,924 EEG records from 2,854 unique individuals to determine the performance of Persyst at the record level at scale. Human-detected seizures were present in 786 of 7,924 records (10%). Automated seizure detections were present in 6,079 of 7,924 records (77%). The accuracy of Persyst at the record level is shown in Figure 1 and Table 3. Results in this expanded cohort overall showed trends similar to our pilot cohort. Persyst detected seizures in 723 of 786 records that contained seizures (adjusted sensitivity 0.91, 95% CI 0.88, 0.93). Persyst also detected seizures in 5,356 of 7,138 records that did not contain seizures (adjusted false alarm rate 0.74, 95% CI 0.72, 0.75). The PPV of Persyst detections was low (0.08, 95% CI 0.07, 0.09), meaning that a human reader would have to read 12.5 EEGs in which automated detections occurred to find 1 record with true seizures. On the other hand, the NPV was high (0.97, 95% CI 0.96, 0.98), meaning that if Persyst did not detect seizures, there was only a 3% chance that true seizures were present. To account for potential bias from repeated EEGs from single individuals, we limited the analysis to only the first EEG from each subject (2,854 unique individuals/EEGs). Results were similar, as demonstrated in eTable 1, links.lww.com/WNL/B938. Assuming that each record in which there were no human or automated seizure detections was ≈24 hours (total 44,280 hours), Persyst accurately identified 42,768 hours of EEG as being seizure-free.

We performed several subgroup analyses in this expanded dataset, using the same GEE model as previously described

Table 3 Performance of Automated Seizure Detection

Cohort | Total, n | Sensitivity | Specificity | PPV | NPV
Seizure level, pilot cohort | 229 | 0.50 (0.34, 0.66) | — | 0.60 (0.42, 0.75) | —
Record level, pilot cohort | 85 | 0.78 (0.56, 0.93) | 0.56 (0.43, 0.69) | 0.40 (0.26, 0.56) | 0.88 (0.73, 0.96)
Record level, expanded cohort | 7,924 | 0.91 (0.88, 0.93) | 0.26 (0.25, 0.28) | 0.08 (0.07, 0.09) | 0.97 (0.96, 0.98)

Abbreviations: NPV = negative predictive value; PPV = positive predictive value.
Values in parentheses are 95% CIs.
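
The record-level values in Table 3 come from a 2×2 comparison of automated detections against human review. As a minimal sketch (not the authors' R code; the class and function names are hypothetical), the pilot-cohort point estimates can be recomputed from the counts given in Results: 18 of 23 seizure records detected and 27 of 62 seizure-free records flagged. The expanded-cohort counts are used here only for the seizure-free hours arithmetic, because the published expanded-cohort estimates are GEE-adjusted and raw ratios are not expected to match them exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordCounts:
    """Record-level 2x2 counts: automated detection vs human review."""
    tp: int  # detection present, human-confirmed seizure present
    fn: int  # no detection, seizure present
    fp: int  # detection present, no seizure
    tn: int  # no detection, no seizure


def metrics(c: RecordCounts) -> dict:
    """Unadjusted point estimates (no clustering correction, no CIs)."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Pilot cohort, counts as reported in Results.
pilot = RecordCounts(tp=18, fn=23 - 18, fp=27, tn=62 - 27)
pilot_metrics = metrics(pilot)

# Expanded cohort, raw counts as reported in Results.
expanded = RecordCounts(tp=723, fn=786 - 723, fp=5356, tn=7138 - 5356)

# Assumes roughly 24 hours per record, as stated in Results.
HOURS_PER_RECORD = 24
hours_without_detections = (expanded.tn + expanded.fn) * HOURS_PER_RECORD
hours_seizure_free = expanded.tn * HOURS_PER_RECORD

if __name__ == "__main__":
    for name, value in pilot_metrics.items():
        print(f"pilot {name}: {value:.2f}")
    print("hours without automated detections:", hours_without_detections)
    print("hours correctly identified as seizure-free:", hours_seizure_free)
```

The hour totals reproduce the 44,280 and 42,768 hours quoted in Results, which shows that the first figure counts every record without an automated detection and the second counts only those that were also seizure-free on human review.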

Figure 1 Performance of Automated Seizure Detection

(A–D) Seizure level refers to detection of individual seizures. Because the absence of both human- and Persyst-detected seizures (true negatives) was not a countable event, specificity and negative predictive value (NPV) were not calculated at the seizure level. Record level refers to detection of a seizure anywhere in the EEG record. PPV = positive predictive value.

Figure 2 Sensitivity of Automated Seizure Detection Stratified by EEG Features

EEGs with low-voltage backgrounds had significantly lower sensitivities of automated seizure detection at the level of individual seizures compared to EEG
records with normal-voltage backgrounds (p = 0.001). All other features we assessed did not significantly affect the sensitivity of automated seizure detection.
LPD = lateralized periodic discharge; PDR = posterior dominant rhythm.
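
Seizure-level sensitivity depends on the matching rule given in Methods: an automated detection counts as concordant if it falls within the human-marked seizure span or within 10 seconds before or after it. A minimal sketch of that rule for a single record follows. It assumes each detection is a single time point in seconds, which the article does not specify, and all names and example times are hypothetical. It returns raw ratios, not the GEE-adjusted estimates reported in the article.

```python
from typing import List, Optional, Tuple

Seizure = Tuple[float, float]  # (onset_s, offset_s) as marked by human reviewers

TOLERANCE_S = 10.0  # margin before onset and after offset, per Methods


def is_concordant(detection_s: float, seizure: Seizure,
                  tol: float = TOLERANCE_S) -> bool:
    """True if the detection falls inside the seizure span widened by tol."""
    onset_s, offset_s = seizure
    return onset_s - tol <= detection_s <= offset_s + tol


def seizure_level_scores(
    detections: List[float], seizures: List[Seizure]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (sensitivity, ppv) for one record; None when undefined."""
    # Detections that match at least one human-marked seizure.
    true_detections = sum(
        any(is_concordant(d, s) for s in seizures) for d in detections
    )
    # Human-marked seizures that received at least one detection.
    detected_seizures = sum(
        any(is_concordant(d, s) for d in detections) for s in seizures
    )
    sensitivity = detected_seizures / len(seizures) if seizures else None
    ppv = true_detections / len(detections) if detections else None
    return sensitivity, ppv


# Made-up example record (times in seconds from recording start).
seizures = [(100.0, 160.0), (500.0, 530.0)]
detections = [95.0, 300.0, 545.0]
sensitivity, ppv = seizure_level_scores(detections, seizures)
```

In this made-up record only the detection at 95 s is concordant: 300 s matches no seizure, and 545 s falls 15 s after the second seizure ends, outside the 10-second margin.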

Table 4 Performance of Persyst 12 vs Persyst 13 Automated Seizure Detection

Software version | Total, n | Sensitivity | Specificity | PPV | NPV
12 | 1,310 | 0.91 (0.80, 0.96) | 0.24 (0.21, 0.27) | 0.05 (0.04, 0.07) | 0.98 (0.95, 0.99)
13 | 6,605 | 0.91 (0.88, 0.93) | 0.27 (0.25, 0.28) | 0.09 (0.08, 0.10) | 0.97 (0.96, 0.98)

Abbreviations: NPV = negative predictive value; PPV = positive predictive value.
Values in parentheses are 95% CIs.

but with a subgroup indicator variable to enable comparisons. In this expanded cohort, the estimated sensitivity of automated detections was not significantly different in EEGs with low-voltage backgrounds than in those with normal voltage (p > 0.05), in contrast to the hypothesis generated by our pilot cohort. We then examined the difference between versions P12 and P13, both of which performed similarly (Table 4). In addition, we examined the difference between focal and generalized seizures. Among the 786 records that contained human-detected seizures, 653 records contained only focal-onset seizures, and 111 records contained only generalized-onset seizures. Sensitivity of automated detection was higher for focal-onset seizures than for generalized-onset seizures (0.93 vs 0.85, p = 0.012, Table 5).

Last, to further account for potential bias from repeated EEGs for some individuals in the cohort, we limited the analysis to only the first EEG recorded for each of 2,854 unique individuals. Results were similar to those of the entire expanded cohort, suggesting that repeated measures were not an important source of bias. This study provides Class II evidence that an automated seizure detection program cannot accurately identify EEG records that contain seizures.

Discussion
In this study, we measured the accuracy of Persyst for automated seizure detection in 24-hour critical care EEG records. At the level of individual seizures, we found modest performance of automated seizure detections, with fewer than half of seizures detected. At the level of EEG records, in a large cohort of 7,924 records, we found that the presence of automated seizure detections was a poor predictor of true seizures in the recording, while the absence of automated seizure detections was an excellent predictor that the record did not contain true seizures.

Automated seizure detection can play at least 2 different roles in clinical care. First, it could serve as a real-time seizure alarm, prompting immediate manual EEG review or treatment of the patient. Our data argue against the use of Persyst for this function in clinical practice. Relying on automated detections would have missed more than half of all true seizures in our pilot cohort. In addition, the false positives pose a challenge, as evidenced by the low PPVs (i.e., the likelihood that an automated detection is a true seizure) at the level of both individual seizures and entire records across our cohorts. An ideal seizure alarm will require a substantially higher sensitivity (i.e., miss very few true seizures) and lower false alarm rate than the values seen here.

A second potential role for automated seizure detection is to reduce the volume of EEG for manual review. For example, EEG records with a high probability of seizures could be reviewed earlier or more often, whereas other records could be reviewed less often or more briefly if the probability of seizures were sufficiently low. The relevant metric here is the NPV, i.e., the probability that the absence of automated detections reflects the absence of true seizures. We found high NPVs at the record level in both our pilot and expanded cohorts, up to 97%. This may be adequate to justify clinically useful triage of records with low probability of seizures based on lack of automated detections. It is important to note that our data support this kind of triaging only at the level of entire EEG records, not at the level of individual automated detections. That is, if no automated detections are present, one can be 97% confident that the record contains no seizures; whereas if automated detections are present, our data suggest that the entire record should be reviewed for accurate seizure detection, rather than limiting the review to only the individual detection events, which would be likely to miss true seizures.

Prior studies on the accuracy of Persyst for automated seizure detection have shown mixed findings. In the ICU setting, there has been exploration of the use of spectral array analysis,3,6,8,11 particularly in the pediatric population, but the

Table 5 Sensitivity of Automated Seizure Detection for Focal vs Generalized Seizures

Seizure onset | Records, n | Sensitivity (95% CI) | p Valuea
Focal | 653 | 0.93 (0.90, 0.95) | 0.012
Generalized | 111 | 0.85 (0.76, 0.91) |

a Wald test.
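
For orientation, the general form of a Wald comparison of two proportions can be sketched as below. This is an unadjusted two-sample version applied to the rounded sensitivities and record counts in Table 5. The article's p value comes from a GEE model with a subgroup indicator and robust variance, so this simplified calculation (function name hypothetical) is not expected to reproduce p = 0.012.

```python
import math
from typing import Tuple


def wald_two_proportions(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Unadjusted Wald z statistic and two-sided p value for p1 - p2.

    Treats the two groups as independent binomial samples; no clustering
    adjustment is applied.
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rounded values from Table 5: focal 0.93 (n = 653), generalized 0.85 (n = 111).
z, p_value = wald_two_proportions(0.93, 653, 0.85, 111)

if __name__ == "__main__":
    print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Both approaches point in the same direction (higher sensitivity for focal-onset seizures), but only the GEE-based value reported in Table 5 accounts for repeated records from the same patient.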

automated seizure detection tool has not been investigated.9 Most studies have focused on EMU and ambulatory EEG recordings.4,14-16 The most recent study of the Persyst seizure detection tool (P14) in EMU recordings showed noninferior seizure detection performance compared to human experts.14 These studies may not be applicable to ICU settings because EMU recordings are often higher fidelity (due to frequent electrode maintenance) and seizures are more likely to occur as discrete events against normal or mildly abnormal backgrounds. In contrast, patients in the ICU requiring cEEG typically have a grossly abnormal EEG background, with rhythmic or periodic abnormalities superimposed. Those background and interictal rhythms, including ictal-interictal continuum patterns, have been shown to confound automated algorithms, causing a decline in detection performance.17

Our study extends these prior findings in several ways. First, this is the largest study of the automated seizure detection accuracy on inpatient cEEGs outside the EMU, examining performance at both the individual seizure and study levels. Our findings indicate lower performance than has been reported by previous studies in different patient populations. This is important because critical care populations are a major source of continuous EEG recording and the need for automation-assisted interpretation is particularly acute. In subgroup analyses, we found slightly higher automated seizure detection sensitivity for focal rather than generalized seizures, which, to the best of our knowledge, has not been previously reported.11,16

The study has several limitations. First, Persyst version 14 (P14) was not available at the time of data analysis in this report. However, we did not find significant differences between versions P12 and P13, and neither of those versions achieved adequate performance for fully automated seizure detection. We intend to compare P14 performance to these results in the future. Second, we have not provided specificity or NPVs for seizure-level analyses because true negatives (no human-detected seizure, no Persyst-detected seizure) were not countable events. These were countable at the record level only. Third, the pilot cohort did not contain any EEG patterns classified as ictal-interictal continuum, and we cannot comment on the accuracy of Persyst in that context (although the expanded cohort did contain such EEG patterns, we did not analyze that subgroup specifically). Although the pilot cohort allowed assessment of individual seizures, the expanded cohort was too large for such analyses, particularly for correlation of individual automated markings with individual seizures in inpatient EEGs at large. In addition, the pilot cohort and expanded cohort reflected overlapping but not identical time periods. However, we cannot identify any obvious confounding factors related to when the EEG was performed. We also applied the Persyst seizure detector at default settings only; it is possible that adjusted detection parameters would lead to different performance results. In our expanded cohort, we relied on the original clinical interpretation of the EEG according to our reporting database. The clinicians reading the study would have had access to the Persyst detections at the time of their interpretation. In theory, this information could have improved identification of seizures in real time. It should be noted that Persyst offers more than seizure markings, and this study did not investigate the value of visualizing EEG time series as compressed spectral trends.

Automated seizure detection in critical care EEG is an important need, and the true value of Persyst lies in triaging low-risk EEGs. We found that its performance at detecting individual seizures was not sufficient for use as a seizure alarm in clinical practice, given a poor PPV. At the level of entire records, we found that the absence of automated seizure detections could be useful to triage studies with low probability of containing seizures. In our dataset, Persyst accurately triaged up to 42,000 hours of EEG recordings as seizure-free. On the basis of these findings, our institution plans to use automated seizure detection as a tool in our morning workflow and triage strategy but not as a seizure detector. Future studies should aim to improve the accuracy of automated seizure detection and to measure its accuracy in different patient populations. It is anticipated that future seizure detection algorithms will close the gaps identified here, but human efforts cannot be minimized at present.

Acknowledgment
B. Litt is supported by the following NIH grant from the National Institute of Neurological Disorders and Stroke: DP1NS122038.

Study Funding
No targeted funding reported.

Disclosure
T.M. Ganguly, C. Ellis, D. Tu, R.T. Shinohara, and K.A. Davis report no disclosures relevant to the manuscript. B. Litt has licensed intellectual property through the University of Pennsylvania in exchange for equity in the following companies: NeuroPace, MC10, and Blackfynn. He is a consultant for 4
and for accounting for automated detections during periods of Catalayzer, including Liminal Neurosciences, Tesseract, Hy-
excess artifact. These differences likely accounted for the perfine, Detect, and AI Therapeutics. None of these entities
difference in PPVs between our pilot seizure-level and ex- have sponsored this work and their value is not affected by this
panded record-level cohorts. Our study was also limited by research. J. Pathmanathan reports no disclosures relevant to the
the number of manually reviewed EEGs, and a larger pilot manuscript. Go to Neurology.org/N for full disclosures.
cohort sample size may lead to more accurate results. The
expanded cohort, although with a much larger sample size, Publication History
was obtained from a single center. Performing this study Received by Neurology April 11, 2021. Accepted in final form
across multiple sites may more comprehensively represent February 8, 2022.

Neurology.org/N Neurology | Volume 98, Number 22 | May 31, 2022 e2231


Copyright © 2022 American Academy of Neurology. Unauthorized reproduction of this article is prohibited.
Appendix Authors

Name | Location | Contribution
Taneeta Mindy Ganguly, MD | Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia | Drafting/revision of the manuscript for content, including medical writing for content; major role in the acquisition of data; study concept or design; analysis or interpretation of data
Colin A. Ellis, MD | Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia | Drafting/revision of the manuscript for content, including medical writing for content; analysis or interpretation of data
Danni Tu, MS | Department of Biostatistics, Epidemiology, & Informatics, University of Pennsylvania, Philadelphia; Penn Statistics in Imaging and Visualization Endeavor (PennSIVE) Center of Excellence, Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, Philadelphia, PA | Analysis or interpretation of data
Russell T. Shinohara, PhD | Department of Biostatistics, Epidemiology, & Informatics, Penn Statistics in Imaging and Visualization Endeavor (PennSIVE) Center of Excellence, Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, and Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia | Analysis or interpretation of data
Kathryn A. Davis, MD, MSCE | Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia | Major role in the acquisition of data
Brian Litt, MD | Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia | Major role in the acquisition of data
Jay Pathmanathan, MD, PhD | Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia | Drafting/revision of the manuscript for content, including medical writing for content; major role in the acquisition of data; study concept or design; analysis or interpretation of data

References
1. Herman ST, Abend NS, Bleck TP, et al. Consensus statement on continuous EEG in critically ill adults and children, part I: indications. J Clin Neurophysiol. 2015;32(2):87-95.
2. Hill CE, Blank LJ, Thibault D, et al. Continuous EEG is associated with favorable hospitalization outcomes for critically ill patients. Neurology. 2019;92(1):e9-e18.
3. Swisher CB, Sinha SR. Utilization of quantitative EEG trends for critical care continuous EEG monitoring: a survey of neurophysiologists. J Clin Neurophysiol. 2016;33(6):538-544.
4. Kamitaki BK, Yum A, Lee J, et al. Yield of conventional and automated seizure detection methods in the epilepsy monitoring unit. Seizure. 2019;69:290-295.
5. Herrera-Fortin T, Assi EB, Gagnon M-P, Nguyen DK. Seizure detection devices: a survey of needs and preferences of patients and caregivers. Epilepsy Behav. 2021;114(pt A):107607.
6. Sackellares JC, Shiau DS, Halford JJ, LaRoche SM, Kelly KM. Quantitative EEG analysis for automated detection of nonconvulsive seizures in intensive care units. Epilepsy Behav. 2011;22(suppl 1):S69-S73.
7. Persyst Corporation. Careers. Accessed December 23, 2020. persyst.com/about/careers/.
8. Dericioglu N, Yetim E, Bas DF, et al. Non-expert use of quantitative EEG displays for seizure identification in the adult neuro-intensive care unit. Epilepsy Res. 2015;109:48-56.
9. Din F, Lalgudi Ganesan S, Akiyama T, et al. Seizure detection algorithms in critically ill children: a comparative evaluation. Crit Care Med. 2020;48(4):545-552.
10. Zafar SF, Amorim E, Williamson CA, et al. A standardized nomenclature for spectrogram EEG patterns: inter-rater agreement and correspondence with common intensive care unit EEG patterns. Clin Neurophysiol. 2020;131(9):2298-2306.
11. Goenka A, Boro A, Yozawitz E. Comparative sensitivity of quantitative EEG (QEEG) spectrograms for detecting seizure subtypes. Seizure. 2018;55:70-75.
12. Joshi CN, Chapman KE, Bear JJ, Wilson SB, Walleigh DJ, Scheuer ML. Semi-automated spike detection software Persyst 13 is noninferior to human readers when calculating the spike-wave index in electrical status epilepticus in sleep. J Clin Neurophysiol. 2018;35(5):370-374.
13. Scheuer ML, Bagic A, Wilson SB. Spike detection: inter-reader agreement and a statistical Turing test on a large data set. Clin Neurophysiol. 2017;128(1):243-250.
14. Scheuer ML, Wilson SB, Antony A, Ghearing G, Urban A, Bagić AI. Seizure detection: interreader agreement and detection algorithm assessments using a large dataset. J Clin Neurophysiol. 2021;38(5):439-447.
15. Fürbass F, Ossenblok P, Hartmann M, et al. Prospective multi-center study of an automatic online seizure detection system for epilepsy monitoring units. Clin Neurophysiol. 2015;126(6):1124-1131.
16. González Otárula KA, Mikhaeil-Demo Y, Bachman EM, Balaguera P, Schuele S. Automated seizure detection accuracy for ambulatory EEG recordings. Neurology. 2019;92(14):e1540-e1546.
17. Zorlu M, Chuang D, Kettani H, Zarnegar R. Sensitivity of Persyst seizure detection for different electrographic seizure patterns in patients with status epilepticus. Clin Neurophysiol. 2018;129(suppl 1):e98.
18. Laccheo I, Sonmezturk H, Bhatt AB, et al. Non-convulsive status epilepticus and non-convulsive seizures in neurological ICU patients. Neurocrit Care. 2015;22(2):202-211.
19. Claassen J, Mayer SA, Kowalski RG, Emerson RG, Hirsch LJ. Detection of electrographic seizures with continuous EEG monitoring in critically ill patients. Neurology. 2004;62(10):1743-1748.
20. Westover MB, Shafi MM, Bianchi MT, et al. The probability of seizures during EEG monitoring in critically ill adults. Clin Neurophysiol. 2015;126(3):463-471.
21. Chong DJ, Hirsch LJ. Which EEG patterns warrant treatment in the critically ill? Reviewing the evidence for treatment of periodic epileptiform discharges and related patterns. J Clin Neurophysiol. 2005;22:79-91.
22. Genders TS, Spronk S, Stijnen T, Steyerberg EW, Lesaffre E, Hunink MG. Methods for calculating sensitivity and specificity of clustered data: a tutorial. Radiology. 2012;265(3):910-916.
23. Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. J Stat Softw. 2005;15:1-11.
24. Nunes MS, Heuer C, Marshall J, et al. epiR: tools for the analysis of epidemiological data. 2021. Accessed July 21, 2021. cran.r-project.org/package=epiR.

