
Multi-Trajectory Modeling to Predict Acute Kidney Injury in Chronic Kidney Disease Patients

Philipp Burckhardt, MSc2, Daniel Nagin, PhD1, Vijaya Priya Rama Vijayasarathy, MS1,
Rema Padman, PhD1
1The H. John Heinz III College of Information Systems and Public Policy, 2Department of Statistics & Data Science,
Carnegie Mellon University, Pittsburgh, PA, USA

Abstract
Risk-stratifying chronic disease patients in real time has the potential to facilitate targeted interventions and improve
disease management and outcomes. We apply group-based multi-trajectory modeling to risk stratify patients with
chronic kidney disease (CKD) and its major complications into distinct trajectories of disease development and
predict acute kidney injury (AKI), a serious, under-diagnosed outcome of CKD that is both preventable and treatable
with early detection. Utilizing Electronic Health Record data of 1,947 patients, we identify eight risk groups with
distinct trajectories and profiles. We observe that a higher estimated probability of AKI generally coincides with a
higher risk group. Overall, at least 75% of patients stabilize into their final groups within less than two years from
diagnosis of CKD Stage 3. Model calibration confirms that the estimated outcome probabilities are highly correlated
with AKI incidence, providing group-specific and individual level predictions to improve clinical management of AKI
in CKD patients.
Introduction
Illness trajectories have long been considered an important component of palliative care, such as in the management
of kidney failure1. More recently, there has been increasing focus on trajectories of disease progression for common
chronic conditions2-6. Tracking the progression of a chronic condition from its early stages, including the multiple
associated comorbidities and complications that develop contemporaneously, and stratifying patients into distinct risk
groups, has the potential to facilitate targeted interventions2,3,6. Furthermore, making predictions in real time for
individual patients about critical disease outcomes based on these trajectories and estimating confidence intervals
around the predictions may provide clinicians with the capability to anticipate the future trajectory of disease
progression, thereby improving timely and appropriate clinical decision-making7, 8. In this paper, we demonstrate the
application of group-based multi-trajectory modeling to risk stratify patients with chronic kidney disease (CKD) and
its significant complications into distinct trajectories of disease development and predict acute kidney injury (AKI), a
serious, under-diagnosed outcome of CKD that is both preventable and treatable with early detection9-12.

CKD is a costly, complex, and high-mortality health condition affecting 26 million adults in the US, with another 73
million at increased risk of the disease9. It leads to a progressive deterioration of kidney function over five stages.
Prevalence is estimated to be 8-16% worldwide13. CKD patients make up only 1.5% of the Medicare population but cost
$30 billion annually, almost 10% of Medicare costs, due to the high incidence of co-morbidities and complications14,
particularly cardiovascular disease and AKI, which is marked by a sudden decline in kidney function10-12. Even small
acute changes in kidney function can result in high morbidity and mortality, and the incidence of AKI and associated
hospitalizations increases with CKD progression10. These adverse outcomes are potentially preventable or can be mitigated
through early identification and treatment of individuals at risk as they progress through the five stages10-12. Early
recognition and management of AKI are thus critical to delaying CKD progression and the development of complications.
Using clinical and demographic characteristics of CKD patients, longitudinal data on laboratory markers of CKD as
well as its complications, and AKI diagnosis as the outcome measure, we identify distinctive groups of individual
trajectories within the patient population and provide trajectory-group-specific estimates and individual-level
predictions of the probability of AKI.

Data
Our data set comes from a leading nephrology practice in southwestern Pennsylvania. We extract patient data from
the Electronic Health Record for the years 2009 to 2013, with access to all laboratory measurements for this period.
Specifically, we include patients diagnosed with CKD Stage III between January 1, 2009 and November 19, 2012,
since patients are normally referred to a nephrologist when diagnosed to be in this stage. Patients who received a
kidney transplant on or after January 1, 2009 were removed from the analysis since their level of kidney function
differs sharply from those who did not receive a transplant (keeping them would likely distort the identified disease
progression). The population is split almost evenly between female and male patients. Most patients are retirees, with
a median age of 70 years, and 94% of the patients are white. About half the patients remain at CKD Stage III,
while the remaining patients progress to more advanced stages of CKD over the study period.
In prior research, we used the current data set for trajectory modeling of CKD and related complications without
consideration of an outcome variable2,3. We considered some of the main complications of CKD, including Anemia15,
Secondary Hyperparathyroidism16, Hyperphosphatemia17, and Metabolic Acidosis18. Our data set allows us to track
these conditions via corresponding laboratory measurements. For CKD, we use the estimated glomerular filtration
rate (eGFR), derived using serum creatinine laboratory measurements, and calculated by the CKD-EPI equation
according to best practice19. The respective laboratory measurements of Hemoglobin (HGB), parathyroid hormone
(PTH), phosphate (PO4), and carbon dioxide in bicarbonate form (HCO3) are used as markers for the considered
complications.
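To make the eGFR derivation concrete, the following is a minimal sketch of the 2009 CKD-EPI creatinine equation. It is an illustration only, not the code used for this study: the function name, argument names, and the assumption that serum creatinine is given in mg/dL are ours, and we assume the 2009 form of the equation, which includes a race coefficient.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female, black):
    """Estimated GFR in mL/min/1.73 m^2 (2009 CKD-EPI creatinine equation).

    Assumes serum creatinine in mg/dL; names are illustrative only.
    """
    kappa = 0.7 if female else 0.9            # sex-specific creatinine threshold
    exponent = -0.329 if female else -0.411   # exponent applied below the threshold
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** exponent
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr

# Hypothetical patient: 50-year-old non-black male with creatinine 0.9 mg/dL.
egfr_example = ckd_epi_2009(0.9, 50, female=False, black=False)
```

Higher creatinine or higher age lowers the estimate, which is why a declining eGFR trajectory indicates deteriorating kidney function.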
In the cohort, 1,367 patients develop Anemia, while 1,476 of them are diagnosed with Secondary
Hyperparathyroidism. Acidosis is present in 438 patients, but only 100 suffer from Hyperphosphatemia. After data cleaning
and processing, 1,944 patients diagnosed with CKD Stage III on or after January 1, 2009 form the cohort for our
analysis.
As epidemiologic evidence has mounted, it is now recognized that acute kidney injury (AKI) and chronic kidney
disease (CKD) are not distinct phenomena, but closely interwoven1. As pointed out by Chawla et al., it has been
discovered that the risk factor most associated with AKI is a pre-existing CKD diagnosis, which increases risk by as
much as ten times11. We thus augment our previous analysis by including AKI as an outcome variable in our trajectory
model.
Over the study period, 39.65% of patients suffer from at least one instance of acute kidney injury (AKI), extracted
from the EHR using ICD9 codes 585.5, 6, 8 and 9, and ICD10 codes N17.0, 1, 2, 8, and 9. The median number of
diagnoses per patient is equal to four. However, most of these diagnoses are made in rapid succession after a patient
has been hospitalized, such that we only keep track of whether a patient has at least one recorded instance of AKI over
the entire study period. Nine percent of the patients had a prior history of AKI before the year 2009. Approximately
38% of these patients encounter another episode of AKI in the study period, which is not significantly different from
the overall AKI rate.
Methods
Single Trajectory Model
Let $Y_{ij}$ denote the multivariate response variable of individual $i$ for the $j$-th marker, for example eGFR. Then
$Y_{ij} = (Y_{ij1}, \ldots, Y_{ijT})$ is a vector of length $T$, which holds the quarterly lab results for the marker in question; $T$
denotes the total number of time periods. For a single response, the group-based trajectory model posited by Nagin20
assumes the following density for a sequence of longitudinal measurements $y_{ij} = (y_{ij1}, \ldots, y_{ijT})$:

$$f(y_{ij}) = \sum_{k=1}^{K} p_k \, f_{Y_{ij}}(y_{ij} \mid C = k), \tag{1.1}$$

where $K$ denotes the total number of groups, $p_k$ denotes the probability of belonging to group $k$, and
$f_{Y_{ij}}(y_{ij} \mid C = k)$ is the conditional density of the observed data vector given class $k$. The probabilities $p_k$ of this
mixture model are not estimated directly, but related via the SoftMax function to a $K$-dimensional vector $\theta$ of class
coefficients, and time-stable covariates $x_i$ with associated weight vectors $w_k$. Through the time-stable covariates, the
model permits the group memberships to vary by individual:

$$p_{ik} = \frac{e^{\theta_k + x_i^T w_k}}{\sum_{l=1}^{K} e^{\theta_l + x_i^T w_l}}. \tag{1.2}$$

The response vector for each outcome is modelled as a multivariate normal random variable

$$Y_j \mid C = k \sim N(\mu_{jk}, \sigma_j^2 I), \tag{1.3}$$

where the elements of the mean vector are related to the period $t$ (time in quarters since diagnosis of CKD Stage III)
of the individual patient as follows:

$$\mu_{jkt} = \beta_{jk0} + \beta_{jk1} t + \beta_{jk2} t^2 + \beta_{jk3} t^3. \tag{1.4}$$
As can be seen, our group-based trajectory model assumes that the trajectories in each group have a simple polynomial
form. From our experience, a polynomial order above three is rarely necessary, which is why we have constrained the
model to have cubic terms at most.
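As an illustration of Equations 1.2 and 1.4, the sketch below computes SoftMax group membership probabilities and a cubic trajectory mean. It is not the Stata implementation used in this paper; the function names and all parameter values are hypothetical and chosen only to show the mechanics.

```python
import math

def group_membership_probs(theta, weights, x):
    """Equation 1.2: SoftMax over theta_k + x^T w_k for each group k."""
    scores = [t + sum(xi * wi for xi, wi in zip(x, w))
              for t, w in zip(theta, weights)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def trajectory_mean(beta, t):
    """Equation 1.4: cubic polynomial in t (quarters since Stage III diagnosis)."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * t + b2 * t ** 2 + b3 * t ** 3

# Hypothetical parameters: three groups, two time-stable covariates.
theta = [0.0, 0.5, -0.5]
weights = [[0.0, 0.0], [0.2, -0.1], [-0.3, 0.4]]
x = [1.0, 0.0]
probs = group_membership_probs(theta, weights, x)

# Hypothetical eGFR trajectory coefficients, evaluated at quarter 4.
mean_at_4 = trajectory_mean([45.0, -1.0, 0.05, 0.0], 4)
```

Because the covariates enter only through the SoftMax, they shift how likely each group is for a patient without changing the shape of the group trajectories themselves.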

Multi-Trajectory Model

Multi-trajectory modeling is an extension of the single trajectory model that jointly models the trajectories of multiple
outcomes21. In this model extension, the density for J outcomes becomes

$$f(y_i) = f(y_{ij},\ j = 1, \ldots, J) = \sum_{k=1}^{K} p_{ik} \prod_{j=1}^{J} f_{Y_{ij}}(y_{ij} \mid C = k). \tag{1.5}$$

Model fitting and inference are carried out via the traj procedure from the Stata package of the same name22, which
implements a Newton-Raphson optimization algorithm for maximum likelihood estimation. Following the suggestion
given by Jones et al., the Bayesian information criterion (BIC) is used to perform model selection and to determine
the number of groups K22.
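The BIC-based choice of the number of groups can be sketched as follows. This assumes the common form of the criterion, in which lower values are better; the Stata plugin may report a differently scaled variant. The log-likelihoods and parameter counts below are hypothetical and are not results from this study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Common BIC form: -2 log L + (number of parameters) * ln(n); lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical fits: K -> (maximized log-likelihood, number of free parameters).
candidate_fits = {2: (-5200.0, 15), 3: (-5100.0, 23), 4: (-5090.0, 31)}
n_obs = 1947

scores = {k: bic(ll, p, n_obs) for k, (ll, p) in candidate_fits.items()}
best_k = min(scores, key=scores.get)
```

In this toy example the four-group model improves the likelihood only slightly over three groups, so the penalty for its extra parameters outweighs the gain.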
Using the Multi-Trajectory Model to Predict the Probability of Acute Kidney Injury
We apply a recent extension of group-based trajectory modeling8, which estimates the joint distribution of the
trajectories and an outcome of interest. The extended model provides trajectory-group specific estimates of the
probability of the outcome, which is acute kidney injury (AKI). We denote these estimates by $\hat{\alpha}_k$. Combining the
$\hat{\alpha}_k$ estimates with the posterior probability of group membership (PPGM), we calculate individual level predictions
for the probability of AKI at each time point as follows:

$$\hat{\alpha}_{it} = \sum_{k=1}^{K} \mathrm{PPGM}_{itk} \, \hat{\alpha}_k \tag{1.6}$$

The posterior probability of group membership in group $l$ can be computed using Bayes' rule as

$$\mathrm{PPGM}_{itl} = \Pr\left(C = l \mid Y_{ij}^{t} = y_{ij}^{t},\ j = 1, \ldots, J\right) = \frac{p_{il} \prod_{j=1}^{J} f_{Y_{ij}}(y_{ij}^{t} \mid C = l)}{\sum_{k=1}^{K} p_{ik} \prod_{j=1}^{J} f_{Y_{ij}}(y_{ij}^{t} \mid C = k)}, \tag{1.7}$$

where the number of biomarkers $J$ is equal to five for the purposes of this paper. For the conditional densities, it
follows from Equation 1.3 that

$$f_{Y_{ij} \mid C = k}(y_{ij}) = \prod_{t=1}^{T} \frac{1}{\sigma_j} \, \phi\!\left(\frac{y_{ijt} - \mu_{jkt}}{\sigma_j}\right), \tag{1.8}$$

in which $\phi$ is the density function of the standard normal distribution. For the data used for model fitting, $T = 18$
(measured in quarters). Notice that we can calculate the PPGMs without access to eighteen quarters of data per patient
by conditioning only on the measurements $y_{ijt}$ that have been observed so far after the initial CKD diagnosis. This
way, we can calculate posterior probabilities that take all currently available information into account and may serve
as a prognostic tool for clinicians to detect high-risk patients early on and not when it may already be too late.
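The updating scheme described above can be sketched in a few lines. The code below is an illustration under simplifying assumptions, not the authors' implementation: the function names are ours, all parameter values are hypothetical, and unobserved measurements are marked with `None` and simply skipped, so that the posterior conditions only on the data available so far.

```python
import math

def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def ppgm(prior, means, sigmas, observed):
    """Posterior group probabilities (Equation 1.7) from the data seen so far.

    prior[k]       : prior membership probability of group k
    means[k][j][t] : group mean of marker j at period t
    sigmas[j]      : marker-specific standard deviation
    observed[j][t] : measurement of marker j at period t, or None if unavailable
    """
    log_post = []
    for k, p in enumerate(prior):
        lp = math.log(p)
        for j, series in enumerate(observed):
            for t, y in enumerate(series):
                if y is not None:  # condition only on available measurements
                    lp += normal_logpdf(y, means[k][j][t], sigmas[j])
        log_post.append(lp)
    top = max(log_post)  # normalize on the log scale for stability
    w = [math.exp(lp - top) for lp in log_post]
    total = sum(w)
    return [v / total for v in w]

def aki_probability(post, group_aki_probs):
    """Equation 1.6: PPGM-weighted average of the group-level estimates."""
    return sum(p * a for p, a in zip(post, group_aki_probs))

# Hypothetical setup: two groups, one marker, three periods, third not yet observed.
prior = [0.5, 0.5]
means = [[[30.0, 28.0, 26.0]], [[50.0, 50.0, 50.0]]]
sigmas = [5.0]
observed = [[31.0, 27.0, None]]
post = ppgm(prior, means, sigmas, observed)
risk = aki_probability(post, [0.7, 0.2])
```

With no observations at all the posterior equals the prior, and each additional measurement sharpens the assignment, which is the behavior the prognostic use of the model relies on.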
Results
We fit a multi-trajectory model with eight groups. For comparison purposes, we chose the same number of groups as
in our previous analysis in which we used the Bayesian information criterion (BIC) as a model selection criterion to
pick the number of groups as well as the order of the polynomials. The updated model incorporates a binary indicator
for AKI occurrence as an outcome variable. The eight-group model with all five biomarkers as well as the AKI
outcome variable, but without any other covariates, forms our baseline model. The estimated trajectories for the
biomarkers of CKD and related complications are displayed in Figure 1. The results do not differ significantly from
the ones obtained previously3. The groups are roughly the same in size, with the most extreme groups, one and eight,
having slightly lower relative frequencies than the others. Table 1 reports the AKI probability estimates associated
with each trajectory. The results show clear differences in the group-level estimates $\hat{\alpha}_k$ across trajectory groups: the
point estimates range from 0.168 for group 8 to 0.717 for group 1, with 95% confidence bounds spanning 0.12 to 0.78.
While all patients in the cohort have a diagnosis of CKD Stage III and thus suffer from kidney damage, we see that
some of the identified groups are characterized by trajectories that show almost no change in eGFR values (groups 5-8).
On the other hand, groups one to four show a clear deterioration in kidney function after the initial diagnosis. We
ordered the groups decreasingly according to the estimated group-level probability of AKI ($\hat{\alpha}_k$). As can be seen, a
higher estimated probability of AKI generally coincides with a worse eGFR: apart from group seven, eGFR
trajectories improve monotonically from left to right in Figure 1. Patients assigned to group seven tend to
have very low phosphate, indicating that they may have developed severe hypophosphatemia. However, low
phosphate is also associated with AKI, which may explain the larger AKI rate of group seven compared to group six.

Figure 2 shows for each group how the individual-level AKI probabilities develop over time via the quantiles of the
estimates. We can see that as time passes after the initial CKD diagnosis, the individual estimates in any group
converge towards the group-level estimates $\hat{\alpha}_k$, indicating that individuals are assigned to a single group with a
higher and higher probability. The probability estimates vary considerably across groups, confirming the sentiment
that AKI and CKD should not be looked at in isolation. In the next section, we formally investigate the performance
of the estimated model.

Figure 1. Fitted trajectories of the baseline eight-group multi-trajectory model for the five considered biomarkers,
ordered decreasingly by the estimated probability of AKI ($\hat{\alpha}_k$). Group size proportions are displayed next to the group
label inside the parentheses.

           Group 1  Group 2  Group 3  Group 4  Group 5  Group 6  Group 7  Group 8
Average    0.717    0.531    0.513    0.449    0.353    0.299    0.227    0.168
Lower      0.647    0.465    0.453    0.393    0.292    0.245    0.184    0.124
Upper      0.777    0.596    0.571    0.507    0.419    0.360    0.276    0.224

Table 1. Estimated group-level AKI probabilities $\hat{\alpha}_k$ alongside 95% Wilson confidence intervals.
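For reference, a Wilson score interval of the kind reported in Table 1 can be computed as in the sketch below. The event count and group size in the example are hypothetical, since per-group counts are not listed in the text, and the function name is ours.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p = events / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# Hypothetical group: 50 patients with AKI out of 100.
lower, upper = wilson_interval(50, 100)
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and is not centered on the raw proportion unless that proportion is 0.5.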

Figure 2. For each of the eight groups, the median $\hat{\alpha}_{it}$ of all individuals in the respective group is displayed as the
green line, with a confidence band of the group-specific AKI probabilities $\hat{\alpha}_k$ overlaid in gray (the point estimates
and lower and upper bounds for the final period are displayed in Table 1). Also displayed are lines for the 10th and
90th percentile of $\hat{\alpha}_{it}$ (in blue and red, respectively).

Model Evaluation
We used a five-fold cross validation scheme to obtain unbiased estimates of model performance. Randomly splitting
the data set into five equally sized folds, we estimated the baseline model in each case, using all data aside from the
fold in question for model training. The estimated model coefficients were then used to calculate the PPGMs of every
individual in the held-out fold and subsequently their outcome probabilities $\hat{\alpha}_{it}$ at the various time periods. In this
scheme, the data of every individual is used once to calculate individual-level predictions, without his or her data
having been used to train the model from which the predictions were obtained.
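The fold construction in this scheme can be sketched as follows. The splitting helper and the seed are our own illustrative choices, and the model fitting step itself is omitted, since it was carried out in Stata.

```python
import random

def k_fold_indices(n_patients, k=5, seed=0):
    """Randomly partition patient indices 0..n_patients-1 into k near-equal folds."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    return [indices[f::k] for f in range(k)]

folds = k_fold_indices(1947, k=5)
splits = []
for held_out, test_idx in enumerate(folds):
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A model would be fitted on train_idx here and then used to compute PPGMs
    # and outcome probabilities for the patients in test_idx (not shown).
    splits.append((train_idx, test_idx))
```

Splitting at the patient level, rather than at the level of individual lab measurements, is what ensures that no patient's data contributes to the model that scores them.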
Besides the baseline model, which fits trajectories for all five biomarkers jointly with the outcome, we have also
estimated a model that includes time-stable covariates, namely demographic variables and indicators for the existence
of diabetes and hypertension. Since a patient may have AKI more than once over their lifetime, we have further estimated
a model with an additional risk factor: a binary indicator for an occurrence of AKI prior to the study period. This
coefficient turned out to be non-significant for all groups, which is why we report in this section the results of the
baseline model.
We have performed two sanity checks to ensure that the model is well calibrated. First, we confirm that the estimated
outcome probabilities are highly correlated with the incidence of AKI by regressing AKI occurrence for each patient
on the outcome probabilities $\hat{\alpha}_{it}$ for varying times. The results are reported in Table 2. If the probabilities are well
calibrated, the resulting regression should have an intercept value close to 0 and a slope close to 1, thus forming a
45-degree line through the first quadrant. This is indeed what we find for T=6, 12, and 18. All intercept estimates are near
0 and the slope estimates are very close to 1, particularly for T=12 and 18.
Table 2. Coefficients for regressions of AKI occurrence on estimated outcome probabilities at time 6, 12, and 18.

                                  Outcome
                       (1)          (2)          (3)
T=6                    0.912***
                       (0.070)
T=12                                0.963***
                                    (0.069)
T=18                                             0.995***
                                                 (0.068)
Constant               0.037        0.016        0.002
                       (0.030)      (0.029)      (0.029)
Observations           1,947        1,947        1,947
Residual Std. Error    0.469        0.467        0.465
(df = 1945)
Notes: *** Significant at the 1 percent level.
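The calibration regression behind Table 2 amounts to a simple least-squares fit of the binary AKI indicator on the predicted probability. The sketch below shows the idea on a small constructed data set that is perfectly calibrated by design; the helper name and the data are hypothetical and unrelated to the study cohort.

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Constructed example: among patients predicted at 0.2, 0.5, and 0.8,
# exactly 20%, 50%, and 80% experience the outcome.
pred = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
outcome = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
intercept, slope = ols_line(pred, outcome)
```

For such perfectly calibrated predictions the fitted line has intercept 0 and slope 1, which is the benchmark the estimates in Table 2 are compared against.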

Second, we show that, at various fixed points in time, the average of the predicted outcome probabilities, $\hat{\alpha}_{it}$, closely
corresponds to the actual incidence rates of individuals with $\hat{\alpha}_{it}$ inside one of several initially chosen bins of $\hat{\alpha}_{it}$. For
this check, we binned the predicted outcome probabilities into intervals with a width of 0.1, which resulted in six $\hat{\alpha}_{it}$
bins overall. The results of this test are reported in Table 3.
The endpoints of the six bins are displayed in the LL and UL columns of the table. Inspection of the table shows a
very close correspondence between the binned averages of $\hat{\alpha}_{it}$ and incidence of AKI. The predicted probabilities line
up well with the actual incidence rates in each bin, even when only the first six quarters are considered. At a
significance level of 5%, we do not observe a significant difference.

Table 3. Displayed are the observed rates of AKI in bins of the estimated outcome probabilities $\hat{\alpha}_{it}$ in the y-column.
The alpha column shows the average of the $\hat{\alpha}_{it}$ estimates in the respective bin at the given time, with N being the
number of patients assigned to the bin.
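The binned comparison summarized in Table 3 can be sketched as follows. The output keys mirror the column names described in the caption (LL, UL, N, alpha, y); the function itself and the example data are hypothetical. Bins that contain no patients are omitted from the output.

```python
def calibration_bins(pred, outcome, n_bins=10):
    """Group patients into equal-width probability bins and summarize each bin."""
    grouped = {}
    for p, y in zip(pred, outcome):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the top bin
        grouped.setdefault(b, []).append((p, y))
    rows = []
    for b in sorted(grouped):
        members = grouped[b]
        rows.append({
            "LL": b / n_bins,                                    # lower bin limit
            "UL": (b + 1) / n_bins,                              # upper bin limit
            "N": len(members),                                   # patients in the bin
            "alpha": sum(p for p, _ in members) / len(members),  # mean prediction
            "y": sum(y for _, y in members) / len(members),      # observed AKI rate
        })
    return rows

# Hypothetical predictions and outcomes for six patients.
pred = [0.12, 0.18, 0.52, 0.55, 0.58, 0.91]
outcome = [0, 0, 1, 0, 1, 1]
table = calibration_bins(pred, outcome)
```

Good calibration shows up as alpha and y being close to each other within every bin.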

Table 4 is intended to examine whether the outcome probabilities $\hat{\alpha}_{it}$ have prognostic value. Because an instance of
AKI can occur anytime over the course of the 18-quarter observation window, the tabulations of the AKI incidents
are limited to those occurring after a specified value of t. The first panel is for t=0 and thus includes all instances of
AKI. Here again we see that the AKI probability estimates are well calibrated. The next two panels are for t=6 and
t=12. Observed AKI rates (y) in each bin are now lower than the average of the estimated probabilities in that bin
(alpha) because incidents of AKI prior to t are excluded from the calculation of y. Note, however, that a strong
monotonic relationship between y and alpha still exists, which indicates that larger estimates of $\hat{\alpha}_{it}$ imply higher future
risk of AKI.

Table 4. Compared to Table 3, the observed rates of AKI in this table are calculated by only taking those patients into
account who have an instance of AKI after the considered time epoch.

We further investigate the observation that the predicted probabilities line up with the actual incidence rates. Figure 3
shows the time taken until trajectory group assignment stabilizes for a patient, where we define PPGMs as stabilized
once the highest probability class starts to match that of the final probabilities at the study horizon of 18 quarters. As
we can see from the graph, patients are assigned to one of the risk groups early on, with almost 50% of patients being
assigned to their final group already by the third quarter and 76.2% by the seventh quarter (less than two years), as
indicated by the overlaid dashed line. Looking at the individual bars, we see that most patients stabilize in periods
one, two, and three, though groups differ quite a bit in how soon patients are finally assigned to them, with group five
being exceptional insofar as many patients are assigned to it already in the first period.

Figure 3. The overlaid line shows the cumulative percentage of patients for whom the PPGM stabilizes at the given
time, which is displayed on the x-axis. Patients' trajectories are said to be stabilized when their highest probability
class starts to match that of the final probabilities after 18 quarters, which is the study horizon.
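One way to operationalize this stabilization criterion is sketched below. We assume here that "starts to match" means the earliest period from which the most probable group equals the final most probable group and stays equal through the study horizon; this reading, the function name, and the example PPGMs are our assumptions for illustration.

```python
def stabilization_period(ppgm_by_period):
    """Earliest 1-based period from which the most probable group never changes.

    ppgm_by_period[t] holds the group probabilities computed from data up to
    period t + 1; the last entry corresponds to the study horizon.
    """
    map_groups = [max(range(len(p)), key=lambda k: p[k]) for p in ppgm_by_period]
    final_group = map_groups[-1]
    t = len(map_groups)
    # Walk backwards while the most probable group still equals the final one.
    while t > 0 and map_groups[t - 1] == final_group:
        t -= 1
    return t + 1

# Hypothetical patient who starts in the second group and settles in the first.
patient = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.8, 0.15, 0.05], [0.9, 0.08, 0.02]]
period = stabilization_period(patient)
```

Applying such a function to every patient and tabulating the results by period would yield the kind of cumulative curve shown in Figure 3.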
Table 5. Displayed are demographic variables and biomarker averages (over all periods) of the detected risk groups.

Group Profile
Group   Age            Gender     Race      BMI            Hypertension   Diabetes
        Mean   SD      % Female   % Black   Mean    SD     Yes            Yes
1       69.1   12.4    51%        8%        31.68   7.76   99%            61%
2       76.7   9.1     65%        6%        31.72   8.12   99%            52%
3       74.2   9.6     65%        3%        31.01   7.50   99%            52%
4       72.1   10.7    42%        5%        30.12   6.46   97%            43%
5       70.0   10.2    57%        1%        31.42   6.57   98%            60%
6       71.7   10.3    54%        11%       31.44   7.15   97%            41%
7       70.9   10.1    22%        1%        31.23   6.33   96%            41%
8       65.6   11.2    24%        7%        31.72   5.99   91%            40%
Total   71.5   10.8    46%        5%        31.26   6.98   97%            48%

Lab Markers
Group   EGFR            PTH              HGB            HCO3           PO4
        Mean    SD      Mean     SD      Mean    SD     Mean    SD     Mean   SD
1       23.83   5.39    145.68   77.52   10.95   0.86   23.55   2.09   4.08   0.49
2       32.60   5.67    98.07    39.11   11.68   0.86   28.78   2.27   3.58   0.45
3       32.98   5.77    59.06    20.03   10.79   0.61   24.48   1.94   3.76   0.43
4       32.43   5.16    84.22    33.25   12.86   0.75   23.73   1.74   3.56   0.42
5       38.23   6.05    28.69    9.54    12.47   0.87   26.21   2.33   3.70   0.42
6       48.39   5.08    67.12    22.60   12.17   0.74   26.71   2.26   3.47   0.40
7       39.29   5.00    63.33    23.01   14.33   1.03   27.31   2.21   3.23   0.37
8       55.79   9.12    45.10    19.78   14.39   1.06   26.62   2.47   3.21   0.40
Total   38.06   10.65   72.90    46.03   12.60   1.54   26.07   2.78   3.54   0.49

Table 5 shows the group profiles of the eight detected risk groups in terms of their demographic variables and average
biomarker values (calculated over all time periods). As can be seen, the groups differ considerably with regard to the
percentage of African-Americans and whether their patients have diabetes, with the average biomarker values
reflecting the relationships displayed in Figure 1.
We see the usefulness of our model not primarily in purely predictive purposes, but rather as an analytical tool that
may aid the work of clinicians in screening at-risk patients. Hence, we eschew some of the more traditional evaluation
metrics for predictive models. However, while a black-box model with careful feature engineering and a broader set
of predictors might return better predictions, it would not have the easy interpretability of the trajectory model, which
manages to condense a variety of information into a single group variable. And in fact, predictions obtained by the
model are reasonable when you consider that only a single variable (the group indicator) is used to predict whether a
patient experiences AKI. When looking at the Receiver Operating Characteristic (ROC), the area under the curve
(AUROC) stands at approximately 0.7 on the held-out data. As it stands, we believe the model to be good enough to
discriminate between patients and thus to serve as a useful tool for medical practitioners. Using it, they may obtain a
quick overview of a patient’s risk of AKI and how that risk is connected to the various biomarker values. At t=12 and
t=18, AUROC values are slightly better than those of a logistic regression fitted using the lab values available at the
given time, whereas at t=6 the logistic regression is marginally ahead (Table 6). Additional markers, particularly for
proteinuria, will likely improve the performance of our model.
Table 6. Areas under the curve of the Receiver Operating Characteristic (ROC) for both our multi-trajectory model
and for comparison purposes a logistic regression at various time points. The logistic regression was estimated using
the marker values available at the given point in time.
Model \ Time              t=6      t=12     t=18
Logistic Regression       0.677    0.670    0.672
Multi-Trajectory Model    0.672    0.686    0.692
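For readers who want to reproduce this kind of comparison, the AUROC can be computed directly from its probabilistic definition: the chance that a randomly chosen patient with AKI receives a higher predicted probability than a randomly chosen patient without AKI, with ties counted as one half. The sketch below uses hypothetical scores and labels, and the function name is ours.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted AKI probabilities and observed outcomes.
scores = [0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0]
auc = auroc(scores, labels)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based formula gives the same value more efficiently.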

Figure 4 highlights the potential utility of our approach for medical practitioners based on two example patients that
were chosen for illustrative purposes. For both patients, the figure displays both the development of the PPGMs as
well as their AKI predictions. While patient 186 starts out being assigned to group four, after a few periods he is
placed with high probability in high-risk group one, which coincides with a high prediction for AKI. By contrast,
patient 24 emerges as a low risk patient due to the progression of his biomarker values. In this example, neither of the
patients ends up experiencing AKI over the course of the study period. It would be up to a physician to conclude
whether patient 186 would warrant more scrutiny going forward given his high-risk status as determined by the model.

Figure 4. For two patients, posterior probabilities of group membership (PPGM) are displayed on the left-hand side,
alongside 95% confidence intervals. On the right-hand side, the estimated individual-level AKI probabilities are
displayed, again with error bars representing a 95% confidence interval.

Conclusion and Discussion
Building upon our prior work on modeling disease progression of Chronic Kidney Disease (CKD) and related
complications via the use of group-based multi-trajectory modeling, we extend the previous analysis by incorporating
the occurrence of Acute Kidney Injury (AKI) as an outcome variable into the trajectory model. Current research
suggests that CKD and AKI are not as distinct as once assumed, but rather interrelated, with CKD now not only being
regarded as one of the main risk factors for AKI, but also a potential long-term consequence of an episode of AKI.
Since AKI is an under-diagnosed event that is both preventable and treatable with early detection, there is a clear need
for prognostic models that may equip clinicians with tools to anticipate disease progression and to inform timely and
appropriate clinical decision-making. Using the estimated glomerular filtration rate (eGFR) as a biomarker for CKD
and Hemoglobin (HGB), parathyroid hormone (PTH), phosphate (PO4), and carbon dioxide in bicarbonate form
(HCO3) as markers for some of the main complications of CKD (Anemia, Secondary Hyperparathyroidism,
Hyperphosphatemia, and Metabolic Acidosis), we have estimated an eight-group CKD progression model over a time
horizon of 18 quarters, which jointly estimates the trajectories of these markers as well as the relationship between the
individual groups and our outcome of interest, AKI. The eight groups identified by the group-based trajectory model
(GBTM) show distinct trajectories for all biomarkers and differ sharply in terms of their estimated AKI probabilities,
with the high-risk group having an upper estimate of 0.78, whereas the group-level estimates for the lowest-risk group
start at 0.12. Confirming the strong relationship between AKI and CKD, we see that a higher group-level AKI
probability goes hand in hand with a worse eGFR value. Several calibration checks have confirmed that the estimated
model is well calibrated and has prognostic value.
The usefulness of GBTMs for clinicians is enhanced by the fact that group membership can be predicted when data is
available only for a subset of all time periods. Predictions can be updated once additional data becomes available.
While singular events such as an episode of AKI might cause the estimated group probabilities to change significantly,
it is of interest to investigate the overall likelihood that patients change group assignments as new data becomes
available. We have demonstrated that groups tend to stabilize early, with 46% of patients being assigned to the
maximum a posteriori group after only three quarters of data, and more than 75% within two years. Since it is not too
common for patients to switch groups over time, such a model could be utilized by clinicians without the fear of
having to work with information that is likely to become obsolete soon.
By showing how group assignments and estimated AKI probabilities of two patients develop over time, we
demonstrate how individual-level patient predictions together with the group-level information in the form of AKI
probabilities and trajectory estimates may aid clinicians in early-detection and disease management. To move from
this illustration to deployment of our methods in a clinical care setting, several existing limitations will have to be
overcome. Our model may be sub-optimal due to a lack of mortality data and missing marker values for proteinuria,
which is a critical risk factor for AKI besides CKD. Given that patients in our data set are quite homogeneous in terms
of demographic variables such as race and age, it remains an open question how well results will generalize to a
broader patient population.
Since the biomarkers are only infrequently observed when patients go to appointments with their doctors, diagnostic
delays are inevitable. Further research should be undertaken to investigate what factors cause AKI or a deterioration
in one or more of the biomarkers whenever they occur. Should prediction of the exact times at which AKI occurs be
a goal, different model formulations would have to be explored. Furthermore, we did not take account of the fact that
AKI is not a singular event, and that some patients might suffer from several episodes of AKI over time. A finer-grained
analysis that takes the times and frequency of AKI episodes into account may provide further insights that our
model does not currently address.
One might also question the model assumption that conditional on group membership, the observations of a trajectory
at different times are uncorrelated with each other. However, this assumption is not as restrictive as it might sound:
the biomarker trajectories are modelled to be conditionally independent only at the group and not the population level.
Specifically, the model assumes that conditional on the latent group membership, the Gaussian noise added to the
trend line is drawn from the same distribution at all time periods. These limitations aside, we believe that group-based
trajectory modeling constitutes a simple yet powerful tool for risk stratification that provides clinicians with
developmental trajectories that are easy to visualize and interpret. The ability to jointly estimate the trajectories together with an
outcome of interest allows one to obtain easily interpretable risk profiles of patients. Equipped with this information,
clinicians may be better able to anticipate the future trajectory of disease progression. This in turn could lead to an
improvement in timely and appropriate clinical decision-making.

Acknowledgment
We would like to thank the physicians and staff of the nephrology practice that shared the data for this study and
their knowledge about CKD and AKI.
References
1. Murtagh, F.E., Murphy, E., and Sheerin, N.S. 2008. “Illness trajectories: an important concept in the management
of kidney failure”, Nephrology Dialysis Transplantation, 3746-3748.
2. Padman R, Nagin DS, Xie Q. Disease Progression and Risk Prediction for Chronic Kidney Disease: Analysis of
Electronic Health Record Data using Group-Based Trajectory Models. In: Proc. Work. Inf. Syst. Technol.
Auckland; 2014.
3. Burckhardt P, Nagin DS, Padman R. Multi-Trajectory Models of Chronic Kidney Disease Progression. AMIA
Annu Symp Proc. 2016;2016:1737–46.
4. Greene, TH. 2012. Longitudinal Progression Trajectory of GFR Among Patients With CKD. American journal
of kidney diseases, 504-512.
5. Perotte A, Ranganath R, Hirsch JS, Blei D, Elhadad N. Risk prediction for chronic kidney disease progression
using heterogeneous electronic health record data and time series analysis. J Am Med Informatics Assoc.
2015;22(4):872–880.
6. Futoma J, Sendak M, Cameron C, Heller K. Scalable joint modeling of longitudinal and point process data for
disease trajectory prediction and improving management of chronic kidney disease. In UAI, to appear, 2016.
7. Echouffo-Tcheugui JB, Kengne AP. Risk models to predict chronic kidney disease and its progression: a
systematic review. PLoS Med, 2012, 9, (11), pp. e1001344.
8. Nagin DS, Jones BL, Elmer J. Using Group-Based Trajectory Models to Inform Prognostication. Forthcoming.
9. Collins AJ, Foley RN, Gilbertson DT, Chen SC. United States Renal Data System public health surveillance of
chronic kidney disease and end-stage renal disease. Kidney Int Suppl (2011), 2015, 5, (1), pp. 2-7.
10. Lameire NH, Bagga A, Cruz D, et al. Acute kidney injury: an increasing global concern. Lancet 2013; published
online May 31. http://dx.doi.org/10.1016/S0140-6736(13)60647-9.
11. Chawla LS, Eggers PW, Star RA, Kimmel PL. Acute Kidney Injury and Chronic Kidney Disease as
Interconnected Syndromes. N Engl J Med. 2014;371(1):58–66.
12. Singh P, Rifkin DE, Blantz RC. Chronic Kidney Disease: An Inherent Risk Factor for Acute Kidney Injury?. Clin
J Am Soc Nephrol 5: 1690–1695, 2010. doi: 10.2215/CJN.00830110.
13. Jha V, Garcia-Garcia G, Iseki K, Li Z, Naicker S, Plattner B, Saran R, Wang AY, Yang CW. 2013. Chronic
kidney disease: global dimension and perspectives. Lancet, 382(9888):260-72.
14. Jencks, S.F., Williams, M.V., and Coleman, E.A. 2009. “Rehospitalizations among Patients in the Medicare Fee-
for-Service Program”, New England Journal of Medicine, 360:1418-28.
15. O’Mara NB. Anemia in Patients With Chronic Kidney Disease. Diabetes Spectr. 2008;21(1):12–9.
16. Tomasello S. Secondary Hyperparathyroidism and Chronic Kidney Disease. Diabetes Spectr. 2008;21(1):19–25.
17. Hruska K, Mathew S, Lund R, Qiu P, Pratt R. Hyperphosphatemia of chronic kidney disease. Kidney Int.
2008;74(2):148–57.
18. Kraut JA, Kurtz I. Metabolic acidosis of CKD: Diagnosis, clinical characteristics, and treatment. Vol. 45,
American Journal of Kidney Diseases. 2005. p. 978–93.
19. Levey AS, Stevens LA. Estimating GFR Using the CKD Epidemiology Collaboration (CKD-EPI) Creatinine
Equation: More Accurate GFR Estimates, Lower CKD Prevalence Estimates, and Better Risk Predictions. Am J
Kidney Dis. 2010;55(4):622–627.
20. Nagin DS. Group-Based Modeling of Development. Cambridge, MA: Harvard University Press; 2005.
21. Nagin DS, Jones BL, Lima Passos V, Tremblay RE. Group-Based Multi-Trajectory Modeling. Statistical
Methods in Medical Research · October 2016. DOI: 10.1177/0962280216673085.
22. Jones BL, Nagin DS. 2013. A Note on a Stata Plugin for Estimating Group-based Trajectory Models. Sociological
Methods Research, vol. 42, issue 4, pp. 608-613.
