UVA Transplant Project
SYS 4021
Project 3:
Design Improvements for
the University of Virginia Transplant
Center
Donald E. BROWN
brown@virginia.edu
Laura BARNES
lb3dp@virginia.edu
Summary
This study examines the number of kidney and liver transplants at UVA and evaluates them against the MCV and Duke centers, both overall and by ethnic group, with particular attention to minorities. UVA has the smallest trend in the number of kidney transplants, overall and in the non-white group, compared with the two centers over the period 1988-2012. The t-test shows that there is a difference between the number of transplants at UVA and the other centers at the 5% level. The 95% bootstrap confidence interval of the mean difference also indicates that I can reject the null hypothesis that the mean difference is zero. A time series linear model is constructed to predict the mean difference between two centers in 2013. The results from bootstrap and Monte Carlo simulation reveal that the 95% prediction confidence interval does not contain zero, meaning that the predicted numbers of kidney transplants differ. The interval is entirely negative, which indicates that the predicted number of kidney transplants at UVA, overall and for non-whites, in 2013 is less than the predicted number at MCV and Duke. This suggests that UVA should do better at recruiting patients overall and from other ethnic groups.
For liver transplants, it is hard to conclude that building the new Roanoke center in 2005 increased the number of liver transplants at UVA. The linear model and the Poisson model of the number of liver transplants give contradictory results. In the linear model with time series, the p-value of the Roanoke variable is 0.014, which is less than 0.05, so I can reject the null hypothesis at the 5% level and conclude that building the Roanoke center increased the number of liver transplants. In the Poisson model with time series, however, the Roanoke variable is not significant at the 5% level. This suggests that UVA should do more research on liver transplants, and it may be useful to collect data separately at the UVA Charlottesville and UVA Roanoke centers.
Honor pledge: On my honor, I pledge that I am the sole author of this paper and I have accurately cited all
help and references used in its completion.
Imran A. Khan
December 7, 13
1. Problem Description
1.1. Situation
Organ transplantation replaces diseased or damaged organs with functioning organs from either deceased or
living donors. The complexity of these procedures requires highly skilled teams of physicians, nurses, and
support staff as well as facilities for the surgery and recovery. According to research from the United States,
the need for organ donation has become a growing concern over the last decade as the gap between organ
donors and those awaiting transplants widens [4].
The University of Virginia has conducted organ transplantation for more than 30 years and now provides
services for kidney, pancreas, liver, islet, heart, and lung transplantation [2]. The UVA Health System is
consistently ranked as one of the Top 100 Hospitals in America [5]. The availability of first-rate
transplantation services is a component of these rankings. The Transplant Center desires to continue to
increase the number of transplants in all categories but needs guidance on how to achieve this goal [2].
Organ transplantation processes have five primary steps [2]:
1. Referral from a primary care physician;
2. Determination of eligibility and placement on a waiting list;
3. Matching of donor organ with the patient;
4. Acceptance of the organ by the transplant center;
5. Transplantation surgery and recovery.
Figure 1 displays the total number of kidney transplants and donors from the 11 geographic regions. It shows that the number of transplants is always higher than the number of donors. In the last few years, MCV has performed the most kidney transplants and UVA and UNC the fewest. Duke performed more kidney transplants in 2000-2005 than any other center.
Figure 1. Plot Kidney Transplants
Table 1 shows that MCV has the highest number of kidney transplants in a single year, while UVA and UNC have the lowest number in a single year. On average, Duke has more kidney transplants than the other centers.
Table 1. Summary statistics on number of kidney transplants by center/region
Statistics   UVA     UNC     MCV     Duke    R11Donor
             107     90      137     121     1337
             24      24      29      43      456
             68      90      136     74      1304
             65.8    54.64   75.64   81.76   956.52
             4.16    3.3     7.24    4.88    57.08
MCV appears to have the largest trend in kidney transplants and UVA the smallest over the period 1988-2012. UVA and Duke have nearly parallel trends in kidney transplants over this period (Figure 2).
Figure 2. Kidney transplants trend over the period of time by centers
Taking ethnic group into account, I observe that UVA performs more kidney transplants for white patients, while MCV performs more for minorities, or non-whites (Figure 3). This means that UVA has to improve its performance on kidney transplants for non-whites. Duke, meanwhile, performs more kidney transplants for white patients at the beginning of the period but more for non-white patients than for white patients at the end of the period. Comparing the number of kidney transplants for non-whites across the centers, I see that UVA has the poorest performance, with the smallest trend, and MCV the best, with the largest trend (Figure 4).
Figure 5. Scatterplot matrix for kidney transplants between centers and region
Figure 5 shows that the total number of kidney transplants, the number of donors, and the number of kidney transplants at each center are strongly positively correlated. This is easily explained, since all of these numbers have been increasing over the last 10 years.
Unlike kidney transplants, there are more liver donors than liver transplants, as shown in Figure 6. In general, the number of liver transplants at the four centers fluctuates from year to year. UVA reaches its highest number of liver transplants in 2009 and then drops in 2010. In addition, all centers perform more liver transplants for white patients than for non-white patients (Figures 7 and 8).
Figure 6. Plot of liver transplants
Table 2 summarizes the statistics on the number of liver transplants. On average, UVA has more liver transplants than the other centers. It also has the highest number of liver transplants in 2012.
Table 2. Summary statistics on number of liver transplants
Statistics
UVA
UNV
MCV
Duke
87
73
66
67
16
11
68
31
60
67
46.52
38.68
45.24
38.08
4.575
4.398
2.815
2.542
Based on the above, MCV and Duke do a better job than UVA on kidney transplants overall, and UVA has the poorest performance on kidney transplants for non-whites compared with the other centers. This motivates me to compare kidney transplants at UVA with those at MCV, since MCV has the largest trend, that is, a stable increase overall. MCV also has demographics similar to UVA's, so it is a good choice for comparing kidney transplants overall and in the non-white group. In addition, I compare UVA with Duke, since the two have similar trends over the period.
There is no need to compare the number of liver transplants between UVA and the other centers, since UVA has performed well over the period. It should be noted, however, that UVA built a new center in Roanoke in 2005. As shown in Figure 6, the number of liver transplants at UVA tends to increase considerably from 2005 onward, so I want to see whether building the new center increased the number of liver transplants.
1.2. Goal
The aim of this study is to come up with a new design that could potentially increase the number of kidney and liver transplants at UVA. For kidney transplants, the goal is to figure out how to increase the number of transplants, overall and in the non-white group, by comparing UVA with the MCV and Duke centers. For liver transplants, the goal is to analyze whether building the new Roanoke center in 2005 was effective in increasing the number of transplants.
1.3. Metrics
For the kidney transplant analysis, the differences between the number of kidney transplants at UVA and MCV and at UVA and Duke are used as response variables in linear regression models. For the liver transplant analysis, the number of liver transplants at UVA is used as the response variable in a linear regression model. Adjusted R2, AIC, and MSE are used to compare the performance of the fitted models. For the liver transplant analysis, a Poisson model is also considered. The diagnostic plots (Normal Q-Q plot, Residuals vs. Fitted, Residuals vs. Leverage, Scale-Location), the autocorrelation function (ACF), and the partial autocorrelation function (PACF) are used to investigate whether an autoregressive term is needed in the model. The bootstrap method is also applied to estimate the regression parameters, with B=2000 bootstrap samples. This includes constructing 95% percentile and BCa confidence intervals.
I use a significance level of 5% throughout this study. If the p-value is less than 0.05, the null hypothesis is rejected in favor of the alternative. If the p-value is greater than 0.05, the null hypothesis is not rejected.
1.4. Hypotheses
Hypothesis 1:
H0: There is no difference between number of kidney transplants at UVA and MCV in 2013
H1: There is a difference between number of kidney transplants at UVA and MCV in 2013
Hypothesis 2:
H0: There is no difference between number of non-white kidney transplants at UVA and MCV
H1: There is a difference between number of non-white kidney transplants at UVA and MCV
Hypothesis 3:
H0: There is no difference between number of kidney transplants at UVA and Duke in 2013
H1: There is a difference between number of kidney transplants at UVA and Duke in 2013
Hypothesis 4:
H0: There is no difference between number of non-white kidney transplants at UVA and Duke
H1: There is a difference between number of non-white kidney transplants at UVA and Duke
Hypothesis 5:
H0: Building the Roanoke center does not increase the number of liver transplants at UVA
H1: Building the Roanoke center increases the number of liver transplants at UVA
2. Approach
2.1. Data
The data for this study come from the Organ Procurement and Transplantation Network (OPTN) [1] for four transplant centers and two regions: the US and Region 11 (Kentucky, North Carolina, South Carolina, Tennessee, and Virginia). For each center or region, the data give the number of transplants performed and the background of the patients (age and ethnic group). There are 15 data files related to organ transplants in csv (comma-separated values) format, i.e. USdonor.csv, UStransplant.csv, R11donor.csv, R11xplant.csv, UVAxplant.csv, Dukexplant.csv, MCVxplant.csv, UNC.csv, MCVage.csv, MCVethnic.csv, R11age.csv, R11ethnic.csv, UVAage.csv, and UVAethnic.csv.
In total, there are 26 observations in each data set, showing the number of organ transplants from 1988 to 2013. In this study, I consider only the number of kidney and liver transplants at the region and center levels. The series of kidney and liver transplants over the last 26 years are tabulated in Tables A1 and A2 (Appendix A). There are no missing values in the data, but the 26th observation (2013) is not included in the analysis because the data for that year are incomplete. For the liver transplant data at the UVA center, there is one additional binary variable that indicates before (0) and after (1) building the Roanoke center.
2.2. Analysis
2.2.1.
To predict the difference in kidney transplants between UVA and the other centers in 2013, a linear regression model is built in R. Before model building, I make a time series plot of kidney transplants for each center and a time series plot of the difference in kidney transplants between UVA and the other centers (MCV and Duke). A classical paired t-test is also performed to see whether there is a difference in kidney transplants from 1988 to 2012. This test works best on normally distributed data; if the normality assumption is violated, the bootstrap method is applied instead.
To test my hypotheses related to kidney transplants, I build four linear regression models, using the r11donor variable to control for Region 11:
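Model 1: $(UVA_K - MCV_K) = \beta_0 + \beta_1\,R11\_donor + \varepsilon$
Model 2: $(UVA_K - Duke_K) = \beta_0 + \beta_1\,R11\_donor + \varepsilon$
Model 3: $(UVA_{NW} - MCV_{NW}) = \beta_0 + \beta_1\,R11\_donor + \varepsilon$
Model 4: $(UVA_{NW} - Duke_{NW}) = \beta_0 + \beta_1\,R11\_donor + \varepsilon$
where:
- $UVA_K$ is the number of kidney transplants at UVA,
- $MCV_K$ is the number of kidney transplants at MCV,
- $Duke_K$ is the number of kidney transplants at Duke,
- $UVA_{NW}$ is the number of kidney transplants of non-whites at UVA,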
To answer my hypothesis related to liver transplants, I start with a classical two-sample t-test to see whether there is a difference between the number of liver transplants before (1988-2004) and after (2005-2012) building the Roanoke center. A non-parametric alternative, the Wilcoxon test, is also performed in case the normality assumption of the t-test is violated. A time series model of liver transplants is also considered; the residuals of that model are then used in a t-test and a Wilcoxon test to compare the number of liver transplants before and after building the Roanoke center. Furthermore, I consider a linear model and a Poisson model to test whether the number of liver transplants increased after building the Roanoke center, controlling for Region 11 in the model.
The stages of building linear model are as follows:
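1. Consider a linear model with Region 11 and Roanoke center as predictor variables:
Model 5: $UVA_{liver} = \beta_0 + \beta_1\,R11 + \beta_2\,Roanoke + \varepsilon$
where $Roanoke = 1$ after the Roanoke center was built and $Roanoke = 0$ before.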
3. Evidence
3.1. Kidney Transplants
Figure 9 shows the series of kidney transplants at UVA, MCV, and Duke from 1988 to 2012 and the corresponding ACF and PACF plots. The series plots show non-stationarity, and the correlograms (ACF) show that the series are autocorrelated: lags 1-2 are significant for kidney transplants at UVA, lags 1-5 at MCV, and lags 1-3 at Duke. Similar results are observed for the series of non-white kidney transplants in Figure 10: the ACF plot shows sinusoidal decay and the PACF plot cuts off after lag 1.
Figure 9. Kidney transplants series and its corresponding ACF and PACF plots
Figure 10. Kidney transplants series of non-whites and its corresponding ACF and PACF plots
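The correlogram reading described above can be illustrated with a small sketch. It computes sample autocorrelations and compares them with the usual approximate 95% white-noise band of 1.96/sqrt(n). The series `counts` is hypothetical (it is not the OPTN data), and the function name `sample_acf` is an illustrative choice; the report itself produced these plots in R.

```python
import math

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# Hypothetical upward-trending yearly counts (NOT the OPTN data), 25 values
counts = [30 + 2 * t + (3 if t % 2 else -3) for t in range(25)]

acf = sample_acf(counts, max_lag=5)
bound = 1.96 / math.sqrt(len(counts))  # approximate 95% band under white noise
significant_lags = [k for k in range(1, 6) if abs(acf[k]) > bound]

print([round(r, 3) for r in acf])
print("band:", round(bound, 3), "significant lags:", significant_lags)
```

A trending series like this one typically shows several significant low-order lags, which is the pattern the text attributes to non-stationarity.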
A classical paired t-test is performed between the number of kidney transplants at UVA and at the other centers from 1988 to 2012. The results are summarized in Table 3. On average, the number of kidney transplants at UVA is smaller than at the other centers, so the mean differences are negative. The test rejects the null hypothesis for non-white kidney transplants, meaning there is a difference between UVA and MCV and between UVA and Duke at the 5% level (p-values smaller than 0.0001). For the overall number of kidney transplants at UVA and MCV, however, the test does not provide strong enough evidence to reject the null hypothesis of zero difference at the 5% level (p-value = 0.079). One reason may be that the sample size is very small, only 25 observations. Also, the t-test may not be valid for these data, because it is a parametric method that relies on the assumption that the data are normally distributed. The histograms and QQ plots in Figure B1 (Appendix B) show that the differences between kidney transplants at UVA and at the other centers are not normal.
Table 3. Paired t-test on number of kidney transplants at UVA and other centers
Comparison                  Mean(a)   Mean(b)   Mean difference   t-test   DF   p-value    95% CI lower   95% CI upper
UVA vs. MCV (overall)       65.8      75.64     -9.84             -1.834   24   0.079      -20.913        1.233
UVA vs. Duke (overall)      65.8      81.76     -15.96            -3.828   24   0.001      -24.565        -7.355
UVA vs. MCV (non-white)     17.08     49.88     -32.80            -7.751   24   <0.0001    -41.534        -24.066
UVA vs. Duke (non-white)    17.08     40.32     -23.24            -8.918   24   <0.0001    -28.618        -17.862
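As a consistency check on Table 3, the confidence limits can be recovered from the reported mean difference and t statistic alone, since the standard error is the mean difference divided by t. The sketch below assumes the two-sided 95% critical value of Student's t with 24 degrees of freedom (about 2.0639); the function name is illustrative.

```python
T_CRIT_DF24 = 2.0639  # two-sided 95% critical value, Student's t, 24 df

def ci_from_t(mean_diff, t_stat, t_crit=T_CRIT_DF24):
    """Recover the standard error and 95% CI from a mean difference and its t."""
    se = abs(mean_diff / t_stat)
    margin = t_crit * se
    return se, mean_diff - margin, mean_diff + margin

# First two rows of Table 3: (mean difference, t statistic)
se1, lo1, hi1 = ci_from_t(-9.84, -1.834)
se2, lo2, hi2 = ci_from_t(-15.96, -3.828)

print(round(se1, 3), round(lo1, 3), round(hi1, 3))
print(round(se2, 3), round(lo2, 3), round(hi2, 3))
```

The first interval straddles zero and the second does not, which matches the p-values of 0.079 and 0.001 reported in the table.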
Therefore, I apply the bootstrap to the data; the results are shown in Table 4. Unlike the previous result, the 95% confidence interval for the difference in kidney transplants between UVA and MCV now does not contain zero. This means that there is a difference between the overall number of kidney transplants at UVA and MCV at the 5% level. The histograms of the bootstrap estimates of the difference in kidney transplants look normally distributed (Figures B2-B5, Appendix B).
Table 4. Bootstrap estimate on kidney transplants difference and confidence interval
Comparison                  Original   Standard error   95% Percentile CI     95% BCa CI
UVA vs. MCV (overall)       -9.84      5.317            (-20.440, -0.001)     (-21.138, -0.253)
UVA vs. Duke (overall)      -15.96     4.061            (-24.12, -8.28)       (-24.26, -8.41)
UVA vs. MCV (non-white)     -32.80     4.164            (-41.20, -25.08)      (-41.85, -25.24)
UVA vs. Duke (non-white)    -23.24     2.561            (-28.60, -18.24)      (-28.72, -18.44)
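The percentile bootstrap used for Table 4 can be sketched as follows. This is a minimal illustration, not the report's R code: the paired differences in `diffs` are hypothetical, the function name is an assumption, and only the percentile interval is shown (the BCa interval needs additional bias and acceleration corrections). B=2000 follows the number of bootstrap samples stated in the Metrics section.

```python
import random
import statistics

def bootstrap_mean_ci(diffs, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the mean of paired differences."""
    rng = random.Random(seed)
    n = len(diffs)
    # Resample the differences with replacement and record each resample mean
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    lower = boot_means[int(n_boot * alpha / 2)]
    upper = boot_means[int(n_boot * (1 - alpha / 2)) - 1]
    return statistics.fmean(diffs), statistics.stdev(boot_means), lower, upper

# Hypothetical yearly differences between two centers (NOT the OPTN data)
diffs = [-30, -12, 5, -8, -20, 3, -15, -25, -2, -10, -18, 7, -6,
         -22, -14, -4, -9, -28, 1, -11, -16, -5, -13, -21, -7]

estimate, std_error, lower, upper = bootstrap_mean_ci(diffs)
print(round(estimate, 2), round(std_error, 3), (round(lower, 2), round(upper, 2)))
```

If the resulting interval excludes zero, the null hypothesis of zero mean difference is rejected at the chosen level, which is the reading applied to Table 4.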
Figure 11 depicts the above analysis. There is a difference between kidney transplants at UVA and at the other centers over the years. The differences in non-white kidney transplants between UVA and MCV and between UVA and Duke tend to decrease over the years. Meanwhile, the difference in the overall number of kidney transplants fluctuates between increasing and decreasing.
Figure 11. Plot of difference between kidney transplants at UVA and other centers
Figure 12 displays the ACF and PACF plots of the differences in the number of kidney transplants between UVA and MCV and between UVA and Duke. It shows that the difference series are autocorrelated, so it may be necessary to consider an autoregressive model.
Figure 12. ACF and PACF plot of difference of number of kidney transplants
Table 5 summarizes the estimates, standard errors, and p-values of the regression parameters for Models 1-4. The overall model is significant at the 5% level for every model except Model 2. The diagnostic plots indicate that the regression assumptions are violated: the residuals vs. fitted plot shows non-constant variance and lack of fit, and the QQ plot shows that the residuals are not Gaussian (Figures 13-16).
Table 5. Estimate and standard error of linear model with mean difference of kidney transplants as the response
          β0 Estimate (se)    β0 P-value   β1 Estimate (se)   β1 P-value   Overall model
Model 1   43.079 (15.789)     0.012        -0.055 (0.016)     0.002        F-statistic: 12.19 on 1 and 23 DF, p-value: 0.001967
Model 2   -7.042 (15.054)     0.644        -0.009 (0.015)     0.543        F-statistic: 0.3809 on 1 and 23 DF, p-value: 0.5432
Model 3   0.642 (4.446)       0.886        -0.262 (0.031)     <0.001       F-statistic: 73.13 on 1 and 23 DF, p-value: 1.34e-08
Model 4   -9.281 (4.514)      0.051        -0.109 (0.031)     0.002        F-statistic: 12.36 on 1 and 23 DF, p-value: 0.001857
ACF and PACF plots of the residuals for Models 1-4 show that the residual series are correlated, and it may be necessary to model the residuals with AR(1), since the sample PACF has no significant peak after lag 1, especially for Models 1 and 3 (Figure 17). The AIC plot for different numbers of lags in Figure 18 indicates that AR(1) may be adequate for the residuals of Models 1 and 3 and AR(4) for the residuals of Model 2. I do not need an AR term for the residuals of Model 4, since its PACF plot shows no significant lag.
Table 6. Estimate and standard error of linear model after accounting autoregressive terms
Model 1 (overall model: F-statistic: 25.69 on 2 and 21 DF, p-value: 2.279e-06)
  Parameter    Estimate (se)       P-value
  β0           58.061 (12.022)     <0.001
  β1           -0.07 (0.012)       <0.001
  AR lag 1     0.708 (0.157)       <0.001
Model 2 (overall model: F-statistic: 3.79 on 5 and 15 DF, p-value: 0.0203)
  Parameter    Estimate (se)       P-value
  β0           -2.046 (17.092)     0.906
  β1           -0.013 (0.016)      0.425
  AR lag 1     0.577 (0.229)       0.024
  AR lag 2     -0.284 (0.271)      0.310
  AR lag 3     0.072 (0.267)       0.791
  AR lag 4     -0.473 (0.217)      0.045
Model 3 (overall model: F-statistic: 36.17 on 2 and 21 DF, p-value: 1.577e-07)
  Parameter    Estimate (se)       P-value
  β0           0.523 (4.65)        0.911
  β1           -0.262 (0.031)      <0.001
  AR lag 1     0.317 (0.214)       0.154
Model 4 (overall model: F-statistic: 12.36 on 1 and 23 DF, p-value: 0.001857)
  Parameter    Estimate (se)       P-value
  β0           -9.281 (4.514)      0.051
  β1           -0.109 (0.031)      0.002
Table 6 summarizes the estimated parameters of the time series regression models. All four models are significant overall at the 5% level. The estimated parameters for Model 4 are the same as in Table 5, since no autoregressive term is included. Diagnostic plots for Models 1-3 after accounting for AR terms show moderate violations of constant variance and of the Gaussian assumption. The ACF plots show no serial correlation. Based on these plots, shown in Figures 19-22, it appears that I have accounted for everything.
Figure 19. Diagnostic plot of model 1 after accounting AR terms
Figure 22. ACF and PACF plots of residuals of model 1-3 after accounting AR terms
I also apply the bootstrap to the time series regression models, and the results are similar to those in Table 6. The confidence intervals for the regression parameters of Model 1 do not contain zero, as shown in Table 7. This agrees with the t-tests for the time series regression model, with p-values < 0.0001, so I reject both null hypotheses that the coefficients of r11donor and AR(1) are equal to zero.
Table 7. Bootstrap results on time series regression model
Model     Parameter    Original   Bias      Std. Error   95% Percentile CI      95% BCa CI
Model 1   β0*          58.06      -0.0921   10.767       (37.0800, 77.770)      (36.120, 77.310)
          β1*          -0.07      0.0002    0.011        (-0.0906, -0.0500)     (-0.0913, -0.0505)
          AR lag 1*    0.71       0.0007    0.151        (0.4172, 1.0109)       (0.4210, 1.0171)
Model 2   β0*          -2.05      0.0399    14.321       (-30.449, 25.506)      (-31.879, 23.051)
          β1*          -0.01      0.0000    0.013        (-0.0384, 0.0124)      (-0.0398, 0.0119)
          AR lag 1*    0.58       0.0050    0.196        (0.2057, 0.9814)       (0.2123, 0.9888)
          AR lag 2*    -0.28      -0.0068   0.234        (-0.7543, 0.1728)      (-0.7549, 0.1686)
          AR lag 3*    0.07       0.0089    0.227        (-0.3731, 0.5344)      (-0.3936, 0.5158)
          AR lag 4*    -0.47      -0.0047   0.185        (-0.8412, -0.1029)     (-0.8115, -0.0784)
Model 3   β0*          0.52       0.0593    4.275        (-7.8556, 9.0951)      (-7.5756, 9.3440)
          β1*          -0.26      -0.0002   0.029        (-0.3215, -0.2049)     (-0.3185, -0.2028)
          AR lag 1*    0.32       0.0046    0.200        (-0.0877, 0.7176)      (-0.0979, 0.7044)
Model 4   β0*          -9.28      0.0439    4.315        (-17.351, -0.625)      (-16.651, 0.167)
          β1*          -0.11      0.0001    0.030        (-0.1672, -0.0522)     (-0.1672, -0.0521)
Based on the adjusted R2, AIC, and MSE criteria, the regression models that account for AR terms perform better than the models without AR terms. Adjusted R2 is higher for the time series regression models, which means that modeling the residuals with an autoregressive model has improved the fit. AIC and MSE are also smaller after accounting for AR terms (Table 8).
Table 8. Model assessment based on adjusted R2, AIC, and MSE
          Before accounting AR terms       After accounting AR terms
          adj. R2   AIC      MSE           adj. R2   AIC      MSE
Model 1   0.318     229.76   451.50        0.682     204.23   208.17
Model 2   -0.026    227.38   410.44        0.411     183.78   189.95
Model 3   0.750     192.77   102.82        0.754     185.88   96.89
Model 4   0.321     193.54   106.02        0.321     193.54   106.02
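The adjusted R2 values in Table 8 can be cross-checked against the overall F-statistics reported in Tables 5 and 6, because for a linear model with an intercept R2 = F·p / (F·p + df_resid), where p is the model degrees of freedom. The sketch below is only an arithmetic check; the function name is illustrative.

```python
def adj_r2_from_f(f_stat, df_model, df_resid):
    """Adjusted R2 implied by an overall F-statistic and its degrees of freedom."""
    r2 = f_stat * df_model / (f_stat * df_model + df_resid)
    n = df_model + df_resid + 1  # observations used in the fit
    return 1 - (1 - r2) * (n - 1) / df_resid

# (F, model df, residual df) as reported in Tables 5 and 6
before_ar = {"Model 1": (12.19, 1, 23), "Model 2": (0.3809, 1, 23),
             "Model 3": (73.13, 1, 23), "Model 4": (12.36, 1, 23)}
after_ar = {"Model 1": (25.69, 2, 21), "Model 2": (3.79, 5, 15),
            "Model 3": (36.17, 2, 21)}

for name, args in before_ar.items():
    print("before", name, round(adj_r2_from_f(*args), 3))
for name, args in after_ar.items():
    print("after ", name, round(adj_r2_from_f(*args), 3))
```

The values computed this way agree with the adjusted R2 column of Table 8 to three decimals, including the negative value for Model 2 before AR terms.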
My final models in Table 6 can be used to predict the mean difference in kidney transplants in 2013. To do this, I need to forecast the Region 11 donors and the model residuals: I forecast the r11donor series for 2013 and then use this point forecast as input to the time series regression model. The prediction results from the bootstrap are shown in Table 9. I also use Monte Carlo simulation for improved confidence intervals; the results are shown in Table 10. Comparing the two, the Monte Carlo simulation prediction tends to have wider confidence intervals. Also, in general the two methods show that I am doing better than my estimate already, since the prediction estimate from bootstrap and simulation is smaller (Figure 23). The bootstrap and simulation predictions do not show any deviation from a normal distribution (Figures 24-25). The 95% confidence intervals for the predicted mean difference in kidney transplants between UVA and MCV and between UVA and Duke do not contain zero. This means there is indeed a difference between the two centers at the 5% level. Similar results are obtained for non-white kidney transplants. Therefore, I can reject my null hypotheses of no mean difference in 2013 with p<0.025.
Table 9. Prediction of mean difference at UVA and other centers in 2013 using bootstrap method
          Point forecast   Prediction of mean    Std. Error   Bootstrap prediction   Bootstrap prediction
          of r11donor      difference in 2013                 95% Percentile CI      95% BCa CI
Model 1   1264.4           -45.42                5.534        (-57.58, -35.30)       (-58.43, -36.20)
Model 2   1264.4           -32.59                5.410        (-43.78, -22.10)       (-44.04, -22.26)
Model 3   210.4            -50.44                4.113        (-58.30, -42.41)       (-58.07, -42.25)
Model 4   210.4            -32.31                3.180        (-38.28, -25.74)       (-37.77, -25.10)
Table 10. Prediction of mean difference at UVA and other centers in 2013 using simulation method
          Point forecast   Prediction of mean difference   Std. Error   Simulation prediction   Simulation prediction
          of r11donor      in 2013 (Median)                             95% Percentile CI       95% BCa CI
Model 1   1264.4           -44.77                          10.970       (-68.14, -25.20)        (-29.93, -12.62)
Model 2   1264.4           -32.66                          5.885        (-45.82, -21.97)        (-45.37, -21.90)
Model 3   210.4            -50.40                          8.970        (-69.24, -32.84)        (-72.17, -35.75)
Model 4   210.4            -31.95                          4.674        (-42.67, -24.50)        (-29.23, -19.60)
3.2. Liver Transplants
Figure 26 plots the number of liver transplants before and after building the Roanoke center. It appears that there are more liver transplants after the Roanoke center was built. Starting in 2005 there is an increasing trend, although there is a drop in 2011. The right panel of Figure 26 shows that the distribution of liver transplants is non-normal. Table 11 confirms that the mean number of liver transplants is higher (68.13) after the Roanoke center was built. The t-test indicates a difference between the number of liver transplants before and after the Roanoke center (p-value = 0.0013). I also use the Wilcoxon test, because the sample size is very small and the data are not normally distributed (Figure B6, Appendix B). Its result agrees with the t-test: I can reject the null hypothesis that the difference is zero at the 5% level.
Figure 26. Plot and histogram of number of liver transplants
Table 11. T-test and Wilcoxon test on number of liver transplants before and after the Roanoke center
Mean before the    Mean after the     T-test                          Wilcoxon test
Roanoke center     Roanoke center     t-stat    DF       p-value      W-stat   p-value
36.35              68.13              -4.090    12.749   0.0013       12.5     0.0013
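The fractional degrees of freedom in Table 11 (12.749) are what an unequal-variance (Welch) two-sample t-test produces; whether that variant was the one actually run is an assumption here, since the text only says "classical two sample t-test". A minimal sketch of the Welch statistic and its Welch-Satterthwaite degrees of freedom, on hypothetical counts for 17 "before" years and 8 "after" years:

```python
import statistics

def welch_t(x, y):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = statistics.variance(x) / len(x)  # squared standard error of mean(x)
    vy = statistics.variance(y) / len(y)  # squared standard error of mean(y)
    t = (statistics.fmean(x) - statistics.fmean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical yearly counts (NOT the OPTN data): 17 years before, 8 years after
before = [20, 25, 31, 28, 35, 40, 38, 30, 33, 42, 45, 37, 36, 41, 39, 44, 48]
after = [50, 62, 70, 58, 81, 87, 66, 68]

t_stat, df = welch_t(before, after)
print(round(t_stat, 3), round(df, 3))
```

The degrees of freedom always fall between the smaller group's n-1 and the pooled n1+n2-2, which is why a non-integer value well below 23 appears for groups of 17 and 8.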
The ACF plot of liver transplants indicates that the series is autocorrelated, and the PACF plot suggests that AR(1) may be needed. The AIC plot in Figure B7 (Appendix B) also supports this. The residuals of the AR(1) model are no longer correlated, as shown in the bottom panel of Figure 27. The residuals are then used to test whether there is a difference between the number of liver transplants before and after building the Roanoke center. Table 12 shows no significant difference between the two, since the p-values from both the t-test and the Wilcoxon test are greater than 0.05. However, this test may not be valid, because it is important to control for Region 11.
Figure 27. ACF and PACF of liver transplants series and residuals of AR(1)
Table 12. T-test and Wilcoxon test on residuals of AR(1) before and after the Roanoke center
T-test                          Wilcoxon test
t-stat    DF       p-value      W-stat   p-value
-1.812    11.750   0.0957       32       0.0523
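The W statistic of the two-sample Wilcoxon (rank-sum) test reported in Tables 11 and 12 can be sketched as below, using the convention in which W is the rank sum of the first sample minus its minimum possible value (the convention R's `wilcox.test` reports). The function name and data are illustrative, and the p-value calculation is omitted.

```python
def rank_sum_w(x, y):
    """Wilcoxon rank-sum W for sample x against y, with average ranks for ties."""
    pooled = sorted(x + y)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; assign their average
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_x = sum(rank_of[v] for v in x)
    return rank_sum_x - len(x) * (len(x) + 1) / 2

# Hypothetical small samples (NOT the OPTN data)
print(rank_sum_w([1, 2, 3], [4, 5, 6]))  # x entirely below y
print(rank_sum_w([4, 5, 6], [1, 2, 3]))  # x entirely above y
print(rank_sum_w([1, 2], [2, 3]))        # one tie across samples
```

Under this convention W ranges from 0 to n1·n2, and ties contribute half-counts, which is how a fractional value such as the 12.5 in Table 11 can arise.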
The linear model for liver transplants with the Region 11 and Roanoke variables as predictors violates the regression assumptions, based on the diagnostic plot in Figure 28. The residuals from the fitted linear model are autocorrelated. I therefore model the residuals with AR(1), since the PACF plot cuts off after lag 1 (Figure 29). Accounting for the AR term improves the fitted model, based on the diagnostic plot in Figure 30: the residuals vs. fitted plot shows constant variance and the QQ plot shows that the residuals are approximately normal. Moreover, the ACF and PACF plots indicate that the residual series is no longer correlated, with no significant lags in either plot (Figure 31). The estimated parameters and their standard errors for the linear model and the linear model with time series are summarized in Table 13.
Figure 28. Diagnostic plot for linear regression model on number of liver transplants
Figure 29. Plot of residuals from linear regression model and its corresponding ACF and PACF plot
Figure 30. Diagnostic plot for linear regression model with time series on number of liver transplants
Figure 31. Plot of residuals from linear model with time series and its corresponding ACF and PACF plot
Table 13. Results for linear model and linear model with time series
                  Linear model (Model 5)          Linear model with time series
Parameter         Estimate (se)      p-value      Estimate (se)      p-value
β0                31.482 (12.79)     0.022        42.972 (12.23)     0.002
β1 (Region 11)    0.012 (0.029)      0.690        -0.011 (0.027)     0.704
β2 (Roanoke)      26.846 (14.38)     0.075        34.011 (12.62)     0.014
AR lag 1                                          0.428 (0.178)      0.026
Adj. R2           0.391                           0.516
AIC               219.9                           203.0
MSE               280.7                           182.3
Overall model     F-statistic: 8.691 on 2 and     F-statistic: 9.18 on 3 and 20
                  22 DF, p-value: 0.001653        DF, p-value: 0.0005048
Both models are significant overall at 5% level. Linear regression model with time series performs best
based on adjusted R2, AIC, and MSE criteria. The adjusted R2 is larger and AIC is smaller for linear model
after accounting AR term in the model. Also, it fits better because it has decreased the MSE value. My final
linear model for liver transplants can be written as follows:
$\widehat{UVA}_{liver,t} = 42.972 - 0.011\,R11_t + 34.011\,Roanoke_t + 0.428\,e_{t-1}$
Bootstrapping the regression shows that the confidence interval of the Roanoke center parameter (β2*) does not contain zero (Table 14). This means that I can be 95% confident that there is a difference between the number of liver transplants before and after the Roanoke center was built. Thus, I can reject my null hypothesis at the 5% level, meaning that building the Roanoke center helped to increase the number of liver transplants at UVA. The histogram and QQ plot for the bootstrap estimate of β2* show no violation of normality (Figure 32).
Table 14. Bootstrap estimate for linear model with time series
Parameter    Original   Bias     Std. Error   95% Percentile CI     95% BCa CI
β0*          42.972     0.128    11.071       (20.08, 63.80)        (18.71, 62.57)
β1*          -0.011     0.000    0.025        (-0.0582, 0.0408)     (-0.0552, 0.0431)
β2*          34.011     -0.120   11.376       (11.30, 56.26)        (10.28, 55.20)
AR lag 1*    0.428      0.002    0.165        (0.1148, 0.7610)      (0.1077, 0.7586)
When a Poisson model is fitted with the number of liver transplants at UVA as the response variable (Model 6.1), the dispersion is 8.32. Since this number is not close to 1, I use a Quasi Poisson model instead. The normal QQ plot of the residuals of this model shows that the tails depart from the Gaussian distribution, and the residuals vs. fitted plot shows non-constant variance (Figure 33). The PACF of the residuals in Figure 34 is insignificant after lag 1, so I add one autoregressive term to the Poisson model (Model 6.2). The model now looks better, as shown in the diagnostic plot (Figure 35), and the residual series is no longer correlated (Figure 36). The same is true for Model 6.3, where the Roanoke center is included as a predictor variable (Figures 37-38). The estimated parameters from the Poisson model, the Quasi Poisson model, and the Quasi Poisson model with time series are compared in Table 15.
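The dispersion check that motivates the switch to a Quasi Poisson model can be illustrated with the usual Pearson estimate: the sum of squared Pearson residuals divided by the residual degrees of freedom. The observed counts and fitted means below are hypothetical, and the function name is illustrative; the report's own value of 8.32 came from its fitted Model 6.1.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson dispersion estimate for a Poisson fit: chi-square / residual df."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)

# Hypothetical counts and fitted Poisson means (NOT the OPTN data)
observed = [10, 20, 30, 40]
fitted = [12.0, 18.0, 33.0, 37.0]

phi = pearson_dispersion(observed, fitted, n_params=2)
print(round(phi, 4))
```

A value near 1 is consistent with the Poisson assumption that the variance equals the mean; values well above 1 indicate overdispersion, in which case Quasi Poisson standard errors are inflated by the square root of the dispersion.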
Figure 33. Diagnostic plot for Quasi Poisson model (Model 6.1)
Figure 34. Plot of residuals from Quasi Poisson model (Model 6.1) and its corresponding ACF and PACF plot
Figure 35. Diagnostic plot for Quasi Poisson model with time series (Model 6.2)
Figure 36. Plot of residuals from Quasi Poisson model with time series (Model 6.2) and its corresponding ACF and
PACF plot
Figure 37. Diagnostic plot for Quasi Poisson model with time series and Roanoke center (Model 6.3)
Figure 38. Plot of residuals from Quasi Poisson model with time series and Roanoke center (Model 6.3) and its
corresponding ACF and PACF plot
The model utility test shows that I can reject my null hypothesis and prefer the full model. The Quasi-Poisson model with the Roanoke center (Model 6.3) does not show any significant effect of building the Roanoke center on the response variable: the Roanoke center variable has a p-value of 0.078, greater than the significance level of 0.05.
Table 15. Results for (Quasi) Poisson model
      Poisson Model (6.1)          Quasi Poisson Model (6.1)    Quasi Poisson Model with time series (6.2)
      Estimate (se)    p-value     Estimate (se)    p-value     Estimate (se)    p-value
β0    3.105 (0.085)    < 0.0001    3.105 (0.246)    < 0.0001    3.173 (0.197)    < 0.0001
β1    0.001 (0.0001)   < 0.0001    0.001 (0.0004)   0.003       0.001 (0.0003)   0.001
β2                                                              0.083 (0.024)    0.003
Model utility test
      Poisson Model (6.1): Deviance: 96.195, DF: 1, P-value < 2.2E-16
      Quasi Poisson Model (6.1): Deviance: 96.195, DF: 1, P-value: 0.00674
      Quasi Poisson Model with time series (6.2): Deviance: 117.74, DF: 2, P-value < 6.66E-06
Dispersion (Model 6.1): 8.32
Chi-squared test of the Roanoke center term
      Resid. Df    Resid. Dev    Df    Deviance    Pr(>Chi)
      20           82.261        1     14.473      0.06256
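One quick consistency check on the standard errors in Table 15: a Quasi Poisson fit keeps the Poisson coefficient estimates and, in the standard formulation, multiplies each Poisson standard error by the square root of the dispersion. A short sketch using only values reported in the table (the dictionary keys are labels chosen here for illustration):

```python
import math

# Values as reported for Model 6.1 in Table 15.
dispersion = 8.32
poisson_se = {
    "row_1": 0.085,   # standard error of the first coefficient (estimate 3.105)
    "row_2": 0.0001,  # standard error of the second coefficient (estimate 0.001)
}

# Quasi Poisson standard error = Poisson standard error * sqrt(dispersion).
scale = math.sqrt(dispersion)
quasi_se = {name: se * scale for name, se in poisson_se.items()}

for name, se in quasi_se.items():
    print(f"{name}: scaled standard error = {se:.4f}")
```

The first scaled value comes out close to the 0.246 reported for the Quasi Poisson model; the small gap, and the coarser agreement for the second row, is presumably due to rounding of the table entries.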
My final Quasi Poisson model with time series for liver transplants at UVA can be expressed as follows:
= 3.616 + 0.00005 11 + 0.545 + 0.069 1
The model is significant overall at the 5% level. A chi-squared test of the Roanoke center term confirms
that Roanoke is not a significant factor and can be dropped from the model. In other words, at the 5% level
I cannot reject my null hypothesis that building the Roanoke center does not increase the number of liver
transplants at UVA. This contradicts the result obtained from the linear model with time series.
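The chi-squared test of a single term can be sketched as below. Assumptions, stated plainly: for a quasi-likelihood model the drop in deviance is divided by the estimated dispersion before being compared with a chi-squared distribution, and the term has 1 degree of freedom. The dispersion of Model 6.3 is not reported in this document, so the value used here is hypothetical and the printed p-value is not expected to match the reported Pr(>Chi) of 0.06256. Function names are chosen for illustration.

```python
import math


def chi2_sf_one_df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    # With 1 df, chi-squared is the square of a standard normal variable.
    return math.erfc(math.sqrt(stat / 2.0))


def scaled_deviance_test(deviance_drop, dispersion):
    """Scale the deviance drop by the dispersion and return (statistic, p-value)."""
    stat = deviance_drop / dispersion
    return stat, chi2_sf_one_df(stat)


deviance_drop = 14.473        # deviance for the Roanoke term, as reported with Table 15
assumed_dispersion = 4.0      # HYPOTHETICAL: not reported for Model 6.3

stat, p_value = scaled_deviance_test(deviance_drop, assumed_dispersion)
print(f"scaled statistic = {stat:.3f}, p-value = {p_value:.4f}")
print("reject at 5% level:", p_value < 0.05)
```

The scaling is why a deviance drop that would be highly significant under a plain Poisson model can fail to reach the 5% level once overdispersion is accounted for.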
4. Recommendation
The evidence shows that a time series linear regression model fits the difference in kidney transplants
between UVA and MCV, and between UVA and Duke, better than a linear model without time series, based on the
adjusted R2, AIC, and MSE criteria. The final model selected is significant overall at the 5% level. Using
the bootstrap and Monte Carlo simulation to predict the difference in the number of kidney transplants from
the time series model, I get a 95% confidence interval for the 2013 prediction that is entirely negative
and does not contain zero. This means that the predicted difference between the two centers in 2013 is
significantly different from zero at the 5% level. The negative confidence interval indicates that the
predicted number of kidney transplants at UVA, overall and for non-whites, in 2013 is less than the
predicted number of kidney transplants at MCV and Duke. A classical paired t-test also shows a significant
result at the 5% level when comparing the number of kidney transplants for non-whites at UVA with the other
two centers. This tells me that the UVA center needs to do better at recruiting people overall and at
recruiting people from other ethnicities (non-whites).
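The two comparisons named above, a paired t-test and a bootstrap interval for the mean difference, can be sketched as follows. This is a simplified illustration: it resamples the yearly differences independently (a plain percentile bootstrap), which ignores the serial correlation that the time series models in this report account for, and the counts are hypothetical rather than taken from Appendix A. Function names are chosen for illustration.

```python
import math
import random
import statistics


def paired_t_statistic(a, b):
    """t statistic for H0: mean(a - b) = 0, with len(a) - 1 degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


def bootstrap_mean_ci(diffs, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of the differences."""
    rng = random.Random(seed)
    n = len(diffs)
    # Resample the differences with replacement and record each resample mean.
    means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = means[round(tail * n_boot)]
    upper = means[round((1.0 - tail) * n_boot) - 1]
    return lower, upper


# Hypothetical yearly transplant counts for two centers (NOT the Appendix A data).
center_a = [60, 65, 70, 72, 68, 75, 80, 78]
center_b = [80, 90, 95, 100, 98, 110, 120, 125]
diffs = [x - y for x, y in zip(center_a, center_b)]

t_stat = paired_t_statistic(center_a, center_b)
ci_low, ci_high = bootstrap_mean_ci(diffs)
print(f"t = {t_stat:.2f}")
print(f"95% bootstrap CI for the mean difference: ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that lies entirely below zero corresponds to the conclusion drawn above: the first center performs fewer transplants than the second.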
For liver transplants, two models are considered for the number of liver transplants at UVA: a linear model
and a Poisson model. As in the previous analysis, including AR terms improves the fitted model. The results
from the two models appear to contradict each other regarding the difference in the number of liver
transplants before and after building the Roanoke center. Based on the linear model with time series, the
p-value of the Roanoke variable is 0.014, which is less than 0.05. With this model, I can reject at the 5%
level my null hypothesis that building the Roanoke center does not increase the number of liver transplants.
The bootstrap for the time series model confirms this, since the 95% confidence interval for the Roanoke
center coefficient does not contain zero. Meanwhile, based on the Poisson model with time series, the
Roanoke variable is not significant at the 5% level and, based on the chi-squared test, should be removed
from the model. Taken together, I cannot draw a firm conclusion about the effect of the Roanoke center on
the number of liver transplants at UVA. Therefore, I suggest doing more research on liver transplants. It
may be informative to collect the data separately for UVA Charlottesville and the UVA Roanoke center.
5. References
[1] D. E. Brown and L. Barnes, "Project 3: Design Improvements for the UVA Transplant Center,"
November 2013, assignment in class SYS 4021.
[2] D. E. Brown and L. Barnes, "Project 3: Design Improvements for the UVA Transplant Center template,"
November 2013, assignment in class SYS 4021.
[3] OPTN: Organ Procurement and Transplantation Network. http://optn.transplant.hrsa.gov.
[4] ScholarlyEditions, Issues in Neurology Research and Practice: 2011 Edition, 2012, Scholarly Editions,
Atlanta, Georgia.
[5] University of Virginia School of Law, Hospitals, Clinics, and Outpatient Services, [accessed on
12/05/2013], http://www.law.virginia.edu/html/insider/health_hospitals.htm
Appendix A
Table A 1. Number of kidney transplants from 1988 to 2013
Year  r11donor   UVA   MCV  Duke  r11donor    NW   UVA    NW   MCV    NW  Duke    NW
1988       456    24    41    61       254    46    20     -    24    17    39    22
1989       493    39    47    49       265    42    28    11    20    27    36    13
1990       618    56    57    43       363    53    43    13    22    35    28    15
1991       638    54    34    62       344    57    43    11    14    20    40    22
1992       625    53    38    47       325    62    48     -    14    24    23    24
1993       659    43    43    47       352    83    32    11    15    28    30    17
1994       720    34    37    68       390    73    24    10    13    24    47    21
1995       719    53    29    59       373    72    42    10    15    14    36    23
1996       806    68    38    72       411    81    54    14    11    27    30    42
1997       838    74    53    78       402    94    54    20    18    35    38    40
1998       889    69    53    70       408    91    49    20    21    32    27    43
1999       937    63    52    70       418   101    46    17    17    35    33    37
2000       949    55    64    86       451    93    44    11    25    39    42    44
2001      1005    58    68   111       432   111    45    12    22    46    67    44
2002      1049    63    82    96       456   109    50    13    33    49    53    43
2003      1105    68    99   121       434   126    52    16    34    65    69    52
2004      1166    73    96   119       501   135    52    21    27    69    60    59
2005      1284    94   112   100       549   167    73    21    26    86    51    49
2006      1337   107   107    95       653   203    73    33    38    69    42    53
2007      1268   102   106    84       611   209    75    27    36    70    40    44
2008      1237   103   108   108       615   206    74    29    32    76    43    65
2009      1281    79   137   105       606   251    55    24    43    94    41    64
2010      1255    69   130   104       608   254    45    24    35    95    37    67
2011      1275    76   124   115       600   251    52    24    48    76    43    72
2012      1304    68   136    74       687   219    42    26    41    95    41    33
2013       791    44    87    56       401   155    26    18    24    63    23    33
Year  R11donor   UVA   MCV  Duke
1988       148    21    11          (one center value missing)
1989       182    17    18    22
1990       253    54    16    31
1991       261    51    27    34
1992       270    36    31    33
1993       344    66    37    21
1994       386    62    33    38
1995       372    54    39    37
1996       433    37    66    37
1997       439    24    60    32
1998       498    23    53    48
1999       510    23    60    25
2000       550    37    45    34
2001       550    40    46    36
2002       571    29    46    38
2003       569    28    57    35
2004       643    36    57    41
2005       757    40    54    41
2006       884    58    55    46
2007       833    83    47    30
2008       838    86    54    44
2009       883    87    55    57
2010       819    78    48    51
2011       781    45    46    63
2012       811    68    60    67
2013       523    44    36    38
Appendix B
Figure B 1. Histogram and QQ-Plot on mean difference of kidney transplants at UVA and other centers
Figure B 6. Histogram and QQ plot of number of liver transplants before and after the Roanoke center