
Enhancing Immune Response to HIV Infection Using MPC-Based Treatment Scheduling

Ryan Zurakowski and Andrew R. Teel
Department of Electrical and Computer Engineering
University of California at Santa Barbara
Santa Barbara, CA 93106
ryanz@ece.ucsb.edu, teel@ece.ucsb.edu
Abstract

Using recently developed models of the interaction of the human immune system and the Human Immunodeficiency Virus (HIV), we have developed a model predictive control (MPC) based method for determining optimal treatment interruption schedules. These schedules use interruptions of Highly Active Anti-Retroviral Therapy (HAART) to induce immune control of the virus without the need for continued treatment, as suggested by the models. In this paper, we discuss the medical motivation for this work, introduce the MPC-based method and show simulation results, and discuss future work necessary to implement the method.

1 Introduction

One of the enigmas in the study of Human Immunodeficiency Virus (HIV) infection has been the phenomenon of clinical long-term non-progressors (LTNPs). These patients are measurably infected with the virus, but do not exhibit the long-term degeneration of Helper-T lymphocytes characteristic of most patients. Recent models proposed by Wodarz et al. suggest that the clinical LTNPs are not examples of patients with uniquely responsive immune systems, but rather patients with ordinary immune systems whose successful containment of the virus was made possible by the initial conditions of their infection. The model further predicts that a well-timed pattern of treatment interruptions for an HIV-positive patient undergoing Highly Active Anti-Retroviral Therapy (HAART) can induce the initial conditions necessary for the patient to develop a successful immune response like that of a clinical LTNP [21].

The possibility of inducing such an immune response is exciting for a number of reasons. While HAART is highly effective at suppressing viral load, it does not eradicate the disease, so treatment must be continued for the life of the patient [6]. While the widespread use of HAART in the United States has resulted in a significant drop in deaths due to opportunistic infections, it has at the same time revealed a number of problematic side effects [8]. These range from the relatively benign, such as lipodystrophy, to the life-threatening, such as hepatitis co-infection. Liver failure is now the leading cause of death among HIV-positive patients in the United States, and HAART has been implicated as one of the factors [11]. Also, for HAART to be effective at suppressing HIV infection, exact adherence to a complicated treatment schedule must be maintained, a task which becomes increasingly difficult over the course of the infection. Therefore, the possibility of replacing lifelong adherence to HAART with a short-term schedule inducing drug-free management of the infection is worth pursuing.

The model proposed by Wodarz et al. suggests this possibility, but the treatment schedules yielding this result are complicated and non-intuitive, requiring multiple, precisely timed treatment interruptions. We have developed a Model Predictive Control (MPC) based method for finding these schedules. This method is well-suited to the problem for a number of reasons. It is adaptable, which will allow for various improved models to be integrated as they are developed. It inherits from the MPC framework a certain robustness to disturbances and model inaccuracies, which is important since the model suffers from both. It allows us to fine-tune the treatment using medically intuitive notions of cost. Finally, the long time-scales of the model allow us to overcome the computation time issues which normally plague MPC-based methods.

This paper is organized as follows: in Section 2, we introduce the modified Wodarz-Nowak model of HIV infection. In Section 3, we develop the MPC-based technique for determining treatment schedules. In Section 4, we show and discuss various simulation results. In Section 5, we discuss various open questions and future work necessary for implementation of this method. In Section 6, we summarize and evaluate the promise of this method in HIV treatment.

Research supported in part by AFOSR grant number F49620-00-1-0106 and NSF grant number ECS 9988813.

0-7803-7896-2/03/$17.00 ©2003 IEEE


Proceedings of the American Control Conference, Denver, Colorado, June 4-6, 2003


2 Modified Wodarz-Nowak Model

The model of HIV infection we use in this paper was introduced in [21] and further developed in [18]. This is a five-state nonlinear ODE model focusing on the Cytotoxic Lymphocyte (CTL) response to HIV infection, as mediated by Helper-T cells. The model is

$$
\begin{aligned}
\dot{x} &= \lambda - dx - \beta(1 - \eta u)xy \\
\dot{y} &= \beta(1 - \eta u)xy - ay - p_1 z_1 y - p_2 z_2 y \\
\dot{z}_1 &= c_1 z_1 y - b_1 z_1 \\
\dot{w} &= c_2 xyw - c_2 qyw - b_2 w \\
\dot{z}_2 &= c_2 qyw - h z_2
\end{aligned}
\tag{1}
$$

and the states describe concentrations of: $x$, healthy helper-T lymphocytes; $y$, HIV-infected helper-T lymphocytes; $z_1$, helper-independent CTL; $w$, CTL precursors; and $z_2$, helper-dependent CTL. The variable $u$ represents our control input, the application of HAART, and $\eta$ is the therapy's effectiveness. We consider the region where all states are positive, since only this region has physical meaning. This region is forward invariant.

The model recognizes the dependence of the CTL immune response on the helper-T system, and distinguishes between the helper-T mediated CTL response, which persists even at low antigen levels, and the helper-T independent response, which dies out at low antigen levels. The control input recognizes the effect of HAART, which shuts down viral replication, preventing new infection. In order to avoid increased risk of the emergence of drug-resistant viral strains, we restrict $u(t)$ to be either 0 (no treatment) or 1 (full treatment). For a more complete description of the states and their interaction, see [21] and [18].

The steady-state behavior of this model has many possible bifurcations due to parameter changes, which are discussed in [18]. In this paper, however, we consider only the case where the model has, in the absence of treatment ($u = 0$), two stable steady states: one describing a progressive infection leading to AIDS and one describing the establishment of a successful immune response. This is the case when the quantity $[c_2(\lambda - dq) - b_2\beta]^2 - 4\beta c_2 q d b_2$ is not negative. The steady-state values corresponding to the successful immune response, obtained by setting the right-hand side of (1) to zero with $u = 0$ and $z_1 = 0$ and taking the root with the lower viral load, are

$$
\begin{aligned}
x_s &= \frac{\lambda}{d + \beta y_s} \\
y_s &= \frac{[c_2(\lambda - dq) - b_2\beta] - \sqrt{[c_2(\lambda - dq) - b_2\beta]^2 - 4\beta c_2 q d b_2}}{2\beta c_2 q} \\
z_{1s} &= 0 \\
w_s &= \frac{h z_{2s}}{c_2 q y_s} \\
z_{2s} &= \frac{\beta x_s - a}{p_2}
\end{aligned}
\tag{2}
$$

This model is normalized, i.e., the values of the states have not been adjusted to correspond to measured data. The basic behavior of the model has been observed in experiments on Simian Immunodeficiency Virus (SIV) infection in apes [19], and treatment interruptions in HIV patients have been associated with CTL control of the virus as well [15],[4],[14],[16],[7]. The successful use of the model to induce immune control in SIV and the corresponding anecdotes in human patients lend hope to the possibility of similar success in HIV-infected patients.

3 MPC Treatment Scheduling

Using control techniques to plan treatments for HIV is not a new idea (see [10],[5],[3],[17]). In particular, Kirschner et al. use optimal control to argue for early initiation of treatment, now standard practice. However, the models used in these approaches do not accurately reflect the interactions between the helper-T and CTL systems, and thus do not predict the possibility of induced immune control. Also, these approaches allow for continuous variation of the level of treatment, which is both difficult to achieve, given drug uptake dynamics, and potentially dangerous, given the possibility of drug resistance emerging under partial application of treatment.

Model Predictive Control is a technique in which the control is determined by solving, at each sampling period, a finite-horizon optimal control problem. For a discrete system of the form

$$
X_{k+1} = f(X_k, u_k)
\tag{3}
$$

and a current state $X_k$ (here the vector of the five model states), we find a length $N$ sequence $U = \{u_k, u_{k+1}, \ldots, u_{k+N-1}\}$ which minimizes a cost function of the form

$$
V(X_k, U) = \sum_{i=k}^{k+N-1} l(X_i, u_i) + F(X_{k+N})
\tag{4}
$$

where $l$ is the stage cost and $F$ is the terminal cost. The first element of the resulting optimal control sequence is applied for one sampling period, and at the next sampling period a new optimal control is calculated. Under certain conditions, this procedure can guarantee both stability of a desired set point and robustness to disturbances. A thorough overview of the history of MPC and its various incarnations can be found in [12].

This framework is uniquely well-suited for use in scheduling treatment interruptions for HAART. The finite set of possible control values causes problems for many control design techniques, but it actually helps in MPC by making the optimization easier to solve. The stage and terminal costs flow in a natural way from the medical notions of treatment objectives and systemic cost. Finally, the application of MPC is frequently limited by the computational cost of the optimization; cost functions are restricted to those which lead to optimization problems which can be solved within the sampling time. In this application, however, the sampling times are measured in weeks, so computational time is cheap and the possibilities for cost functions are expanded.

The objective of our treatment scheduling is to drive the patient to a state where the immune system will
suppress the virus without continued treatment. We would like to achieve this while both maximizing the CTL response and minimizing the decrease in helper-T concentration. With this in mind, we choose our stage cost to be

$$
l(X_i, u_i) = \alpha_1 (x_i - x_s)^2 + \alpha_2 (w_i - w_s)^2 + \alpha_3 |u_i|
\tag{5}
$$

where $\alpha_j$ are positive weighting constants and $x_s$, $w_s$, $X_s$ are the steady-state values of their respective states at the desired equilibrium. The work in [9] shows that a stage cost from which the desired steady state satisfies an observability condition (roughly, $l_k \to 0 \Rightarrow |X_k - X_s| \to 0$) is sufficient to ensure asymptotic stability, given a sufficiently long horizon $N$. However, we have great flexibility with regard to our choice of cost functions. If we add a terminal cost $F$ which is a local control Lyapunov function of the desired equilibrium, [12] cites work asserting that we can use any positive definite stage cost $l$ for which at each $X_k$ there exists at least one $u_k$ satisfying the inequality $F(X_{k+1}) - F(X_k) \le -l(X_k, u_k)$, and closed-loop asymptotic stability is guaranteed, given a long enough horizon $N$. With this in mind, we will also implement a terminal cost of the form

$$
F(X_{k+N}) = \alpha_4 (X_{k+N} - X_s)^T P (X_{k+N} - X_s)
\tag{6}
$$

where $F$ is a local quadratic Lyapunov function at the desired equilibrium. The great flexibility we have in choosing stage and terminal costs means that we can easily adapt them to adjust performance or incorporate different objectives.

We are constrained to control values of 0 and 1, and can neither take measurements nor change our control at intervals shorter than one week. We therefore choose a sampling time of one week. Consequently, we do not worry about creating an explicit discretization of our differential equation; we simply use a numerical simulator to approximate our discretization. Also, a finite horizon and a finite control space mean that we have, for each horizon, a finite number of possible control sequences. Since time is not an issue, we solve our optimizations by exhaustively searching this space.

4 Simulation Results

We implemented the algorithm described in Section 3 in MATLAB. In this section, we show simulation results which illustrate the algorithm's performance over a variety of conditions. For simplicity and readability, we plot only the healthy helper-T cells, viral load, CTL memory and control states ($x$, $y$, $w$, and $u$). The control is plotted as a shaded area. In Section 4.1, we show that the method stabilizes the desired equilibrium, inducing immune control from various initial conditions. We show in Section 4.2 that the method continues to work in spite of state measurement error and modeling error. Finally, in Section 4.3, we show ways in which we can fine-tune the performance by varying the cost function.

4.1 Stability

Our treatment scheduling method successfully steered the system to the desired equilibrium for every initial condition we tried in simulation. In Figure 1, an even weighting of the elements of our stage cost yielded a schedule which is neither the shortest possible, minimizing the drug usage, nor the fastest to converge, maximizing the immune responsiveness, but rather a compromise of these goals. The initial condition was chosen because it represents the condition of a patient who has been on HAART for a long time. Figure 2 shows the performance of the method on a different initial condition, this one chosen to represent a patient whose immune system has been compromised by the infection. As in Figure 1, the schedule chosen is a good compromise, which leads to stability.
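The scheduling loop of Section 3 can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used for the results in this paper, and several choices in it are assumptions of the sketch: the model time unit is taken to be days (so one sampling period is 7 time units), the numerical simulator is a fixed-step classical Runge-Kutta scheme, the desired equilibrium is computed by setting the right-hand side of (1) to zero with $u = 0$, and all function and variable names are illustrative. The parameter values and the stage-cost weights are those listed for Figure 1, with no terminal cost.

```python
from math import sqrt

# Parameter values as listed for Figure 1 (variable names are ours).
LAM, D, BETA, ETA, A = 1.0, 0.1, 1.0, 0.9799, 0.2
P1, P2, C1, C2 = 1.0, 2.0, 0.03, 0.06
B1, B2, Q, H = 0.1, 0.01, 0.5, 0.1
ALPHA = (1.0, 1.0, 1.0)   # stage-cost weights alpha_1..alpha_3; alpha_4 = 0
WEEK, DT = 7.0, 0.02      # ASSUMPTION: time unit is days; RK4 step chosen by us


def rhs(s, u):
    """Right-hand side of model (1) for state s = (x, y, z1, w, z2), input u."""
    x, y, z1, w, z2 = s
    infection = BETA * (1.0 - ETA * u) * x * y
    return (LAM - D * x - infection,
            infection - A * y - P1 * z1 * y - P2 * z2 * y,
            C1 * z1 * y - B1 * z1,
            C2 * x * y * w - C2 * Q * y * w - B2 * w,
            C2 * Q * y * w - H * z2)


def axpy(s, k, c):
    """Return s + c * k componentwise."""
    return tuple(si + c * ki for si, ki in zip(s, k))


def step_week(s, u):
    """Hold u constant for one sampling period and integrate with classical RK4."""
    for _ in range(round(WEEK / DT)):
        k1 = rhs(s, u)
        k2 = rhs(axpy(s, k1, 0.5 * DT), u)
        k3 = rhs(axpy(s, k2, 0.5 * DT), u)
        k4 = rhs(axpy(s, k3, DT), u)
        s = tuple(si + DT / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s


def immune_control_equilibrium():
    """Low-virus steady state of (1) with u = 0 and z1 = 0 (our derivation)."""
    k = C2 * (LAM - D * Q) - B2 * BETA
    y = (k - sqrt(k * k - 4.0 * BETA * C2 * Q * D * B2)) / (2.0 * BETA * C2 * Q)
    x = LAM / (D + BETA * y)
    z2 = (BETA * x - A) / P2
    return (x, y, 0.0, H * z2 / (C2 * Q * y), z2)


X_S = immune_control_equilibrium()


def stage_cost(s, u):
    """Stage cost (5): helper-T and CTL-precursor deviation plus drug use."""
    return (ALPHA[0] * (s[0] - X_S[0]) ** 2
            + ALPHA[1] * (s[3] - X_S[3]) ** 2
            + ALPHA[2] * abs(u))


def best_sequence(s, n):
    """Minimize the summed stage cost over all 2**n binary input sequences.

    Depth-first enumeration; sequences sharing a prefix share its simulation.
    Returns (cost, sequence).
    """
    if n == 0:
        return 0.0, ()
    best = None
    for u in (0, 1):
        tail_cost, tail = best_sequence(step_week(s, u), n - 1)
        cost = stage_cost(s, u) + tail_cost
        if best is None or cost < best[0]:
            best = (cost, (u,) + tail)
    return best


if __name__ == "__main__":
    state = (10.0, 0.01, 0.01, 0.01, 0.01)   # Figure 1 initial condition
    for week in range(2):
        u = best_sequence(state, 6)[1][0]     # apply only the first input
        state = step_week(state, u)
        print(week, u, state)
```

Because the input takes only two values, the search tree has $2^N$ leaves, and with $N = 6$ and a one-week sampling period the full enumeration is cheap. The step size and integrator are choices of this sketch and may need adjusting for other parameter values.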

Figure 1: Stabilization. The values of the parameters are $N = 6$, $\alpha_1 = 1$, $\alpha_2 = 1$, $\alpha_3 = 1$, $\alpha_4 = 0$, $\lambda = 1$, $d = 0.1$, $\beta = 1$, $\eta = 0.9799$, $a = 0.2$, $p_1 = 1$, $p_2 = 2$, $c_1 = 0.03$, $c_2 = 0.06$, $b_1 = 0.1$, $b_2 = 0.01$, $q = 0.5$, $h = 0.1$. Initial condition is $x = 10$, $y = 0.01$, $z_1 = 0.01$, $w = 0.01$, $z_2 = 0.01$.

Figure 2: Another Initial Condition. The values of the parameters are as in Figure 1. Initial condition is $x = 0.291$, $y = 3.333$, $z_1 = 0.913$, $w = 0.001$, $z_2 = 0.001$.
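As a rough check on how these initial conditions relate to the model, the short Python sketch below evaluates the bistability quantity from Section 2 and the helper-T and infected-cell levels of the untreated progressive-infection steady state for the Figure 1 parameter values. The steady-state expressions are our own derivation from the model with $u = 0$ and $w = z_2 = 0$, and the function names are illustrative.

```python
# Parameter values as listed for Figure 1 (variable names are ours).
lam, d, beta = 1.0, 0.1, 1.0
c1, c2, b1, b2, q = 0.03, 0.06, 0.1, 0.01, 0.5


def bistability_quantity():
    """[c2 (lam - d q) - b2 beta]^2 - 4 beta c2 q d b2 from Section 2."""
    k = c2 * (lam - d * q) - b2 * beta
    return k * k - 4.0 * beta * c2 * q * d * b2


def progressive_infection_xy():
    """x and y at the untreated steady state with w = z2 = 0 (our derivation)."""
    y = b1 / c1                 # from dz1/dt = 0 with z1 > 0
    x = lam / (d + beta * y)    # from dx/dt = 0 with u = 0
    return x, y


if __name__ == "__main__":
    print("bistability quantity:", bistability_quantity())
    print("progressive infection (x, y):", progressive_infection_xy())
    print("infection-free helper-T level lam/d:", lam / d)
```

Under these assumptions the bistability quantity is positive, the Figure 1 value $x = 10$ equals the infection-free level $\lambda/d$, and the Figure 2 values $x = 0.291$, $y = 3.333$ agree to three decimals with the progressive-infection steady state.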



4.2 Robustness

The main benefits of using a closed-loop control method are noise rejection and robustness. Our MPC-based method grants us a certain degree of robustness to measurement and modeling errors, which we demonstrate through simulation. In Figure 3, we introduced into the state measurement a random noise signal, which would add or subtract from each state as much as 10% of the state value. The only change to the resulting schedule was that the HAART was discontinued a week earlier; stability was preserved.

In Figure 4, we introduced a modeling error. We changed the values of certain key parameters in the system, but did not change the corresponding parameters in the treatment scheduling program. Consequently, the expected behavior of the system was slightly different from the actual behavior. The resulting schedule was considerably more complicated than it needed to be, but stabilized the system nonetheless.
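One way to realize the measurement noise described above is sketched below in Python. The text specifies only that the noise adds or subtracts up to 10% of each state value; the uniform distribution, the independent draw per state, and the function name are assumptions of this sketch.

```python
import random


def noisy_measurement(state, rng, level=0.10):
    """Scale each state by a factor drawn uniformly from [1 - level, 1 + level].

    ASSUMPTION: uniform, independent noise per state; the noise distribution
    is not specified in the text.
    """
    return tuple(v * (1.0 + rng.uniform(-level, level)) for v in state)


if __name__ == "__main__":
    rng = random.Random(0)                        # fixed seed for repeatability
    true_state = (10.0, 0.01, 0.01, 0.01, 0.01)   # Figure 1 initial condition
    print(noisy_measurement(true_state, rng))
```

In a closed-loop simulation the scheduler would be given the noisy measurement at each sampling instant, while the simulated patient continues to evolve from the true state.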
Figure 3: State Measurement Error. The values of the parameters are as in Figure 1. Initial condition is as in Figure 1. Random measurement noise has been added to each state, up to 10% of the state value.

Figure 4: Modeling Error. Error has been introduced into the model parameters. The values of the parameters that have been changed are $\beta = 0.8$, $\eta = 0.7799$, $a = 0.25$, $c_2 = 0.05$. Initial condition is as in Figure 1.

4.3 Varying the Cost Function

One of the most useful things about MPC-based methods is the ability to fine-tune the performance of the system by adjusting the cost functions. In Figures 5 and 6, we have changed the cost functions in order to demonstrate the flexibility of our scheduling technique. In Figure 5, we added a terminal cost to the method, using a local control Lyapunov function. This seemed to have no effect on the resulting schedules, suggesting that the goals imposed by the terminal cost agree with those already imposed by the stage cost. However, adding a control Lyapunov function as a terminal cost relaxes the requirements placed on our stage cost, and we have included it as an example of the flexibility of the cost function's structure. In Figure 6, we adjusted the weightings of the elements of the stage cost. By emphasizing the term associated with helper-T concentration while de-emphasizing the terms associated with drug usage and CTL memory growth, we got a schedule which converged to the desired equilibrium more slowly and used more drugs than in Figure 1, but did so while maintaining a higher average helper-T cell concentration. This is just one example of the many ways in which we can adjust the cost functions to achieve more desirable results.

Figure 5: CLF Terminal Cost. The values of the parameters are as in Figure 1, except that $\alpha_4 = 10$. Initial condition is as in Figure 1.
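The effect of re-weighting the stage cost can be seen in a small Python sketch. It evaluates the stage cost of Section 3 under the even weights of Figure 1 and under the weights we read from the Figure 6 caption ($\alpha_1 = 100$, $\alpha_2 = 0.5$, $\alpha_3 = 0.1$). The steady-state values and the test point used here are hypothetical placeholders chosen for easy arithmetic, not values from the simulations.

```python
def stage_cost(x, w, u, x_s, w_s, alphas):
    """alpha_1 (x - x_s)^2 + alpha_2 (w - w_s)^2 + alpha_3 |u|."""
    a1, a2, a3 = alphas
    return a1 * (x - x_s) ** 2 + a2 * (w - w_s) ** 2 + a3 * abs(u)


EVEN = (1.0, 1.0, 1.0)          # Figure 1 weights
HELPER_T = (100.0, 0.5, 0.1)    # Figure 6 weights, as read from its caption

if __name__ == "__main__":
    # HYPOTHETICAL point: one unit below target in both x and w, on treatment.
    point = dict(x=9.0, w=1.0, u=1, x_s=10.0, w_s=2.0)
    print(stage_cost(alphas=EVEN, **point))
    print(stage_cost(alphas=HELPER_T, **point))
```

With even weights the three terms contribute equally at this point, whereas with the Figure 6 weights the helper-T deviation dominates and a week of treatment costs far less than a unit of helper-T deficit, which is consistent with a schedule that uses more drug to keep helper-T concentration high.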
Figure 6: Changed Cost Function. The values of the parameters are as in Figure 1 except that $\alpha_1 = 100$, $\alpha_2 = 0.5$, $\alpha_3 = 0.1$. Initial condition is as in Figure 1.

5 Future Work

The potential for MPC-based treatment scheduling for HIV infection is exciting, but a great deal more work is necessary before it can be implemented. Robustness of the method to state and parameter disturbances still needs to be explored. We need to determine whether current measurement techniques can give us sufficiently accurate state measurements, and, if not, observer-based methods need to be explored. The model will need to be revised to better match measured patient data. The issues of drug resistance and mutation need to be considered. Finally, the cost functions and optimization objectives need to be refined to better achieve the medical aims of the treatment.

We have thus far assumed that we have an error-free way to observe every state at each sampling instant. In reality, this is not the case. Most blood-sample measurements come with a significant margin of error. Model Predictive Control ensures a certain amount of robustness to state disturbances, but it remains to be seen whether the robustness will be sufficient in this case. Furthermore, the existing measurements do not line up perfectly with the model. It is possible to measure the concentration of healthy helper-T cells, for instance, but the state $x$ in the model actually represents a certain subset of the helper-T cells (see [20],[18],[19]), and the relationship between this subset and the measurable set is not clear. This can be resolved through the development of new measurement techniques, or by creating an observer to deduce the unmeasurable states from the measured states.

We have also assumed that the model dynamics are entirely accurate. This too is not the case; we expect a certain amount of parameter variation from patient to patient, and while this might be resolved through the use of an identification scheme, there are also unmodeled dynamics, which will show up as parameter uncertainty or drift. The amount of variation and drift we can expect is unknown; in order to discover this, the model parameters need to be matched to patient data. We must determine whether our method can give us sufficient robustness to these problems.

Several variations on the model used in our method have been considered in [21],[18],[19],[1], and [2]. We must determine whether the additional features modeled in these papers are useful or important in our application. In [18] and [1], the authors model the difference between active and resting T cells. This is one variation likely to be useful in our application, since the sum of these two states more closely approximates what we can currently measure. Also, in [18], a state representing an alternate, resistant strain of the virus is introduced. It would be worth exploring under what conditions a treatment schedule could restore helper-T functionality and induce an effective CTL immune response in the presence of a HAART-resistant strain before the presence of the HAART selects the resistant strain to dominance and control is lost.

The possible correlation between treatment interruptions and the emergence of drug resistance must also be considered. Although studies have been done showing no correlation between short treatment interruptions and the emergence of drug-resistant strains [7],[13], there is still concern that the transient increases in viral load brought on by the treatment interruptions will increase the risk of drug-resistant mutations occurring. If this turns out to be a valid concern, it would be worth investigating a variation on the cost functions that penalizes treatment interruptions or viral rebound levels in such a way as to minimize the risk. The possibility of escape mutants, which are resistant to an already established immune response, also needs to be considered.

The criteria with which we designed our MPC method are undoubtedly naive from a medical standpoint. We are neither doctors nor biologists. It is our hope that the results in this paper will encourage researchers in those fields to refine our criteria, pointing out what we've overlooked, leading to more useful cost functions and a more realistic control design.

6 Conclusions

Enhancing the immune response to HIV through scheduled interruptions of HAART, leading to control of the viral load without further treatment, is an exciting prospect. However, the complexities involved in finding an effective interruption schedule demand a control technique. Model Predictive Control is a uniquely appropriate framework for this problem. It provides needed stability and robustness, its structure accommodates the problem's specific implementation requirements, and its flexibility allows for an intuitive integration of the problem's objectives.

In this paper, we have implemented such an MPC-based treatment scheduling technique. We have simulated its performance while varying initial conditions, model parameters and cost functions. The results demonstrated the method's effectiveness as well as its flexibility. While the results so far are very promising for the future of MPC-based treatment scheduling, we acknowledge that a great deal of work remains to be done. We
have sketched a few avenues of research we expect to be worthwhile. Further collaboration between disciplines will be necessary to bring this work to implementation; it is our hope that this paper will work to that end.

References

[1] H. K. Altes, D. Wodarz, and V. Jansen. The dual role of CD4 T helper cells in the infection dynamics of HIV and their importance for vaccination. Journal of Theoretical Biology, 214, 2002.

[2] R. Arnaout, M. Nowak, and D. Wodarz. HIV-1 dynamics revisited: biphasic decay by cytotoxic T lymphocyte killing? Proceedings of the Royal Society of London in Biology, 267, 2000.

[3] M. Brandt and G. Chen. Feedback control of a biodynamical model of HIV-1. IEEE Transactions on Biomedical Engineering, 48(7), 2001.

[4] R. Davey, N. Bhat, C. Yoder, T.-W. Chun, J. Metcalf, R. Dewar, V. Natarajan, R. Lempicki, J. Adelsberger, K. Miller, J. Kovacs, M. Polis, R. Walker, J. Falloon, H. Masur, D. Gee, M. Baseler, D. Dimitrov, A. Fauci, and H. C. Lane. HIV-1 and T cell dynamics after interruption of highly active antiretroviral therapy (HAART) in patients with a history of sustained viral suppression. Proceedings of the National Academy of Sciences, 96(26), 1999.

[5] F. M. C. de Souza. Modeling the dynamics of HIV-1 and CD4 and CD8 lymphocytes. IEEE Engineering in Medicine and Biology Magazine, 18(1), 1999.

[6] D. Finzi, J. Blankson, J. Siliciano, J. Margolick, K. Chadwick, T. Pierson, K. Smith, J. Lisziewicz, F. Lori, C. Flexner, T. Quinn, R. Chaisson, E. Rosenberg, B. Walker, S. Gange, J. Gallant, and R. Siliciano. Latent infection of CD4+ T cells provides a mechanism for lifelong persistence of HIV-1, even in patients on effective combination therapy. Nature Medicine, 5(5), 1999.

[7] F. Garcia, M. Plana, G. Ortiz, S. Bonhoeffer, A. Soriano, C. Vidal, A. Cruceta, M. Arnedo, C. Gil, G. Pantaleo, T. Pumarola, T. Gallart, D. Nixon, J. Miro, and J. Gatell. The virological and immunological consequences of structured treatment interruptions in chronic HIV-1 infection. AIDS, 15(9), 2001.

[8] T. Gegeny. Simply stated: Surviving with HIV in the HAART era: emerging challenges. Research Initiative / Treatment Action!, 6(3), 2000.

[9] G. Grimm, M. J. Messina, A. R. Teel, and S. E. Tuna. Model predictive control when a local control Lyapunov function is not available. In Proceedings of the American Control Conference, June 2003 (Submitted).

[10] D. Kirschner, S. Lenhart, and S. Serbin. Optimal control of the chemotherapy of HIV. Journal of Mathematical Biology, 35, 1997.

[11] C. Manegold, C. Hannoun, A. Wywiol, M. Dietrich, S. Polywka, C. B. Chiwakata, and S. Gunther. Reactivation of hepatitis B virus replication accompanied by acute hepatitis in patients receiving highly active antiretroviral therapy. Clinical Infectious Diseases, 32, 2001.

[12] D. Mayne, J. B. Rawlings, C. V. Rao, and P. Scokaert. Constrained model predictive control: Stability and optimality. Automatica, 36, 2000.

[13] A. Neumann, R. Tubiana, V. Calvez, C. Robert, T.-S. Li, H. Agut, B. Autran, and C. Katlama. HIV-1 rebound during interruption of highly active antiretroviral therapy has no deleterious effect on reinitiated treatment. AIDS, 13, 1999.

[14] G. Ortiz, D. Nixon, A. Trkola, J. Binley, X. Jin, S. Bonhoeffer, P. Kuebler, S. Donahoe, M.-A. Demoitie, W. Kakimoto, T. Ketas, B. Clas, J. Heymann, L. Zhang, Y. Cao, A. Hurley, J. Moore, D. Ho, and M. Markowitz. HIV-1-specific immune responses in subjects who temporarily contain virus replication after discontinuation of highly active antiretroviral therapy. Journal of Clinical Immunology, 104, 1999.

[15] E. Papasavvas, G. Ortiz, R. Gross, J. Sun, E. C. Moore, J. Heymann, M. Moonis, J. K. Sandberg, L. A. Drohan, B. Gallagher, J. Shull, D. F. Nixon, J. Kostman, and L. Montaner. Enhancement of human immunodeficiency virus type 1-specific CD4 and CD8 T cell responses in chronically infected patients after long-term viral suppression. Journal of Infectious Diseases, 182, 2000.

[16] E. Rosenberg, M. Altfeld, S. Poon, M. Phillips, B. Wilkes, R. Eldridge, G. Robbins, R. D'Aquila, P. Goulder, and B. Walker. Immune control of HIV-1 after early treatment of acute infection. Nature, 407, 2000.

[17] L. Wein, S. A. Zenios, and M. A. Nowak. Dynamic multidrug therapies for HIV: A control theoretic approach. Journal of Theoretical Biology, 185, 1996.

[18] D. Wodarz. Helper-dependent vs. helper-independent CTL responses in HIV infection. Journal of Theoretical Biology, 213, 2001.

[19] D. Wodarz, R. Arnaout, M. Nowak, and J. Lifson. Transient antiretroviral treatment during acute simian immunodeficiency virus infection facilitates long-term control of the virus. Philosophical Transactions of the Royal Society of London in Biology, 355, 2000.

[20] D. Wodarz and V. Jansen. The role of T cell help for anti-viral CTL responses. Journal of Theoretical Biology, 211, 2002.

[21] D. Wodarz and M. A. Nowak. Specific therapy regimes could lead to long-term immunological control of HIV. Proceedings of the National Academy of Sciences, 96(25), 1999.
