Full Chapter Foundations of Average Cost Nonhomogeneous Controlled Markov Chains Xi Ren Cao PDF

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 53

Foundations of Average-Cost

Nonhomogeneous Controlled Markov


Chains Xi-Ren Cao
Visit to download the full and correct content document:
https://textbookfull.com/product/foundations-of-average-cost-nonhomogeneous-contr
olled-markov-chains-xi-ren-cao/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

Understanding Markov Chains Examples and Applications


2nd Edition Nicolas Privault

https://textbookfull.com/product/understanding-markov-chains-
examples-and-applications-2nd-edition-nicolas-privault/

Markov Chains and Mixing Times Second Edition David A.


Levin

https://textbookfull.com/product/markov-chains-and-mixing-times-
second-edition-david-a-levin/

Proceedings of ELM2019 Jiuwen Cao

https://textbookfull.com/product/proceedings-of-elm2019-jiuwen-
cao/

Proceedings of ELM 2018 Jiuwen Cao

https://textbookfull.com/product/proceedings-of-elm-2018-jiuwen-
cao/
Developing sustainable supply chains to drive value
Volume 1 Management issues insights concepts and tools
Foundations Melnyk

https://textbookfull.com/product/developing-sustainable-supply-
chains-to-drive-value-volume-1-management-issues-insights-
concepts-and-tools-foundations-melnyk/

Study of Bacteriorhodopsin in a Controlled Lipid


Environment Vivien Yeh

https://textbookfull.com/product/study-of-bacteriorhodopsin-in-a-
controlled-lipid-environment-vivien-yeh/

Controlled Release of Pesticides for Sustainable


Agriculture Rakhimol K. R.

https://textbookfull.com/product/controlled-release-of-
pesticides-for-sustainable-agriculture-rakhimol-k-r/

A Reference Grammar of Chinese 1st Edition Chu-Ren


Huang (Ed.)

https://textbookfull.com/product/a-reference-grammar-of-
chinese-1st-edition-chu-ren-huang-ed/

Knot Your Average Beta FatedVerse Book 2 1st Edition


Tana Rose

https://textbookfull.com/product/knot-your-average-beta-
fatedverse-book-2-1st-edition-tana-rose/
SPRINGER BRIEFS IN ELEC TRIC AL AND COMPUTER
ENGINEERING  CONTROL, AUTOMATION AND ROBOTICS

Xi-Ren Cao

Foundations
of Average-Cost
Nonhomogeneous
Controlled Markov
Chains
SpringerBriefs in Electrical and Computer
Engineering

Control, Automation and Robotics

Series Editors
Tamer Başar, Coordinated Science Laboratory, University of Illinois at
Urbana-Champaign, Urbana, IL, USA
Miroslav Krstic, La Jolla, CA, USA
SpringerBriefs in Control, Automation and Robotics presents concise sum-
maries of theoretical research and practical applications. Featuring compact,
authored volumes of 50 to 125 pages, the series covers a range of research, report
and instructional content. Typical topics might include:
• a timely report of state-of-the art analytical techniques;
• a bridge between new research results published in journal articles and a con-
textual literature review;
• a novel development in control theory or state-of-the-art development in
robotics;
• an in-depth case study or application example;
• a presentation of core concepts that students must understand in order to make
independent contributions; or
• a summation/expansion of material presented at a recent workshop, symposium
or keynote address.
SpringerBriefs in Control, Automation and Robotics allows authors to present
their ideas and readers to absorb them with minimal time investment, and are
published as part of Springer’s e-Book collection, with millions of users worldwide.
In addition, Briefs are available for individual print and electronic purchase.
Springer Briefs in a nutshell
• 50 – 125 published pages, including all tables, figures, and references;
• softcover binding;
• publication within 9–12 weeks after acceptance of complete manuscript;
• copyright is retained by author;
• authored titles only – no contributed titles; and
• versions in print, eBook, and MyCopy.
Indexed by Engineering Index.
Publishing Ethics: Researchers should conduct their research from research
proposal to publication in line with best practices and codes of conduct of relevant
professional bodies and/or national and international regulatory bodies. For more
details on individual ethics matters please see: https://www.springer.com/gp/
authors-editors/journal-author/journal-author-helpdesk/publishing-ethics/14214

More information about this subseries at http://www.springer.com/series/10198


Xi-Ren Cao

Foundations of Average-Cost
Nonhomogeneous Controlled
Markov Chains

123
Xi-Ren Cao
Department of Automation
Shanghai Jiao Tong University
Shanghai, China
Department of Electronic and Computer Engineering
and Institute of Advanced Study
The Hong Kong University of Science and Technology
Hong Kong, China

ISSN 2191-8112 ISSN 2191-8120 (electronic)


SpringerBriefs in Electrical and Computer Engineering
ISSN 2192-6786 ISSN 2192-6794 (electronic)
SpringerBriefs in Control, Automation and Robotics
ISBN 978-3-030-56677-7 ISBN 978-3-030-56678-4 (eBook)
https://doi.org/10.1007/978-3-030-56678-4
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

Many real-world systems are time-dependent, and therefore, their state evolution
should be modeled as time-nonhomogeneous processes. However, performance
optimization of time-nonhomogeneous Markov chains (TNHMCs) with the
long-run average criterion does not seem well presented in the existing books in the
literature.
In a TNHMC, the state spaces, transition probabilities, and reward functions
depend on time; and notions such as stationarity, ergodicity, periodicity, connec-
tivity, recurrent and transient states, no longer apply. These properties are crucial to
the analysis of time-homogeneous Markov chains (THMCs). Besides, the long-run
average suffers from the so-called under-selectivity, which roughly means that it
does not depend on the rewards received in any finite period. Dynamic program-
ming may not be very suitable to address the under-selectivity problem because it
treats every time instant equally. Therefore, new notions and a different opti-
mization approach are needed.
In this book, we provide comprehensive treatment for the optimization of
TNHMCs. We first introduce the notion of confluencity of states, which refers to
the property that two independent sample paths of a Markov chain starting from two
different initial states will eventually meet together in a finite period. We show that
confluencity captures the essentials of the performance optimization of both
TNHMCs and THMCs. With confluencity, the states in a TNHMC can be classified
into different classes of confluent states and branching states, and single-class and
multi-class optimization can be implemented.
We apply the relative optimization approach to the problems. It is simply based
on comparing the performance measures of a system under any two policies, and so
is also called a sensitivity-based approach. The development of this approach
started from the early 80s, with the perturbation analysis of discrete-event dynamic
systems. The approach is motivated by the study of information technology sys-
tems, which usually have large sizes and unknown system structures or parameters.
Because of the complexity of the systems, the analysis is usually based on sample
paths, so the approach is closely related to reinforcement learning. The approach
has been successfully applied to the performance optimization of THMCs.

v
vi Preface

With confluencity and relative optimization, the problems related to the opti-
mization of TNHMCs can be solved in an intuitive way. State classification is
carried out with confluencity; the optimization conditions for the long-run average
of single-class and multi-class TNHMCs, bias, Nth-bias, and Blackwell optimality
are derived by the relative optimization approach; and the under-selectivity is
reflected in the optimality conditions. The optimization of THMCs becomes a
special case.
In summary, the book distinguishes itself by a new notion, the confluencity, a
new approach, the relative optimization, and the new results related to TNHMCs
developed based on them. The book shows that confluencity captures the essentials
of the performance optimization for both THMCs and TNHMCs, and that relative
optimization fits TNHMCs better than dynamic programming.
The book is a continuation of two of my previous books, Stochastic Learning
and Optimization—A Sensitivity-Based Approach, Springer, 2007, and Relative
Optimization of Continuous-Time and Continuous-State Stochastic Systems,
Springer, 2020. The readers may refer to these two books for more about the
relative optimization approach and its applications.
Most parts of my works in this book were done when I was with the Institute of
Advanced Study, The Hong Kong University of Science and Technology, and the
Department of Finance and the Department of Automation, Shanghai Jiao Tong
University. I would like to express my sincere appreciation to these two excellent
institutes for the nice research environment and financial support. Lastly, I would
like to express my gratitude to my wife, Mindy, my son and daughter in law, Ron
and Faye, and my two lovely grandsons, Ryan and Zack, for the happiness that they
have brought to me, which has made my life colorful.

Hong Kong Xi-Ren Cao


June 2020 xrcao@sjtu.edu.cn
eecao@ust.hk
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 1
1.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .... 4
1.1.1 Discrete-Time Discrete-State Time-Nonhomogeneous
Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.2 Performance Optimization . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Main Concepts and Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Review of Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Confluencity and State Classification . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 The Confluencity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 Confluencity and Weak Ergodicty . . . . . . . . . . . . . . . . . . . 13
2.1.2 Coefficient of Confluencity . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 State Classification and Decomposition . . . . . . . . . . . . . . . . . . . . 18
2.2.1 Connectivity of States . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Confluent States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.3 Branching States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2.4 State Classification and Decomposition Theorem . . . . . . . . 25
3 Optimization of Average Rewards and Bias: Single Class . . . . . . . . 29
3.1 Preliminary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Performance Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2.1 Relative Performance Potentials . . . . . . . . . . . . . . . . . . . . 32
3.2.2 Performance Potential, Poisson Equation,
and Dynkin’s Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 Optimization of Average Rewards . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3.1 The Performance Difference Formula . . . . . . . . . . . . . . . . 37
3.3.2 Optimality Condition for Average Rewards . . . . . . . . . . . . 38
3.4 Bias Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.2 Bias Optimality Conditions . . . . . . . . . . . . . . . . . . . . . . . 48

vii
viii Contents

3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4 Optimization of Average Rewards: Multi-Chains . . . . . . . . . . . . . . . 59
4.1 Performance Potentials of Multi-class TNHMCs . . . . . . . . . . . . . . 59
4.2 Optimization of Average Rewards: Multi-class . . . . . . . . . . . . . . . 64
4.2.1 The Performance Difference Formula . . . . . . . . . . . . . . . . 64
4.2.2 Optimality Condition: Multi-Class . . . . . . . . . . . . . . . . . . 65
4.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 Appendix. The “lim inf” Performance and Asynchronicity . . . . . . 71
5 The Nth-Bias and Blackwell Optimality . . . . . . . . . . . . . . . . . . . . . . 79
5.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2 The Nth-Bias Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.2.1 The Nth Bias and Main Properties . . . . . . . . . . . . . . . . . . 82
5.2.2 The Nth-Bias Difference Formula . . . . . . . . . . . . . . . . . . . 86
5.2.3 The Nth-Bias Optimality Conditions I . . . . . . . . . . . . . . . . 89
5.2.4 The Nth-Bias Optimality Conditions II . . . . . . . . . . . . . . . 93
5.3 Laurent Series, Blackwell Optimality, and Sensitivity
Discount Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.3.1 The Laurent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.3.2 Blackwell Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.3.3 Sensitivity Discount Optimality . . . . . . . . . . . . . . . . . . . . 104
5.3.4 On Geometrical Ergodicity . . . . . . . . . . . . . . . . . . . . . . . . 106
5.4 Discussions and Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Series Editors’ Biographies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Chapter 1
Introduction

This book deals with the performance optimization problem related to the discrete-
time discrete-state time-nonhomogeneous Markov chains (TNHMCs), in which the
state spaces, transition probabilities, and reward functions, depend on time. There are
many excellent books in the literature on performance optimization of the discrete-
time discrete-state time-homogeneous Markov chains (THMCs) (see, e.g., Altman
(1999), Bertsekas (2007), Cao (2007), Feinberg and Shwartz (2002), Hernández-
Lerma and Lasserre (1996, 1999), Puterman (1994)). This book, however, provides
comprehensive treatment for the optimization of TNHMCs, in particular, with the
long-run average performance.
The analysis of TNHMCs encounters a few major difficulties: (1) Notions such as
stationarity, ergodicity, periodicity, connectivity, recurrent and transient states, which
are crucial to the analysis of THMCs, no longer apply. (2) The long-run average-
reward criterion is under-selective; e.g., the long-run average does not depend on the
decisions in any finite period, and thus dynamic programming, which treats every
time instant separately and equally, is not very suitable for the problem.
Therefore, we need to formulate new notions and apply a different approach,
other than dynamic programming, to the problem. In this book, we introduce the
“confluencity” of states and show that it is the essential property for performance
optimization. Confluencity refers to the property that two independent sample paths
of a Markov chain starting from any two different initial states will eventually meet
together in a finite period. This notion enables us to develop properties that may
replace the ergodicity, stationarity, connectivity, and aperiodicity, etc., used in the
optimization of THMCs. With confluencity, we may classify the states into different
classes, and develop optimality conditions for the long-run average performance for
both uni-chain and multi-class TNHMCs.
Furthermore, we apply the relative optimization approach, also called the direct-
comparison-based or the sensitivity-based approach, instead of dynamic program-
ming, to the problem. The approach is based on a formula that gives the difference
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 1
X.-R. Cao, Foundations of Average-Cost Nonhomogeneous Controlled
Markov Chains, SpringerBriefs in Control, Automation and Robotics,
https://doi.org/10.1007/978-3-030-56678-4_1
2 1 Introduction

between the performance measures of any two policies. The formula contains global
information about the performance difference in the entire time horizon, and the
approach provides a new view to performance optimization and also may solve the
issue of under-selectivity.
Although the principles discussed in the book apply to other performance criteria,
such as the finite period and discounted total reward, as well, we will focus only on
the average-cost related criteria, including the long-run average, bias, N th-bias, and
Blackwell optimality. In economics, a total discounted reward is often used as the
performance criterion. However, in computer and communication networks, because
of the rapid transition and quick decision-making, cost-discounting does not make
sense; for example, there is no point in discounting the waiting time of a packet. As
pointed out by Puterman, “Consequently, the average-reward criterion occupies a
cornerstone of queueing control theory especially when applied to control computer
systems and communication networks.” (Puterman 1994).
The principles presented in this book apply to THMCs naturally. Therefore, the
state confluencity and the relative optimization indeed form a foundation of the
performance optimization of Markov systems, TNHMCs and THMCs.
There are many excellent results on the optimization and control of time-
homogeneous systems (see, e.g., Altman (1999), Arapostathis et al. (1993), Åström
(1970), Bertsekas (2007), Bryson and Ho (1969), Cao (2007), Feinberg and Shwartz
(2002), Hernández-Lerma and Lasserre (1996, 1999), Meyn and Tweedie (2009),
Puterman (1994)); every such result should have its counterpart in nonhomogeneous
systems. In this sense, the results of this book have wide applications. Here are some
simple examples.

Example 1.1 A communication server with fixed-length packets can be modeled by


a discrete-time birth-death process (Kleinrock 1975; Sericola 2013). Time is divided
into intervals with a fixed length, denoted by k = 0, 1, . . .. In the homogeneous case,
a packet arrives with probability 1 < λ < 1 at every time k, and the server transmits
a packet in a time slot with probability 0 < μ(n) < 1, where n is the number of
packets in the server. The server has a buffer with a maximum size of N , i.e., if the
number of packets is N , then any arriving packet will be rejected and lost; so 0 ≤
n ≤ N . The arrival and service processes are independent. Let pn := λ[1 − μ(n)],
qn := μ(n)[1 − λ], 0 < n < N , μ(0) = 0, p N = 0, and q N = μ(N ); and pn are the
birth rate and qn are the death rate, 0 ≤ n ≤ N . The number of packets in the server
can be modeled with a discrete birth-death process X = {X 0 , X 1 , . . .}, X k ∈ S :=
{0, 1, . . . , N }, with the following transition probabilities: for 0 < n < N ,

⎨ pn i f X k = n, X k+1 = n + 1,
P(X k+1 |X k ) = qn i f X k = n, X k+1 = n − 1,

1 − ( pn + qn ) i f X k = X k+1 = n;

and P(N − 1|N ) = q N , P(N |N ) = 1 − q N , P(1|0) = λ, and P(0|0) = 1 − λ.


With the control policy μ(n), the performance criterion to be optimized is the
long-run average (discounting makes no sense due to rapid transmissions)
1 Introduction 3

 1  K−1 

η := lim E f (X k ) X 0 = 0 ,
K →∞ K
k=0

with the reward function f (n) = n, for 0 < n < N , and f (N ) = − N2 , which rep-
resents a penalty for losing packets when X k = N . Under some mild conditions,
X is ergodic, and the problem can be solved by the standard Markov decision pro-
cess (MDP) approach in the literature, e.g., Hernández-Lerma and Lasserre (1996),
Puterman (1994).
In general, the arrival rate λk may depend on time k, so do the transition prob-
abilities Pk (X k+1 |X k ), representing the time-dependent nature of the demand. The
Markov chain X is no longer ergodic or stationary, and X may wander around the
state space S forever and never settles down (in distribution). Then, the question is,
what can we do for these nonhomogeneous systems? This question is addressed in
general in the rest of the book. 

Example 1.2 Consider a one-dimensional linear quadratic Gaussian (LQG) control


problem (Bertsekas 2007; Bryson and Ho 1969; Cao 2007). The system dynamic is
described by a linear equation. In the time-homogeneous case, it is

X k+1 = a X k + bu k + w, (1.1)

where X k ∈ R is the system state, u k is the control variable, w is a Gaussian dis-


tributed random noise, at time k, and a and b are constants. The reward function is
in a quadratic form:

 1 K−1
η = lim E[(q X k2 + r u 2k )|X 0 = x] , (1.2)
K →∞ K
k=0

It is proved that a linear feedback control u k = −d X k , k = 0, 1, . . ., achieves the


optimal performance (Bryson and Ho 1969; Cao 2007). Under this linear policy,
the system equation (1.1) becomes X k+1 = cX k + w, with c = a − bd. The system
is stable with c < 1. This problem can be easily solved within the standard time-
homogeneous MDP framework.
In the time-nonhomogeneous case, the system parameters are time dependent:
X k+1 = ak X k + bk u k + wk , the reward function becomes f k (x, u) = qk x 2 + rk u 2 .
Because ak and bk are different at different k  s, the system may move arbitrarily on
R. The Markov chain X may not reach a stationary status, and the performance
defined in (1.2) may not exists. Then, the question is, how can we analyze such
nonhomogeneous systems in a general way? 

The simple model of LQG can be used in the more advanced study in the mean-field
game theory (Caines 2019; Bensoussan et al. 2013), which was originally motivated
by a power control problem in a wireless communication network. Other applications
include the study of the transient behavior of any physical system; for example, in
4 1 Introduction

launching a satellite, it will go around the earth a few times before reaching its final
orbit; so the system is nonhomogeneous before it reaches the stationary status (see
Chap. 5). Time-nonhomogeneous Markov models are also widely used in other fields
such as economics and social networks (Bhawsar et al. 2014; Briggs and Sculpher
1998; Proskurnikov and Tempo 2017).

1.1 Problem Formulation

1.1.1 Discrete-Time Discrete-State Time-Nonhomogeneous


Markov Chains

Consider a discrete-time discrete-state TNHMC X := {X k , X k ∈ S , k = 0, 1, . . .},


where S = {1, 2, . . . , S} denotes the state space, the space of the states at all time
instants k = 0, 1, . . ., with S = |S | being the number of states in S (S may be
infinity).
A time-nonhomogeneous Markov chain may stroll around in the state space, which
means that at different times k, k = 0, 1, . . ., the Markov chain may be at different
subspaces of S . We allow new states to join the chain at any time k = 0, 1, . . .. Let
S k ⊆ S be the state space at time k = 0, 1, . . .. It includes the states joining the
chain at k. Let Pk (y|x) denote the transition probability from state x to state y at
time k; then, we have

S k = {y ∈ S : Pk (z|y) > 0, f or some z ∈ S }, (1.3)

and it is also called the input set of the Markov chain at k, k = 0, 1, . . .; and we
further define

S k,out = {y ∈ S : Pk (y|z) > 0 f or some z ∈ S k } (1.4)

as the output set of the chain at k, k = 0, 1, . . .. For the Markov chain to run properly,
it requires that
S k,out ⊆ S k+1 , k = 0, 1, . . . . (1.5)

When S k,out ⊂ S k+1 , there are states joining the chain at time k + 1. Let Sk := |S k |
be the number of the states in the state space, or the input set, at k, k = 0, 1, . . .. We set
Pk := [Pk (z|y)] y∈S k ,z∈S k+1 to be the transition probability matrix at k and it may not
be a square matrix, and set P = {P0 , P1 , . . . Pk , . . .}, S = {S 0 , S 1 , . . . , S k , . . .}.
P is called a transition law of the process X.
An example of possible state spaces and transitions are shown in Fig. 1.1. In
the figure, S 2,out = {x4 , x5 , x7 } ⊂ S 3 = {x4 , x5 , x6 , x7 }; states x6 joins the chain
at time 3. Similarly, states x3 joins the chain at time 2. The states at different times
1.1 Problem Formulation 5

Fig. 1.1 State spaces and state transitions of a TNHMC: circles represent states, arrows represent
transitions, and state spaces S1 , S4 , and S6 are indicated

may be completely different, although they are put on the same horizontal line for
convenience.

Example 1.3 All the time-homogeneous Markov chains, including irreducible


chains, uni-chains, or multi-chains, are special cases of time-nonhomogeneous
Markov chains with Pk ≡ P and S k = S for all k = 0, 1, . . .. 

1.1.2 Performance Optimization

Let (Ω, , P) denote the probability space of the Markov chain generated by the
transition law P and the initial probability measure of X 0 ; sometimes, we may denote
this probability measure by P P = P. Let E P = E be the corresponding expectation.
We have P(X k = y|X 0 = x) > 0 for some y ∈ S k−1,out and x ∈ S 0 (we use the
condition {•|X 0 = x} to denote the initial state).
First, we denote the reward function at time k by f k (x), x ∈ S k , k = 0, 1, . . .,
and we denote the reward vector at time k by f k = ( f k (1), . . . , f k (i), . . . , f k (Sk ))T ,
and also set ff = { f 0 , f 1 , . . . , f k , . . .}.
The theory for finite-horizon performance optimization problems for TNHMCs is
almost the same as that for the THMCs because the finite-sum performance measure
K
E{ l=k f (X l )|X k = x} also depends on time k for THMCs. Therefore, in this book,
we focus on infinite horizon problems. The long-run average reward, starting from
X k = x, is defined by

1  
k+K −1 

ηk (x) = lim inf E fl (X l ) X k = x . (1.6)
K →∞ K l=k
6 1 Introduction

If the limit in (1.6) exists, then we have

1  
k+K −1 

ηk (x) = lim E fl (X l ) X k = x . (1.7)
K →∞ K l=k

The discounted performance is defined by


 

υβ,k (x) := E β l f k+l (X k+l )X k = x , (1.8)
l=0

0 < β < 1 is called a discount factor. In (1.6)–(1.8), ηk (x) or ηβ,k (x) is called a
performance measure, or a performance criterion.
While the long-run average measures the infinite horizon behavior, there is another
performance measure called “bias”, which measures the transient behavior of the
system. For bias, we assume that the limit (1.7) exists; and the bias is defined by


∞ 

gk (x) = E [ fl (X l ) − ηl (X l )]X k = x . (1.9)
l=k

Other performance measures such as the N th biases, N = 1, 2, . . ., and the Black-


well optimality, will be discussed in the book, and they will be precisely defined in
Chap. 5. These definitions essentially take the same forms as those for THMCs in the
literature (Altman 1999; Cao 2007; Jasso-Fuentes and Hernández-Lerma 2009a, b;
Feinberg and Shwartz 2002; Hernández-Lerma and Lasserre 1996, 1999; Hordijk
and Yushkevich 1999a, b; Lewis and Puterman 2001, 2002; Puterman 1994).
To simplify the discussion, we make the following assumption:
Assumption 1.1 (a) The sizes of state spaces S k are bounded, i.e., Sk := |S k | ≤
M1 < ∞ for all k = 0, 1, . . .; and
(b) The reward functions are bounded, i.e.,

| f k (x)| < M2 < ∞. (1.10)

Under this assumption, ηk (x) is finite for any x ∈ S k .


In an optimization problem, at time k = 0, 1, . . ., with X k = x ∈ S k , we may take
an action αk (x) chosen in an action set A k (x); the action determines the transition
probabilities at x, denoted as Pkαk (x) (y|x), y ∈ S k,out , and the reward at state x,
denoted by f kαk (x) (x). The mapping, or the vector, αk = αk (x), x ∈ S k , is called a
decisionrule, and the space of all the decision rules at time k can be denoted by
A k := x∈S k A k (x), where denotes the Cartesian product. A decision rule at
k determines a transition probability matrix at k, Pkαk = [Pkαk (x) (y|x)]x∈S k ,y∈S k,out ,
k = 0, 1, . . ., and a reward function (or vector) f kαk = ( f kαk (1) (1), . . . , f kαk (Sk ) (Sk ))T .
1.1 Problem Formulation 7

A policy is a sequence of decision rules at k = 0, 1, . . ., u := {α0 , α1 , . . .}, which


determines P := {P0α0 , P1α1 , . . .} and ff := { f 0α0 , f 1α1 , . . .}. We also refer to a pair
(P, ff ) =: u as a policy. As in the case of the standard Markov decision process, we
assume that the actions at different states and different k = 0, 1, . . ., can be chosen
independently; thus, in every Pk , k = 1, 2, . . ., we may independently choose every
row. We do not consider problems with constraints in which an action at state X k = x
may affect the action chosen at any other state X k  = y, k  = or = k.
A Markov chain and its reward are determined by a policy u = (P, ff ) that is
applied to the chain. We use a superscript u, or simply P when ff does not take a
role, to denote the quantities associated with the Markov chain controlled by policy
u. Thus, the Markov chain is denoted by X u , the input and output sets at time k
are denoted by S uk and S uk,out , k = 0, 1, . . ., respectively, and the decision rule is
denoted by αku ; and the average reward (1.6) is now denoted by

1 u   αlu u  u
k+K −1
ηku (x) = lim inf E fl (X l )X k = x .
K →∞ K
l=k

Let D denote the space of all admissible policies (to be precisely defined later
in Definitions 3.2, 4.1, and 5.1). To make the performance of all policies in D
comparable, we require that the average rewards starting from all states are well
defined under all policies. So we need the following assumption (cf. (1.5)).

Assumption 1.2 The state spaces at any k, k = 0, 1, . . ., are the same for all policies;

i.e., for any u, u  ∈ D, we have S uk = S uk =: S k , k = 0, 1 . . .. 

This assumption is very natural. As shown in Fig. 1.1, new states may be added
to the state space; thus, the assumption does not mean that all policies have the same
output space; it only requires that every policy assigns transition probabilities to all
possible states in S k . In other words, no state should stop the Markov chain from
running under any policy, so its performance does exist. In a THMC, this simply
says that all policies have the same state space.
The goal of performance optimization is to find a policy in D that maximizes the
performance criterion (if the maximum can be achieved); i.e., to identify an optimal
policy denoted by u ∗ :

u ∗ = arg max(ηku (x)) , x ∈ S k , k = 0, 1, . . . . (1.11)
u∈D

If the maximum cannot be attained, the goal is to find the value

ηk∗ (x) = sup (ηku (x)), x ∈ S k , k = 0, 1, . . . . (1.12)


u∈D

Similar definitions apply to the bias and N th-bias optimality as well; see the related
chapters.
8 1 Introduction

1.2 Main Concepts and Results

In discrete-time and discrete-state TNHMCs, the state spaces, transition probability


matrices, and reward functions at different time k = 0, 1, . . . may be different. As
such, notions such as stationarity, ergodicity, periodicity, connectivity, recurrent and
transient states, which are crucial to the analysis of THMCs, no longer apply.
Besides, the long-run average performance criterion possesses the under-
selectivity, which roughly refers to the property that the long-run average does not
depend on the actions in any finite period, and thus the optimality conditions do not
need to hold in any finite period. Under-selectivity makes dynamic programming not
suitable for the problem, because the dynamic programming principle treats every
time instant equally, resulting in the same conditions at all time instants.
Therefore, to carry out the performance optimization of TNHMCs, we need to
formulate new notions capturing the main features mentioned above. We also need to
apply a different approach other than dynamic programming. Optimality conditions,
necessary and sufficient, for long-run average, bias, N th biases, and Blackwell, etc.,
for single or multi-class TNHMCs, can then be derived.
The main results of the book are summarized as follows. Many of them have not
been covered by the books in the literature; this makes this book unique.
(1) (Confluencity) We discover that the crucial property in the performance opti-
mization, for both TNHMCs and THMCs, is the confluencity. Two states are
confluent to each other, if two independent sample paths of the Markov chain
starting from these two states, respectively, eventually meet at the same state
with probability one. We show that this notion plays a fundamental role in the
performance optimization and state classification, and optimization theory can
be developed without ergodicity, stationarity, connectivity, and aperiodicity, etc.
We also show that confluencity implies weak ergodicity, and explains their dif-
ferences and why the former is needed for performance optimization and state
classification of TNHMCs. These results are discussed in Sect. 2.1.
(2) (Relative optimization) We apply the relative optimization approach, which
has been successfully applied to many optimization problems, see Cao (2020b,
2007), Guo and Hernández-Lerma (2009) and the references therein. The
approach is based on a performance difference formula that compares the perfor-
mance measures of any two policies. While dynamic programming provides only
“local” information at any particular time and state, the performance difference
formula contains the “global” information about the performance comparison in
the entire horizon (see the discussion in Chap. 1 of Cao 2020a). The approach
is also called the sensitivity-based or direct-comparison-based approach in the
literature (Cao 2007).
(3) (Uni-chain optimization with under-selectivity) A TNHMC is said to be a
uni-chain if all the states in the chain are confluent to each other. We define the
performance potential for uni-chains, which serves as a building block of per-
formance optimization, and it extends the same notion of potentials of THMCs
(Cao 2007). With performance potentials, we may derive the performance differ-
1.2 Main Concepts and Results 9

ence formula, and then obtain the necessary and sufficient optimality conditions
for the long-run average rewards. From the difference formula, it is clear that
the optimality conditions may not need to hold in any finite-time period, so the
under-selectivity is reflected in these conditions. These results extend the well-
known optimization results for THMCs, see, e.g., Altman (1999), Arapostathis
et al. (1993), Bertsekas (2007), Feinberg and Shwartz (2002), Hernández-Lerma
and Lasserre (1996, 1999), Puterman (1994); they are discussed in Chap. 3.
(4) (State classification) We show that with confluencity, all the states in a TNHMC
can be classified into confluent states and branching states, and all the confluent
states can be grouped into a number of confluent classes. Independent sample
paths starting from the same confluent state meet together infinitely often with
probability one (w.p.1); the sample paths starting from different confluent states
in the same class, even at different times, will meet w.p.1 as well; the sample
paths from states in different confluent classes, however, will never meet; and
a sample path starting from a branching state will enter one of at least two
confluent classes and stay there forever. Confluent and branching are extensions
of (but somewhat different from) the notions of recurrent and transient (Puterman
1994), and the former is a better state classification than the latter, in the sense
that the confluent and branching are more appropriate to capture the relations of
the states in performance optimization. These results are discussed in Sect. 2.2.
(5) (Multi-class optimization) We derive both necessary and sufficient condi-
tions for the optimal policies of the long-run average reward of multi-chain
TNHMCs (consisting of multiple confluent classes and branching states). Sur-
prisingly, these results look very similar to those for THMCs (Cao 2007; Guo and
Hernández-Lerma 2009; Puterman 1994). These results also reflect the under-
selectivity; e.g., the conditions do not need to hold for any finite period. These
results are discussed in Chap. 4.
(6) (Bias optimality) The long-run average only measures the asymptotic behavior,
and it does not depend on the behavior in the initial period. The most important
measure of transient performance is the bias defined by (1.9). Bias optimality
refers to the optimization of bias in the space of all optimal long-run average
policies. This problem has been well-studied for THMCs in the literature (Guo
et al. 2009; Hernández-Lerma and Lasserre 1999; Jasso-Fuentes and Hernández-
Lerma 2009b; Lewis and Puterman 2000, 2001, 2002). With confluencity and
relative optimization, we extend the results to TNHMCs. Interestingly, under-
selectivity is also reflected in the bias optimization conditions. These results are
discussed in Sect. 3.4.
(7) (N th-bias optimality and Blackwell optimality) While the bias measures tran-
sient performance, we observe that a bias has a bias, which has a bias too, and so
on. In general, the (N + 1)th bias is the bias of the N th bias, N ≥ 1 (with the first
bias being the bias in (1.9)). The N th biases, N = 1, 2, . . ., describe the tran-
sient behavior of a Markov chain at different levels. The policy that optimizes all
the N th biases, N = 0, 1, . . ., is the one with no under-selectivity. A Blackwell
optimal policy is optimal for discounted performance measures (1.8) for all dis-
count factors near one. The discounted reward (1.8) can be expressed as a Laurent
10 1 Introduction

series in all the N th biases, and the policy optimizing all the N th biases is the
Blackwell optimal policy and vice versa. It is shown that the N th-bias approach
is equivalent to the sensitive discount optimality; These results are well estab-
lished for THMCs, see Cao (2007), Guo and Hernández-Lerma (2009), Hordijk
and Yushkevich (1999a, b), Jasso-Fuentes and Hernández-Lerma (2009a), Put-
erman (1994), Veinott (1969), and in this book, we extend them to TNHMCs.
The necessary and sufficient optimality conditions for the N th-bias optimal and
Blackwell optimal policies are derived. These results are discussed in Chap. 5.
The N th-bias optimality provides a complete solution to the under-selectivity of
long-run average performance. The approach does not resort to discounting.
All the optimality results are obtained by the relative optimization approach, based
on a direct comparison of the performance measures, or the biases, or N th biases,
under any two policies. They demonstrate the applications of relative optimization.

1.3 Review of Related Works

Many research works have been done regarding the issues discussed in this book,
dating back to Kolmogoroff (Kolmogoroff 1936) and Blackwell (Blackwell 1945).
Main results include
(1) Concepts similar to ergodicity and stationarity, called weak ergodicity and sta-
bility, were proposed. Weak ergodicity, or merging, refers to the property that
a Markov chain asymptotically forgets where it started (i.e., as time passes by,
the state probability distribution becomes asymptotically independent of the ini-
tial state (Hajnal 1958; Park et al. 1993; Saloff-Coste and Zuniga 2010, 2007).
Stability relates to whether the state probability distributions will converge, or
become close to each other, in the long term (Meyn and Tweedie 2009; Saloff-
Coste and Zuniga 2010). A technique called coupling is closely related to the
confluencity, but both have different connotations, and they are introduced for
different purposes. Coupling consists of two copies of a Markov chain running
simultaneously; they behave exactly like the original Markov chain but may
be correlated. It was used widely in studying stationarity, weak ergodicity, and
the rate of convergence, and many other problems (Cao 2007; Doeblin 1937;
Griffeath 1975a, b, 1976; Pitman 1976; Roberts and Rosenthal 2004).
(2) The decomposition–separation (DS) theorem (Blackwell 1945; Cohn 1989; Kol-
mogoroff 1936; Sonin 1991, 1996, 2008) was established, which claims that,
with probability one, any sample path of a TNHMC will, after a finite number
of steps, enter one of the “jets” (a sequence of subsets of the state spaces at time
k = 0, 1, . . .). The DS theorem can be viewed as a decomposition theorem of
sample paths, which is closely related to the state decomposition for TNHMCs
introduced in this book. In Platis et al. (1998), the computation of the hitting
time from a state to a subset of states in a time-nonhomogeneous chain was
1.3 Review of Related Works 11

discussed, and sufficient conditions for the existence of the mean hitting time
were provided.
(3) The under-selectivity of average rewards of TNHMCs were discussed, e.g., in
Bean and Smith (1990), Hopp et al. (1987), Park et al. (1993). To take the
finite-period performance into consideration, Hopp et al. (1987) introduced the
periodic forecast horizon (PFH) optimality, and proved that under some condi-
tions, PFH optimality implies average optimality. This roughly means that the
PFH optimal policy, as an average optimal policy, is the limit point, defined
in some metric, of a sequence of optimal policies for finite horizon problems.
Although not mentioned in Hopp et al. (1987), their results apply only to the
“uni-chain” case defined in this book.
(4) There are many excellent works on control and optimization of THMCs, and
we cannot cite all of them in this book; they include (Altman 1999; Bertsekas
2007; Cao 2007; Feinberg and Shwartz 2002; Hernández-Lerma and Lasserre
1996, 1999; Puterman 1994), to name a few. In addition, the optimization of
total rewards for TNHMCs was discussed in Hinderer (1970), and the results
are similar to the homogeneous chains (Hernández-Lerma and Lasserre 1996;
Puterman 1994).
(5) The N th-bias optimization and its relation with the Blackwell optimality were
presented in Cao (2007), Cao and Zhang (2008b) for discrete-time discrete-
state THMCs, and in Zhang and Cao (2009), Guo and Hernández-Lerma (2009)
for continuous-time discrete-state multi-chain THMCs. It is equivalent to the
sensitive discount optimality, which was first proposed by Veinott in Veinott
(1969), and has been studied by many researchers, see, e.g., Dekker and Hordijk
(1988), Guo and Hernández-Lerma (2009), Hordijk and Yushkevich (1999a),
Hordijk and Yushkevich (1999b), Jasso-Fuentes and Hernández-Lerma (2009a),
Puterman (1994). Also, there are many works on bias optimality for THMCs,
e.g., Guo et al. (2009), Lewis and Puterman (2001), Puterman (1994), Jasso-
Fuentes and Hernández-Lerma (2009a), Jasso-Fuentes and Hernández-Lerma
(2009b).
(6) Most of the contents of this book are from the authors’ research works reported in
Cao (2019a, 2016, 2015). In Cao (2015), the notion of confluencity is proposed,
and the necessary and sufficient optimality conditions for the long-run average
and bias are derived for uni-chain TNHMCs. In Cao (2016), confluencity is used
to classify the states, and the necessary and sufficient optimality conditions for
the long-run average are derived for multi-class TNHMCs. In Cao (2019a), the
N th-bias and Blackwell optimality are discussed with no discounting.
(7) The development of relative optimization started with the perturbation analysis
of discrete-event dynamic systems (Cao 2000; Cao and Chen 1997; Cassandras
and Lafortune 2008; Fu and Hu 1997; Glasserman 1991; Ho and Cao 1991).
The approach is motivated by the study of information technology systems and
is closely related to reinforcement learning (Cao 2007). Its applications to the
optimization of THMCs are summarized in Cao (2007). This approach is based
on first principles; it starts with a simple comparison of the performance measures
of any two policies. It provides global information on the entire horizon and
thus may be applied to problems to which dynamic programming may not fit
12 1 Introduction

very well, It has been successfully applied to many problems, e.g., event-based
optimization (Cao and Zhang 2008a; Xia et al. 2014), and partially observable
Markov decision problems with separation principle extended (Cao et al. 2014),
and stochastic control of diffusion processes (Cao 2020a, 2017), to name a few.
(8) There are many other excellent research works on TNHMCs in the current liter-
ature. For example, Benaïm et al. (2017), Madsen and Isaacson (1973) discussed
the ergodicity and strongly ergodic behavior of TNHMCs, Douc et al. (2004)
studied the bounds for convergence rates of TNHMCs; Connors and Kumar
(1989, 1988) worked on the recurrence order of TNHMCs and its application
to simulated annealing; Cohn (1989, 1972, 1976, 1982) considered a number
of important topics related to TNHMCs, including their asymptotic behavior
and the limit theorems; Zheng et al. (2018) showed that when the transition
probabilities of a TNHMC change slowly over time, various expectations asso-
ciated with it can be rigorously approximated by means of THMCs; Teodorescu
(1980) developed some tools for deriving TNHMCs with desired properties,
starting from observed data in real systems; Brémaud (1999) discussed eigen-
values associated with TNHMCs; and Alfa and Margolius (2008), Li (2010),
Margolius (2008) studied the transient behavior of TNHMCs with periodic or
block structures, with the RG factorization approach; and many more.
Chapter 2
Confluencity and State Classification

In this chapter, we introduce the notion of confluencity and show that with conflu-
encity, the states of a TNHMC can be classified into different classes of confluent
states and branching states (Cao 2019, 2016, 2015). Confluent and branching are
similar to, but different from, recurrent and transient in THMCs (Cao 2007; Hordijk
and Lasserre 1994; Puterman 1994). A branching state is transient, and a recurrent
state is confluent, but a transient state may be either a branching or a confluent state,
and a confluent state may be either recurrent or transient (see Table 2.1). We will
show that the classification of the confluent and branching states is more essential to
optimization than that of the recurrent and transient states.
The results of this chapter form a foundation for performance optimization, which
will be discussed in detail in the remaining chapters.

2.1 The Confluencity

In this section, we introduce the confluencity of states, discuss its relations with weak
ergodicity, and define a “coefficient of confluencity” as a criterion for confluencity.

2.1.1 Confluencity and Weak Ergodicty

2.1.1.1 The Confluencity

Consider two independent sample paths starting from time k, X := {X l , l ≥ k} and


X  := {X l , l ≥ k}, generated by the same transition laws {Pl , l = k, k + 1, . . .}, but

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 13


X.-R. Cao, Foundations of Average-Cost Nonhomogeneous Controlled
Markov Chains, SpringerBriefs in Control, Automation and Robotics,
https://doi.org/10.1007/978-3-030-56678-4_2
14 2 Confluencity and State Classification

with two different initial states X k = x, X k = y, x, y ∈ S k . For k = 0, 1, . . ., define


the confluent time of x and y at time k by

τk (x, y) := min{τ ≥ k, X τ = X τ } − k, x, y ∈ S k . (2.1)

τk (x, y) ≥ 0 is the time (a random variable) required for the two paths to first meet
after k. By definition, X τk (x,y) = X τ k (x,y) .
The underlying probability space generated by the two independent sample paths
can be denoted by Ω × Ω with a probability measure being an extension of P on Ω
to Ω × Ω. For simplicity and without much confusion, we also denote this extended
measure on Ω × Ω by P and its corresponding expectation by “E”.
Definition 2.1 A TNHMC X := {X k , k ≥ 0} is said to be confluent, or to satisfy
the confluencity, if for all x, y ∈ S k and any k = 1, 2, . . ., τk (x, y) are finite (< ∞)
with probability one (w.p.1); i.e.,

lim P(τk (x, y) > K |X k = x, X k = y) = 0. (2.2)


K →∞

A TNHMC is said to be strongly confluent, or to satisfy the strong confluencity, if

E{τk (x, y)|X k = x, X k = y} < M3 < ∞, (2.3)

for all x, y ∈ S k and k = 0, 1, . . .. We also say that the two states x and y in (2.2)
are confluent to each other. 
Confluencity of a Markov chain implies that the states in S k , k = 0, 1, . . ., are
in some sense connected through future states. A confluent TNHMC corresponds to
a uni-chain in THMCs (Puterman 1994). When confluencity does not hold for all
states, the states of the Markov chain may be decomposed into multiple confluent
classes of states and branching states, the states in each confluent class are confluent
to each other; these topics will be discussed in Sect. 2.2.
We coin the word “confluencity” because other synonyms such as “merging” and
“coupling” have been used for other connotations. The notion of confluencity is sim-
ilar to that of “coupling” in the literature (Doeblin 1937; Griffeath 1975a, b; Pitman
1976), but both have different connotations. Coupling refers to the construction of a
bivariate process (X (1) , X (2) ) on the state space S × S in the following way: both
X (1) and X (2) have the same probability measure P, but the two processes X (1)
and X (2) can be correlated; and if at some time k, X k(1) = X k(2) , then after k, X (1) and
X (2) remain the same forever. Coupling is a technique mainly used in studying weak
ergodicity and the rate of convergence of transition probabilities (Cao 2007; Doeblin
1937; Griffeath 1975a, b, 1976; Pitman 1976; Roberts and Rosenthal 2004). In Cao
(2007), coupling is used to reduce the variance in estimating the performance poten-
tials. Confluencity, on the other hand, refers to a particular property of a TNHMC
where the two sample paths run independently in a natural way, and no construction
is involved; it is used in state classification and performance optimization, as shown
in this book.
2.1 The Confluencity 15

2.1.1.2 Weak Ergodicity

The confluencity defined in (2.2) plays a similar role for TNHMCs as the connectivity
and ergodicity for THMCs. Indeed, (2.2) assures the weak ergodicity, which means
that as k goes to infinity, the system will forget the effect of the initial state, as defined
below:
Definition 2.2 A TNHMC is said to be weakly ergodic, if for any x, x  ∈ S k and
any k, k  = 0, 1, . . ., it holds that
  
 
lim max P(X K = y|X k = x) − P(X K = y|X k  = x  ) = 0. (2.4)
K →∞ y∈S K

Lemma 2.1 Under Assumption 1.1a, confluencity implies weak ergodicity.


Proof First, let k = k  . By the strong Markov property, for any x, x  ∈ S k , we have

P(X K = y|X k = x, τk (x, x  ) < K − k)


= P(X K = y|X k = x  , τk (x, x  ) < K − k), ∀y ∈ S K .

Furthermore,

P(X K = y|X k = x)
= P(X K = y|X k = x, τk (x, x  ) < K − k)P(τk (x, x  ) < K − k)
+ P(X K = y|X k = x, τk (x, x  ) > K − k)P(τk (x, x  ) > K − k).

Therefore,

|P(X K = y|X k = x) − P(X K = y|X k = x  )|


= |P(X K = y|X k = x, τk (x, x  ) > K − k)
− P(X K = y|X k = x  , τk (x, x  ) > K − k)|
× P(τk (x, x  ) > K − k)
≤ 2P(τk (x, x  ) > K − k), ∀y ∈ S K . (2.5)

Then, for k = k  , (2.4) follows from the confluencity condition (2.2).


Next, let k  < k. We have

|P(X K = y|X k = x) − P(X K = y|X k  = x  )|


  
 
= P(X K = y|X k = x) − P(X K = y|X k = z)P(X k = z|X k  = x  ) 
z∈S
   

≤ P(X K = y|X k = x) − P(X K = y|X k = z)P(X k = z|X k  = x  ) .
z∈S k

Then, for k  < k, (2.4) follows from (2.5) and the boundedness of Sk = |S k |. 
16 2 Confluencity and State Classification

2.1.2 Coefficient of Confluencity

Now, we define a coefficient of confluencity by


  
ν := 1 − inf min

{ [Pk (y|x)Pk (y|x  )]} .
k x,x ∈S k
y∈S k,out

Lemma 2.2 If ν < 1, then the strong confluencity (2.3) holds.



Proof First, let φk (x, x  ) := y∈S k,out [Pk (y|x)Pk (y|x  )]; it is the probability that
two sample paths with X k = x and X k = x  meet at time k + 1. Thus, the probability
that two sample paths meet at k + 1 is at least min x,x  ∈S k {φk (x, x  )} , and the
probability that any two sample paths do not meet at any time k is at most ν < 1.
Thus, P(τk (x, y) ≥ i|X k = x, X k = y) < ν i−1 , and

E{τk (x, y)|X k = x, X k = y}




= P(τk (x, y) ≥ i|X k = x, X k = y) < ∞. 
i=1

In the literature, the well-known Hajnal weak ergodicity coefficient is defined by


(in our notation) (Park et al. 1993; Hajnal 1958)
  

κ := sup 1 − inf

{ min[Pk (y|x), Pk (y|x )]} .
k x,x ∈S k
y∈S k,out

When S k,out is finite, this is


  
κ = 1 − inf min { min[Pk (y|x), Pk (y|x  )]} .
k x,x
y

Lemma 2.3 κ < 1 if and only if ν < 1.


 

Proof The “If” part ⇐=: Because y min[Pk (y|x), Pk (y|x )] ≥ y [Pk (y|x)
Pk (y|x  )], we have κ ≤ ν. Thus, κ < 1 if ν < 1.
The “Only if” part =⇒: We prove if ν = 1 then κ = 1. Equivalently, if
  
inf min

{ [Pk (y|x)Pk (y|x  )]} = 0, (2.6)
k x,x ∈S k
y∈S k,out

then   
inf min { min[Pk (y|x), Pk (y|x  )]} = 0. (2.7)
k x,x
y
2.1 The Confluencity 17

By (2.6), there is a sequence of integers k1 , k2 , . . . (not necessary in increasing


order), and a sequence of positive numbers εl , with liml→∞ εl = 0, such that

min

{ [Pkl (y|x)Pkl (y|x  )]} < εl .
x,x ∈S kl
y∈S kl ,out

Thus, there is a pair denoted by xl and xl such that



[Pkl (y|xl )Pkl (y|xl )] < εl .
y∈S kl ,out

So, Pkl (y|xl )Pkl (y|xl ) < εl for all y ∈ Skl ,out . Furthermore,

{min[Pkl (y|xl ), Pkl (y|xl )]}2 ≤ Pkl (y|xl )Pkl (y|xl ) < εl ,

or

min[Pkl (y|xl ), Pkl (y|xl )] < εl ,

for all y ∈ Skl ,out . Therefore,


 √
min[Pk (y|xl ), Pk (y|xl )] < M1 εl ,
y

where M1 is the bound of the sizes of S k , k = 0, 1, . . .; and thus,


 √
min { min[Pk (y|x), Pk (y|x  )]} < M1 εl .
x,x
y

Finally, there is a sequence, k1 , k2 , . . ., such that


 √
min { min[Pk (y|xl ), Pk (y|xl )]} < M1 εl → 0.
x,x
y

This implies (2.7). 

Therefore, Hajnal weak ergodicity with κ < 1 is equivalent to confluencity with


ν < 1, which implies strong confluencity. By Lemma 2.1, confluencity implies weak
ergodicity, but the next example shows that the opposite is not true; Fig. 2.1 shows
the relations.

Example 2.1 The state space S k consists of two states denoted by S k := {1k , 2k },
k = 0, 1, . . .. Let P(11 |10 ) = P(21 |10 ) = P(11 |20 ) = P(21 |20 ) = 0.5, P(1k+1
|1k ) = 1, k = 1, 2, . . ., and P(2k+1 |2k ) = 1, k = 1, 2, . . .; see Fig. 2.2. It is clear
that states 10 and 20 are not confluent to each other (two independent sample paths
18 2 Confluencity and State Classification

Fig. 2.1 Confluencity


versus weak ergodicity.
A: Weak ergodicity;
B: Confluencity; A B C
C: Confluencity with ν < 1
and weak ergodicity with
κ < 1.
A–C: ν = 1 and κ = 1

Fig. 2.2 The state 20 21 22 23 24


transitions in Example 2.1 0.5
1 1 1 1
0.5
0.5
1 1 1 1
10 0.5 11 12 13 14

starting from 10 and 20 have a probability of only 0.5 to meet), i.e., (2.2) does not
hold. However, the weak ergodicity condition (2.4) does hold for x = 10 and x  = 20 .
As expected, in this example, we have ν = κ = 1. 

2.2 State Classification and Decomposition

State classification is important in characterizing the relations among the states, and
the optimality conditions for different classes are usually different (cf. Puterman
1994; Cao 2007 for THMCs). However, notions such as connectivity, aperiodicity,
recurrent and transient states, which play important roles in state classification of
THMCs, no longer apply to TNHMCs. In this section, we show that with confluencity,
state classification can be carried out for TNHMCs and all the states in a TNHMC
can be grouped into a number of confluent and branching classes. This extends the
recurrent-transient state decomposition of THMCs.
Naturally, the confluencity-based classification also applies to THMCs. Confluent
and branching are similar to, but different from, recurrent and transient in THMCs
(cf. Example 2.5). We will see that, compared with recurrent and transient states, the
notion of confluent and branching states is more appropriate for state classification
and performance optimization.

2.2.1 Connectivity of States

Consider two independent sample paths starting from k, {X l , l ≥ k} and {X l , l ≥ k},


with the same transition law P = {Pl , l ≥ k} and state spaces S = {S l , l ≥ k}, but
with two different initial states X k = x, X k = y, x, y ∈ S k .
2.2 State Classification and Decomposition 19

Definition 2.3 (Connectivity of states) If Eq. (2.2) holds, i.e.,

lim P(τk (x, y) > K |X k = x, X k = y) = 0,


K →∞

for two states x, y ∈ S k at k, k = 0, 1, . . ., then these two states x and y at time k


are said to be connected, and this connectivity is denoted by x ≺ y; Two states x and
y at time k are said to be strongly connected, if in addition, we have E[τk (x, y)|X k =
x, X k = y] < ∞.1 

From the definition, connectivity is not a new notion, and it is introduced simply
for convenience; two connected states are confluent to each other.
Lemma 2.4 Connectivity is reflective, symmetric, and transitive; i.e.,
(a) x ≺ x,
(b) if x ≺ y then y ≺ x, and
(c) if x ≺ y and y ≺ z, then x ≺ z, for x, y, z ∈ S k , k = 0, 1, . . ..
Proof Items (a) and (b) follow directly from the definition. To prove c), we consider
y
three independent sample paths, {X lx , l ≥ k}, {X l , l ≥ k}, and {X lz , l ≥ k}, starting
y
from states X k = x, X k = y, and X k = z, respectively. First, we note that τk (x, y)
x z

in (2.1) is a random variable representing the confluent time of {X lx , l ≥ k} and


y y
{X l , l ≥ k}, and τk (y, z) is the confluent time of {X l , l ≥ k} and {X lz , l ≥ k}. By
y
definition, we have X τk (x,y) = X τk (x,y) . Because of the Markov property, the two ran-
x
y y
dom sequences {X τxk (x,y) , X τxk (x,y)+1 , . . .} and {X τk (x,y) , X τk (x,y)+1 , . . .} are identically
distributed. Therefore, conditioned on τk (y, z) ≥ τk (x, y) and τk (x, z) ≥ τk (x, y),
the random variables τk (x, z) and τk (y, z) have the same distribution; thus,

P(τk (x, z) > K |τk (y, z) ≥ τk (x, y))


= P[τk (x, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) ≥ τk (x, y)]
+ P[τk (x, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y)]
= P[τk (y, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) ≥ τk (x, y)]
+ P[τk (x, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y)]. (2.8)

Because lim K →∞ P(τk (y, z) > K ) = 0 and the probability that both τk (y, z) ≥
τk (x, y) and τk (x, z) < τk (x, y) is positive, we have

lim P[τk (y, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y)] = 0.
K →∞

Next, given τk (x, z) < τk (x, y), the fact τk (x, z) > K implies τk (x, y) > K ; thus,

1 We also say that “these two states are confluent to each other,” but this statement may create a
bit of confusion. Precisely, in this sense, “the two states are confluent (to each other)” is different
from “the two states are confluent states” (see Example 2.3). Therefore, connectivity is not a new
concept, and it is introduced simply for clarity and convenience.
20 2 Confluencity and State Classification

P(τk (x, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y))
≤ P[τk (x, y) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y)].

By (2.2), we have

lim P(τk (x, z) > K |τk (y, z) ≥ τk (x, y), τk (x, z) < τk (x, y)) = 0.
K →∞

Therefore, from (2.8), we get

lim P(τk (x, z) > K |τk (y, z) ≥ τk (x, y)) = 0. (2.9)


K →∞

When τk (y, z) < τk (x, y), we exchange the role of x and z in the above analysis,
and obtain
lim P(τk (x, z) > K |τk (y, z) < τk (x, y)) = 0.
K →∞

From this equation and (2.9), we conclude that lim K →∞ P(τk (x, z) > K ) = 0, i.e.,
x ≺ z. 

Intuitively, we may say that two independent sample paths starting from two
connected states meet in a finite time w.p.1. We also say that two states at two
different times are connected if two independent sample paths starting from these
two states meet in a finite time w.p.1.

2.2.2 Confluent States

For any two connected states x and y, there is an issue whether the two sample
paths starting from each of them, respectively, will meet infinitely often, or will
never meet again after meeting a few times. Exploring this property leads to state
classification. First, we say a state y ∈ S k  is reachable from x ∈ S k , k  > k, if
P(X k  = y|X k = x) > 0.
Definition 2.4 A state x ∈ S k is said to be a confluent state if any two states at any
time k  > k that are reachable from x ∈ S k are connected; otherwise, it is said to
be a branching state. 
Intuitively speaking, two independent sample paths starting from a confluent state
meet infinitely often w.p.1; and two independent sample paths starting from two
connected states meet at least once. By convention, if, starting from a state, eventually
the chain can reach only one state at every time, this state is confluent and is called
an absorbing state, see Fig. 2.6 (cf. the same notion for THMCs Puterman 1994).
Example 2.2 States 10 (or 20 ) in Fig. 2.2 is not a confluent state because its down-
stream states 11 and 21 are not connected, or confluent, to each other. On the other
Another random document with
no related content on Scribd:
Franklinic or Static Electricity.—This form of electricity is now being
much used, especially abroad, in the treatment of nervous affections,
but does not appear to have been employed in the different
copodyscinesiæ, as but few reports of such treatment have found
their way into current literature. Romain Vigouroux111 states that he
has cured one case by statical electricity. Another case is reported
by Arthuis112 as rapidly cured by this treatment after many other
means, carried on during a period of five years, had failed; but his
brochure contains too many reports of cures of hitherto incurable
diseases to be relied upon.
111 Le Progrès médical, Jan. 21, 1882.

112 A. Arthuis, Traitement des Maladies nerveuses, etc., Paris, 1880, 3me ed.

Gymnastics and Massage.—As those suffering from copodyscinesia


are generally compelled by their vocation to be more or less
sedentary, exercise in the open air is indicated, inasmuch as it tends
to counteract the evil effects of their mode of life; but the use of
dumb-bells or Indian clubs, riding, rowing, and similar exercises do
not ward off the neuroses in question or diminish them when they
are present.

Such is not the case when rhythmical exercises and systematic


massage of all the affected muscles are employed, as marked
benefit has followed this method of treatment. The method employed
by J. Wolff, a teacher of penmanship at Frankfort-on-the-Main, which
consists of a peculiar combination of exercise and massage,
appears to have been wonderfully successful, judging from his own
statements and editorial testimonials of such eminent men as
Bamberger, Bardenleben, Benedikt, Billroth, Charcot, Erb, Esmarch,
Hertz, Stein, Stellwag, Vigouroux, Von Nussbaum, Wagner, and De
Watteville. The method is described by Romain Vigouroux113 and Th.
Schott,114 the latter claiming priority for himself and his brother, who
employed this method as early as 1878 or 1879. Wolff,115 however,
states that he had successfully treated this disease by his method as
early as 1875. Theodor Stein,116 having had personal experience in
Wolff's treatment, also describes and extols it: 277 cases of
muscular spasms of the upper extremities were treated; of these,
157 were cured, 22 improved, while 98 remained unimproved; these
comprised cases of writers', pianists', telegraphers', and knitters'
cramp.
113 Le Progrès médical, 1882, No. 13.

114 “Zur Behandlung des Schreibe- und Klavierkrampfes,” Deutsche Medizinal


Zeitung, 2 März, 1882, No. 9, Berlin; also “Du Traitement de la Crampe des Écrivains,
reclamation de Priorité, Details de Procedes, par le Dr. Th. Schott,” Le Progrès
médical, 1re Avril, 1882.

115 “Treatment of Writers' Cramp and Allied Muscular Affections by Massage and
Gymnastics,” N. Y. Med. Record, Feb. 23, 1884, pp. 204, 205.

116 “Die Behandlung des Schreibekrampfes,” Berliner klinische Wochenschrift, No.


34, 1882, pp. 527-529.

It must be borne in mind that Wolff, not being a physician, can refuse
to treat a case if he thinks it incurable; and in fact he does so, as he
has personally stated to the writer, so that his statistics probably
show a larger percentage of cures than otherwise would be the case.

His method may be described as follows: It consists of a combined


employment of gymnastics and massage; the gymnastics are of two
kinds: 1st, active, in which the patient moves the fingers, hands,
forearms, and arms in all the directions possible, each muscle being
made to contract from six to twelve times with considerable force,
and with a pause after each movement, the whole exercise not
exceeding thirty minutes and repeated two or three times daily; 2d,
passive, in which the same movements are made as in the former,
except that each one is arrested by another person in a steady and
regular manner; this may be repeated as often as the active
exercise. Massage is practised daily for about twenty minutes,
beginning at the periphery; percussion of the muscles is considered
an essential part of the massage. Combined with this are peculiar
lessons in pen-prehension and writing.
The rationale of this treatment is not easy, but any method which
even relieves these neuroses should be hailed with pleasure, as they
heretofore have been considered almost incurable.

The method employed by Poore, as mentioned under Electricity, of


rhythmical exercise of the muscles during the application of the
galvanic current, is worthy of further trial, as it combines the two
forms of treatment hitherto found most successful.

Internal and External Medication.—Generally speaking, drugs are of


comparatively little value in the treatment of these affections. This
statement does not apply to those cases where the symptoms are
produced by some constitutional disorder, or where there is some
other well-recognized affection present which does not stand in
relation to these neuroses as cause and effect.

In any case where an accompanying disorder can be discovered


which is sufficient in itself to depress the health, the treatment
applicable to that affection should be instituted, in the hope, however
unlikely it is to be fulfilled, that with returning health there will be a
decrease of the copodyscinesia. In the majority of cases no
constitutional disease can be detected, and it is in these that internal
medication has particularly failed.

The following are some of the remedies that have been employed:
Cod-liver oil, iron, quinine, strychnia, arsenic, ergot, iodoform, iodide
and bromide of potassium, nitrate of silver, phosphorus,
physostigma, gelsemium, conium, and some others.

Hypodermic Medication.—Atropia hypodermically, as first suggested


by Mitchell, Morehouse, and Keen117 in the treatment of spasmodic
affections following nerve-injury, has been used with good effect in
those cases where there is a tendency to tonic contraction; it should
be thrown into the body of the muscle. Vance118 speaks very
favorably of one-sixtieth of a grain of atropia used in this manner
three times a week. Morphia, duboisia, and arsenic in the form of
Fowler's solution have been used hypodermically with but little effect.
Rossander119 reports a cure in one month of a case of two years'
duration by the hypodermic use of strychnia. Onimus and Legros120
used curare in one case without effect.
117 Gunshot Wounds and Other Injuries to Nerves, Philada., 1864.

118 Reuben A. Vance, M.D., “Writers' Cramp or Scriveners' Palsy,” Brit. Med. and
Surg. Journal, vol. lxxxvii. pp. 261-285.

119 J. C. Rossander, Irish Hosp. Gazette, Oct. 1, 1873.

120 Loc. cit.

Local Applications.—The apparent benefit following the local


application of lotions, etc. to the arms in some cases appears to be
as much due to the generous kneading and frictions that accompany
them as to the lotions themselves. Onimus and Legros, believing the
lesion to be an excitability of the sensitive nerves at the periphery,
employed opiated embrocations, but report amelioration in one case
only.

When there are symptoms of congestion of the nerves or of neuritis,


then the proper treatment will be the application of flying blisters, or
the actual cautery very superficially applied to the points of
tenderness from time to time, so as to keep up a continual counter-
irritation. This treatment may be alternated with the application of the
galvanic current (descending, stabile, as previously mentioned) or
combined with it. As these conditions are often found in nervous
women, care should be taken lest this treatment be too vigorously
carried out.

Considerable relief has been reported from the use of alternate hot
and cold douches to the affected part—a procedure which is well
known to do good in some cases of undoubted spinal disease; the
application peripherally applied altering in some way, by the
impression conveyed to the centres, the nutrition of the spinal cord.

Tenotomy.—Tenotomy has been but little practised for the cure of


these affections. Stromeyer121 cut the short flexors of the thumb in a
case of writers' cramp without any benefit, but in a second case,
where he cut the long flexor of the thumb, the result was a cure.
Langenbeck122 quotes Dieffenbach as having performed the
operation twice without success, and states that there has been but
one observation of complete success, and that was the one of
Stromeyer. Aug. Tuppert123 has also performed this operation, and
Haupt124 advises it as a last resource.
121 “Crampe des Écrivains,” Arch. gén. de Méd., t. xiii., 1842, 3d Series, p. 97.

122 Ibid., t. xiv., 1842, 3d Series.

123 Quoted by Poore, loc. cit.

124 “Der Schreibekrampf,” rev. in Schmidt's Jahrbuch, Bd. cxv., p. 136, 1862.

Very few would be willing to repeat the experiment in a true case of


copodyscinesia after the failures above enumerated, for the
temporary rest given the muscle does not prove of any more service
than rest without tenotomy, which has failed in all the more advanced
cases, which are the only ones where tenotomy would be thought of.

Nerve-Stretching.—It is curious that no cases have been reported (at


least I have not been able to discover them) of nerve-stretching for
aggravated cases of copodyscinesia, as the operation has been
performed in several cases of local spasm of the upper extremity
following injuries to the nerves.

Von Nussbaum125 alone mentions the operation, and states that it


has been of no avail, but gives no references; he previously126
stretched the ulnar nerve at the elbow and the whole of the brachial
plexus for spasm of the left pectoral region and of the whole arm,
following a blow upon the nape of the neck; the patient made a good
recovery.
125 Aerztliches Intelligenzblatt, Munich, Sept. 26, 1882, No. 39, p. 35.

126 London Lancet, vol. ii., 1872, p. 783.


This operation, according to an editorial in the American Journal of
Neurology,127 has been performed seven times for spastic affections
of the arm with the following results: 2 cures (1 doubtful), 3 great
improvement, and 2 slight relief.
127 Am. Journ. of Neurology and Psychiatry, 1882.

This procedure would seem to be indicated in those cases of


copodyscinesia where spasms are present which have a tendency to
become tonic in their character, where other means of treatment
have failed. One such case has fallen under the writer's notice,
which, on account of its singularity and the rarity of the operation,
seems worthy of record. The patient is a physician in large practice,
and his account, fortunately, is more exact than it otherwise would
be:

—— ——, æt. 36. Paternal uncle had a somewhat similar trouble in


right arm, father died of paralysis agitans, and one brother has
writers' cramp. From nine to twelve years of age he was considered
an expert penman, and was employed almost constantly, during
school-hours, writing copies for the scholars. At the age of eleven he
began to feel a sense of tire in right forearm and hand when writing;
soon after this the flexors of right wrist and hand began to contract
involuntarily and become rigid only when writing. He remembers
being able to play marbles well for two years after the onset of the
first symptoms. The trouble gradually increased until every motion of
the forearm became involved. At the age of nineteen he became a
bookkeeper, using his left hand, but at the end of one year this
became affected also. Since then both arms have been growing
gradually worse, and at one time exhaustion would bring on pain at
the third dorsal vertebra. At the age of thirty a period of
sleeplessness and involuntary contractions of all the muscles of the
body came on, accompanied by difficulty in articulation from
muscular inco-ordination. After persistent use of the cold douche to
spine these symptoms ameliorated, but the general muscular
twitching sometimes occurs yet, and overwork brings on spasm of
the extensors of the feet. The condition of his arms in December,
1882, was as follows: At rest the right forearm is pronated, the wrist
completely flexed and bent toward the ulnar side, the thumb is
slightly adducted, and the fingers, although slightly flexed, are
comparatively free, enabling him to use the scalpel with dexterity.
This contraction can be overcome by forcibly extending the fingers
and wrist and supinating forearm, but if the arm be now placed in
supination the following curious series of contractions occur,
occupying from one to two minutes from their commencement to
their completion: gradually the little finger partially flexes, then the
ring, middle, and fore finger follow in succession; the wrist then
slowly begins to flex and to turn toward the ulnar side, and finally the
arm pronates, in which position it will remain unless disturbed. The
contraction is accompanied by a tense feeling in the muscles, but is
painless. The left arm behaves in a somewhat similar manner, and if
this is placed in supination a gradual pronation of the arm begins;
then follows the flexion of the fingers, commencing with the little
finger and ending with the thumb; the wrist also flexes, but not as
much as the right, although the flexion of the fingers is more marked.
There is no pain on pressure over muscles or nerves. The extensor
muscles of both arms, although weaker than normal, are not
paralyzed, those of the right responding more readily to both faradic
and galvanic currents than do the left. There is no reaction of
degeneration. The flexors respond too readily, the right showing the
greatest quantitative increase.

In 1879, while abroad, his condition being essentially as above


described, he consulted Spence of Edinburgh, who as an experiment
stretched the left ulnar nerve at the elbow; immediately after the
operation the muscles were paralyzed and the arm remained quiet;
in twenty-four hours the nerve became intensely painful, and
remained so, day and night, for three weeks; this gradually subsided,
and ceased with the healing of the wound two weeks later. Forty-
eight hours after the operation the spasm of the muscles returned,
and in a short time became as bad as ever, proving the operation to
have been a failure.
An interesting point to decide in this case is whether the symptoms
point to an abnormal condition of the nerve-centres, first manifesting
itself in difficulty of writing, or whether the constant writing induced a
superexcitability (for want of a better term) of the spinal cord in a
patient markedly predisposed to nervous troubles. This last
hypothesis I believe to be the correct one.

It might be considered at first sight that the symptoms presented by


this patient were due to a paralytic condition of the extensors, and
not a spasm of the flexors, or at least that the latter was secondary
to the former. While the extensors are somewhat weaker than
normal from want of use, a careful study of the mode of onset of this
affection and the symptoms presented later prove this idea to be
erroneous.

In regard to the operation and its results, it seems that a fairer test of
the efficacy of nerve-stretching in this case would have been made if
the median and not the ulnar nerve had been stretched, as the latter
only supplies in the forearm the flexor carpi ulnaris and the inner part
of the flexor profundus digitorum, while the former supplies the two
pronators and the remainder of the flexor muscles.

Of the mode of action of this operation we are still much in the dark,
but it would seem to be indicated in any case where the contractions
are very marked and tonic in their nature—not, however, until other
means have failed to relieve.

In the ordinary forms of copodyscinesia, it is needless to say, the


operation would be unjustifiable.

Mechanical Appliances.—Most of the prothetic appliances have


been devised for the relief of writers' cramp, the other forms of
copodyscinesia having received little if any attention in this direction.
The relief obtained by their use is usually but temporary, especially if
the patient attempts to perform his usual amount of work, which is
generally the case.
These instruments are of undoubted benefit when used judiciously in
conjunction with other treatment, as by them temporary rest may be
obtained, or in some cases the weakened antagonists of cramped
muscles may be exercised and strengthened. They all, without
exception, operate by throwing the work upon another set of
muscles, and failure is almost sure to follow their use if they alone
are trusted in, as the new set of muscles sooner or later becomes
implicated in the same way that the left hand is apt to do if the whole
amount of work is thrown upon it.

To accurately describe these instruments is out of place in this


article: those wishing to study this branch of the subject more fully
are referred to the article by Debout,128 where drawings and
descriptions of the most important appliances are given.
128 “Sur les Appareils prothetiques, etc.,” Bull. de Thérap., 1860, pp. 327-377.

Their mode of action may be considered under the following heads:

I. Advantage is taken of muscles as yet unaffected, which are made


to act as splints (so to speak) to those affected, greater stability
being thus given and cramp controlled when present.

Under this head may be mentioned the simple plan of placing a


rubber band around the wrist, wearing a tight-fitting glove, or
applying Esmarch's rubber bandage with moderate firmness to the
forearm. Large cork pen-holders, by distributing the points of
resistance over a larger surface, are thus much easier to hold than
small, hard pen-holders.

Two of the instruments devised by Cazenave—one, consisting of two


rings joined together in the same plane (to which the pen-holder is
attached), and through which the index and middle fingers are thrust
as far as the distal joints; and another consisting of two rings of hard
rubber, one above the other, sufficiently large to receive the thumb,
fore and middle fingers, which are thus held rigidly in the writing
position—act in this manner, and are used when the cramp affects
the thumb or fore finger.
II. The cramp of one set of muscles is made use of to hold the
instrument, the patient writing entirely with the arm movement.

The simple plan of grasping the pen-holder in the closed hand, as


previously described, or of thrusting a short pen-holder into a small
apple or potato, which is grasped in the closed hand, occasionally
affords relief and acts in this way. The instruments of Mathieu,
Velpeau, Charrière, and one by Cazenave are based upon this
principle. The first consists of two rings rigidly joined together about
one inch apart, one above the other, through which the fore finger is
thrust, and of a semicircle against which the tip of the thumb is
pressed; the pen-holder is attached to a bar adjoining the semicircle
and rings. Velpeau's apparatus consists of an oval ball of hard
rubber carrying at one extremity the pen-holder at an angle of 45°;
the ball is grasped in the closed hand, and the pen-holder allowed to
pass between the fore and middle fingers. Charrière's instrument is a
modification of the last, having in addition to the ball a number of
rings and rests for the fixation of the fingers. The latter has also
devised an instrument consisting of a large oval ball of hard rubber;
this is grasped in the outstretched palm, which it fills, and is allowed
to glide over the paper; the pen-holder is attached to one side.
Cazenave's instrument is simply a large pen-holder with rest and
rings to fix the fingers.

III. The instrument prevents the spasm of the muscles used in


poising the hand from interfering with those used in forming the
letters.

One of the instruments devised by Cazenave acts in this way: it


consists of a small board, moving upon rollers, upon which the hand
is placed; lateral pads prevent the oscillations of the arm due to
spasmodic action of the supinators. The pen-holder is held in the
ordinary manner.

IV. The antagonistic muscles to those affected by cramp are made to


hold the instrument, while the cramped muscles are left entirely free.
But one instrument acts in this manner—viz. the bracelet invented by
Von Nussbaum,129 which consists of an oval band of hard rubber to
which the pen-holder is attached. The bracelet is held by placing the
thumb and the first three fingers within it and strongly extending
them.
129 Aerztliches Intelligenzblatt, Munich, 1883, No. 39.

The inventor claims great success by its use alone, as the weakened
muscles are exercised and strengthened and the cramped muscles
given absolute rest.

Résumé of Treatment.—In the complicated cases of copodyscinesia


rest of the affected parts, as far as the disabling occupation is
concerned, must be insisted upon: this should be conjoined with the
use of the galvanic descending stabile current, combined with
rhythmical exercise of the affected muscles and of their antagonists,
and massage. Where there is evidence of a peripheral local
congestion or inflammation, this must be attended to; for instance, if
there is congestion of the nerves, or neuritis, flying blisters or the
actual cautery should be applied over the painful spots, followed by
the galvanic current. Where there is paralysis of one or more
muscles, with evidence of interference of nerve-supply, the faradic
current may be used with advantage.

Evidences of constitutional disease should lead to the employment


of the treatment suitable for those affections.
RÉSUMÉ OF 43 CASES OF TELEGRAPHERS' CRAMP.

Age at time of onset:


Under 20 years 7
Between 20 and 30 25
Between 30 and 40 7
Doubtful 4—43

Average age 23.94 years.


Average age of males 24.22 years.
Average age of females 20.66 years.

Average length of time before the disease appeared after the


commencement of telegraphing was—
In males 7.27 years.
In females 4 years.
The shortest period being 6 months.
The longest period being 20 years.

Left arm implicated in 21


(Not counting case with amputation of right arm) 1
Never used left arm 8
Left arm not affected, although used 12
Doubtful 1—43

Percentage with left arm implicated 63.63

No. of cases with difficulty in telegraphing only 9


No. of cases with difficulty in writing only 0
No. of cases with difficulty in telegraphing most 22
No. of cases with difficulty in writing most 11
No. of cases with difficulty in both alike 1—43
Use of tobacco or alcohol:
No effect in 14
Bad effect in 16
Good effect (in moderation) in 3
Don't use either 10—43

Flushings in arm or face during work in 22


Brittleness of nails in 5
Emotional factor present in 29
Sympathetic movements in other parts in 9
Pain in back during work in 18
Trouble in performance of other acts in 19
Pain on pressure over nerve or muscle in 13

Cramp of flexor group of muscles:


In writing only 0
In writing most 3
In telegraphing only 1
In telegraphing most 5— 9
Cramp of extensor group of muscles:
In writing only 0
In writing most 0
In telegraphing only 4
In telegraphing most 7—11
Doubtful as to character of cramp 11
No cramp 12—43

Manner of telegraphing:
Arm resting on table 22
Arm raised from table 12
Alternating one with the other 7
Doubtful 2—43

Manner of writing:
Arm movement 13
Finger movement 6
Combination of the two 24—43

TETANUS.

BY P. S. CONNER, M.D.

Tetanus (τεινω, to stretch) is a morbid condition characterized by


tonic contraction of the voluntary muscles, local or general, with
clonic exacerbations, occurring usually in connection with a wound.
Cases of it may be classified according to cause (traumatic or
idiopathic); to age (of the new-born and of those older); to severity
(grave and mild); or to course (acute and chronic), this latter
classification being the one of greatest value.

Though known from the earliest times, it is in the civil practice of


temperate regions of comparatively rare occurrence, and even in
military surgery has in recent periods only exceptionally attacked any
considerable proportion of the wounded.
Occurring in individuals of all ages, the great majority of the subjects
of it are children and young adults. Women seem to be decidedly
less liable to it than men. That this is due to sexual peculiarity may
well be doubted, since the traumatic cases are by far the most
numerous, and females are much less often wounded than males.
The traumatisms of childbed are occasionally followed by it
(puerperal tetanus).

That race has a predisposing influence would appear to be well


established; the darker the color, the greater the proportion of
tetanics. Negroes are especially likely to be attacked with either the
traumatic or idiopathic form.

Atmospheric and climatic conditions, beyond question, act powerfully


in, if not producing at least favoring, the development of tetanus.
Places and seasons in which there is great difference between the
midday and the midnight temperature, the winds are strong, and the
air is moist, are those in which the disease is most prevalent; and it
is because of these conditions that the late spring and early autumn
are the periods of the year when cases are most often seen.1
1 In his account of the Austrian campaign of 1809, Larrey wrote: “The wounded who
were most exposed to the cold, damp air of the chilly spring nights, after having been
subjected to the quite considerable heat of the day, were almost all attacked with
tetanus, which prevailed only at the time when the Reaumur thermometer varied
almost constantly between the day and the night by the half of its rise and fall; so that
we would have it in the day at 19°, 20°, 21°, and 23° above zero (75°–84° F.), while
the mercury would fall to 13°, 12°, 10°, 9°, and 8° during the night (50°–61° F.). I had
noticed the same thing in Egypt.”

Cold has, from the time of Hippocrates, been regarded as a great


predisposing if not exciting cause, and the non-traumatic cases have
been classed together as those a frigore. It is not, however, the
exposure to simply a low temperature that is followed by the disease,
but to cold combined with dampness, and quickly succeeding to a
temperature decidedly higher, as in the cool nights coming on after
hot days in tropical regions, or in the spring and fall seasons of
temperate latitudes, or in the cold air blowing over or cold water
dashed upon a wound or the heated skin. That such cold, thus
operating, does most usually precede the attack of tetanus is
unquestionable; and it has by many been held that without it no
traumatism will be followed by the disease. Observers generally are
agreed, with Sir Thomas Watson, that “there is good reason for
thinking that in many instances one of these causes (wound and
cold) alone would fail to produce it, while both together call it forth.”

In the low lands of hot countries (as the East and West Indies) the
disease is very frequently met with, at times prevailing almost
epidemically; and, on the other hand, it is rare in dry elevated
regions and in high northern latitudes, as in Russia, where during a
long military and civil experience Pirogoff met with but eight cases.
Trismus nascentium would seem to be an exception to the general
rule of the non-prevalence of tetanus in places far north, since, e.g.,
it has been at different periods very common in the Hebrides and the
small islands off the southern coast of Iceland. But these localities,
from their peculiar position, are not extremely cold, and their climate
is damp and variable; so that, even if the lockjaw of infants be
accepted as a variety of true tetanus, the geographical exception
indicated is but an apparent one.

Traumatic cases are greatly more numerous than idiopathic, and no


class of wounds is free from the possibility of the supervention of
tetanus. Incised wounds are much less likely to be thus complicated
than either of the other varieties, though operation-wounds of all
sorts, minor and major, have been followed by this affection. So
frequently has it been associated with comparatively trivial injuries
that it has become a common belief that the slighter the traumatism
the greater the danger of tetanus. That this is not true the records of
military surgery abundantly show. Wounds of the lower extremity are
much graver in this respect than those of the upper. Injuries of the
hand and feet, especially roughly punctured wounds of the palmar
and plantar fasciæ (as, e.g., those made by rusty nails), have long
been regarded as peculiarly liable to develop the disease, and
accidents of this nature always give rise to the fear of lockjaw.
Though there can be no question but that more than one-half of the
cases of tetanus in civil life are associated with wounds of these
localities, yet the number of such injuries is so much greater than of
those of other parts of the body that the special liability of the
subjects of them to become tetanic may well be questioned. In this
connection it is a significant fact that during our late war of perhaps
12,000 or 13,000 wounds of the hand, only 37 were followed by
tetanus, and of 16,000 of the foot, but 57. A few years ago numerous
cases of tetanus were observed in our larger cities complicating
hand-wounds produced by the toy pistol—injuries that were often
associated with considerable laceration of the soft parts, and
generally with lodgment of the wad.

Not even the complete cicatrization of a wound altogether protects


against the occurrence of the disease, the exciting cause of which,
under such circumstances, is probably to be found in retained
foreign bodies or pent-up fluids.

ETIOLOGY.—Almost universally regarded as an affection of the central


nervous system, inducing a heightened state of the reflex irritability,
though some have maintained that the reflex excitability of the
medulla and the cord is actually lessened, how such affection is
produced is unknown; and it is an unsettled question whether it is
through the medium of the nerves or the vessels, whether by
ascending inflammation, by reflected irritation, or by the presence of
a septic element or a special micro-organism in the blood.

That the disease is due to ascending neuritis finds support in the


congested and inflamed state of the nerves leading up from the
place of injury (affecting them in whole or in part, it may be in but a
few of their fibres), and in the inflammatory changes discoverable in
the cord and its vessels. But time and again thorough and careful
investigation by experienced observers has altogether failed to
detect any alterations in the nerves or pathological changes in the
cord, other than those that might properly be attributed to the
spasms, the temperature, or the drugs administered. The symptoms
of acute neuritis and myelitis (pain, paralyses, and later trophic
changes) are not those which are present in cases of tetanus. The
evidences of inflammation of the cord are most apparent, not in that
portion of it into which the nerves from the wounded part enter, but,
as shown by Michaud, so far as the cellular changes in the gray
matter are concerned, always in the lumbar region, no matter where
the wound may be located.

The much more generally accepted theory of reflex neurosis is


based upon the association of the disease with “all forms of nerve-
irritation, mechanical, thermal, chemical, and pathological;” upon the
direct relation existing between the likelihood of its occurrence and
the degree of sensibility of the wounded nerve;2 in the, at times, very
short interval between the receipt of the injury and the
commencement of the tetanic symptoms; in the local spasms
unquestionably developed by nerve-pressure and injury; in the
primary affection of muscles at a distance from the damaged part; in
the already-referred-to absence of the structural lesions of
inflammation; and in the relief at times afforded by the removal of
irritating foreign bodies, the temporary cutting off of the nerve-
connection with the central organs, or the amputation of the injured
limb. But that something more than irritation of peripheral nerves is
necessary to the production of tetanus would seem to be proved by
the frequency of such irritation and the rarity of the disease; by the
not infrequent prolonged yet harmless lodgment of foreign bodies,
even sharp and angular ones, against or in nerves of high
sensibility;3 by the primary affection of the muscles about the jaws,
and not those in the neighborhood of the wound; by the almost
universal failure to produce the affection experimentally, either by
mechanical injuries or by electrical excitations; by certain well-
attested instances of its repeated outbreak in connection with a
definite locality, a single ship of a squadron, a particular ward in a
hospital, or even bed in a ward; by the usual absence of that pain
which is the ordinary effect of nerve-irritation; and by the small
measure of success which has attended operations, even when
early performed, permitting the taking away of foreign bodies
pressing upon or resting in a nerve, interrupting the connection with
the cord, or altogether removing the wound and its surroundings.
Even in the idiopathic cases—many of which, it would at first sight
appear, can be due only to reflected irritation—another explanation
of the mode of their production may, as we will see, be offered.
2 According to Gubler, the danger is greatest in wounds of parts containing numerous
Pacinian corpuscles.

3 Heller has reported a case in which a piece of lead was lodged in the sheath of the
sciatic nerve. Though chronic neuritis resulted, the wound healed perfectly. Two years
later, after exposure while drilling, the man was seized with tetanus and died of it.

The so-called humoral theory would find the exciting cause of the
disease in a special morbific agent developed in the secretions of the
unbroken skin or the damaged tissues of the wound, or introduced
from without and carried by the blood-stream to the medulla and the
cord, there to produce such cell-changes as give rise to the tetanic
movements. It finds support in the unsatisfactory character of the
neural theories; in the strong analogy in many respects of the
symptoms of the disease to the increased irritability and muscular
contractions of hydrophobia and strychnia-poisoning, or those
produced by experimental injections of certain vegetable alkaloids; in
the recent discoveries in physiological fluids, as urine and saliva, of
chemical compounds,4 and in decomposing organic matter of
ptomaïnes capable of tetanizing animals when injected into them; in
the rapidly-enlarging number of diseases known to be, or with good
reason believed to be, consequent upon the presence of peculiar
microbes; in the more easy explanation by it than upon other
theories of the ordinary irregularity and infrequency of its occurrence,
its occasional restriction within narrow limits, and its almost endemic
prevalence in certain buildings and even beds; in the extreme gravity
of acute cases and the protracted convalescence of those who
recover from the subacute and chronic forms; in the very frequent
failure of all varieties of operative treatment; and in the success of
therapeutic measures just in proportion to their power to quiet and
sustain the patient during the period of apparent elimination of a
poison or development and death of an organism.
4 Paschkie in some recent experiments found that the sulphocyanide of sodium
applied in small quantities caused a tetanic state more lasting than that caused by
strychnia.

This theory is as yet unsupported by any positive facts. Neither


septic element nor peculiar microbe has been discovered.5 Failure
has attended all efforts to produce the disease in animals by
injecting into them the blood of tetanics. There is no testimony
worthy of acceptance of the direct transmission of the disease to
those, either healthy or wounded, coming in contact with the tetanic
patient; nor can much weight be attached to such reports as that of
Betoli of individuals being attacked with it who had eaten the flesh of
an animal dead of it.
5 Curtiss of Chicago thought that he had found a special organism, but further
investigation showed that it was present in the blood of healthy members of the family
and in the water of a neighboring pond.

The ordinary absence of fever has been thought to prove the


incorrectness of this theory, but increased body-heat is not a
symptom of rabies or strychnia-poisoning, of the tetanic state
following the injection of ptomaïnes, or of cholera—a disease very
probably dependent upon the presence and action of a bacillus.
Martin de Pedro, regarding the affection as rheumatic in character,
located it in the muscles themselves, there being produced, through
poisoning of the venous blood, a muscular asphyxia.

MORBID ANATOMY.—The pathological conditions observed upon


autopsy in the wound, the nerves, the central organs, and the
muscles, have been so various and inconstant that post-mortem
examinations have afforded little or no definite information respecting
the morbid anatomy of the disease. Many of the reported lesions
have unquestionably been dependent upon cadaveric changes or
defective preparation for microscopic study. The wound itself has
been found on the one hand healthy and in due course of
cicatrization,6 on the other showing complete arrest of the reparative
process (“the sores are dry in tetanus,” wrote Aretæus),7 or even
gangrenous, with pus-collections, larger or smaller, in its immediate
vicinity, usually in connection with retained foreign bodies.

You might also like