Stochastic Processes in Classical and Quantum Physics and Engineering

Harish Parthasarathy
Professor
Electronics & Communication Engineering
Netaji Subhas Institute of Technology (NSIT)
New Delhi, Delhi-110078
First published 2023
by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
and by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2023 Manakin Press
CRC Press is an imprint of Informa UK Limited
The right of Harish Parthasarathy to be identified as the author of this work has been
asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act
1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any
form or by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying and recording, or in any information storage or retrieval system,
without permission in writing from the publishers.
For permission to photocopy or use material electronically from this work, access www.
copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC
please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered
trademarks, and are used only for identification and explanation without intent to infringe.
Print edition not for sale in South Asia (India, Sri Lanka, Nepal, Bangladesh, Pakistan or
Bhutan).
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 9781032405391 (hbk)
ISBN: 9781032405414 (pbk)
ISBN: 9781003353553 (ebk)
DOI: 10.4324/9781003353553
Typeset in Arial, BookManOldStyle, CourierNewPS, MS-Mincho, Symbol,
TimesNewRoman and Wingdings
by Manakin Press, Delhi

Preface
This book covers a wide range of problems involving the applications of stochastic processes, stochastic calculus, large deviation theory, group representation theory and quantum statistics to diverse fields in dynamical systems, electromagnetics, statistical signal processing, quantum information theory, quantum neural network theory, quantum filtering theory, quantum electrodynamics, quantum general relativity, string theory, problems in biology and classical and quantum fluid dynamics. The selection of the problems has been based on courses taught by the author to undergraduates and postgraduates in electronics and communication engineering. The theory of stochastic processes has been applied to problems of system identification based on time series models of the process concerned to derive time and order recursive parameter estimation followed by statistical performance analysis. The theory of matrices has been applied to problems such as quantum hypothesis testing between two states, including a proof of the quantum Stein lemma involving asymptotic formulas for the error probabilities involved in discriminating between a sequence of ordered pairs of tensor product states. This problem is the quantum generalization of the large deviation result on discriminating between two probability distributions from iid measurements. The error rates are obtained in terms of the quantum relative entropy between the two states.

Classical neural networks are used to match an input-output signal pair. This means that the weights of the network are trained so that it takes as input the true system input and reproduces a good approximation of the system output. But if the input and output signals are stochastic processes characterized by probability densities and not by signal values, then the network must take as input the input probability distribution and reproduce a good approximation of the output probability distribution. This is a very hard task since a probability distribution is specified by an infinite number of values. A quantum neural network solves this problem easily since Schrodinger's equation evolves a wave function whose magnitude square is always a probability density, and hence, by controlling the potential in this equation using the input pdf (probability density function) and the error between the desired output pdf and that generated by Schrodinger's equation, we are able to track a time varying output pdf successfully during the training process. This has applications to the problem of transforming speech data to EEG data. Speech is characterized by a pdf and so is EEG. Further, the speech pdf is correlated to the EEG pdf because if the brain acquires a disease, the EEG signal will get distorted and correspondingly the speech signal will also get slurred. Thus, the quantum neural network based on Schrodinger's equation can exploit this correlation to get a fairly good approximation of the EEG pdf given the speech pdf. These issues have been discussed in this book.

Other problems include a proof of the celebrated Lieb inequality, which is used in proving convexity and concavity properties of various kinds of quantum entropies and quantum relative entropies that are used in establishing coding theorems in quantum information theory, such as the maximum rate at which classical information can be transmitted over a Cq channel so that a decoder with error probability converging to zero can be constructed. Some of the inequalities discussed in this book find applications to more advanced problems in quantum information theory that have not been discussed here, such as how to generate a sequence of maximally entangled states having maximal dimension asymptotically from a sequence of tensor product states using local quantum operations with one way or two way classical communication using a sequence of TPCP maps; or conversely, how to generate a sequence of tensor product states optimally from a sequence of maximally entangled states by a sequence of TPCP maps with the maximally entangled states having minimal dimension; or how to generate maximal shared randomness between two users by a sequence of quantum operations on a sequence of maximally entangled states. The performance of such algorithms is characterized by the limit of the logarithm of the length divided by the number of times that the tensor product is taken. These quantities are related to the notion of information rate. Another such problem is the maximal rate at which information can be transmitted from one user to another when they share a partially entangled state.

This book also covers problems involving quantization of the action functionals in general relativity and expressing the resulting approximate Hamiltonian as perturbations to harmonic oscillator Hamiltonians. It also covers topics such as writing down the Schrodinger equation for a mixed state of a particle moving in a potential coupled to a noisy bath in terms of the Wigner distribution function for position and momentum. The utility of this lies in the fact that the position and momentum observables in quantum mechanics do not commute and hence, by the Heisenberg uncertainty principle, they do not have a joint probability density, unlike the classical case of a particle moving in a potential with fluctuation and dissipation forces present, where we have a Fokker-Planck equation for the joint pdf of the position and momentum. The Wigner distribution, being complex, is not a joint probability distribution function, but its marginals are, and it is the closest that one can come in quantum mechanics to making an analogy with classical joint probabilities. We show in this book that its dynamics is the same as that of a classical Fokker-Planck equation but with small quantum correction terms that are represented by a power series in Planck's constant. We then also apply this formalism to Belavkin's quantum filtering theory based on the Hudson-Parthasarathy quantum stochastic calculus by showing that when this equation for a particle moving in a potential with damping and noise is described in terms of the Wigner distribution function, it is just the same as the Kushner-Kallianpur stochastic filter but with quantum correction terms expressed as a power series in Planck's constant.

This book also covers some aspects of classical large deviation theory, especially the derivation of the rate function of the empirical distribution of a Markov process, applied to Brownian motion and the Poisson process. This problem has applications to stochastic control, where we desire to control the probability of deviation of the empirical distribution of the output process of a stochastic system from a desired probability distribution by incorporating control terms in the stochastic differential equation. This book also discusses some aspects of string theory, especially the problem of how the field equations of physics get modified when string theory is taken into account and conformal invariance is imposed, and how supersymmetry breaking stochastic terms in the Lagrangian can be reduced by feedback stochastic control. Some problems deal with modeling blood flow, breathing mechanisms, etc., using stochastic ordinary and partial differential equations and how to control them by applying large deviation theory. One of the problems deals with analysis of the motion of a magnet through a tube with a coil wound around it in a gravitational field, taking into account electromagnetic induction and back reaction. This problem is the result of discussions with one of my colleagues. The book discusses quantization of this motion when the electron and electromagnetic fields are also quantized. In short, this book will be useful to research workers in applied mathematics, physics and quantum information theory who wish to explore the range of applications of classical and quantum stochastics to problems of physics and engineering.

Brief Contents

1. Stability Theory of Nonlinear Systems Based on Taylor Approximations of the Nonlinearity 1–10
2. String Theory and Large Deviations 11–14
3. Ordinary, Partial and Stochastic Differential Equation Models for
Biological Systems, Quantum Gravity, Hawking Radiation,
Quantum Fluids, Statistical Signal Processing,
Group Representations, Large Deviations in Statistical Physics 15–60
4. Research Topics on Representations 61–74
5. Some Problems in Quantum Information Theory 75–86
6. Lectures on Statistical Signal Processing 87–106
7. Some Aspects of Stochastic Processes 107–126
8. Problems in Electromagnetics and Gravitation 127–128
9. Some More Aspects of Quantum Information Theory 129–140
10. Lecture Plan for Statistical Signal Processing 141–182
11. My Commentary on the Ashtavakra Gita 183–186
12. LDP-UKF Based Controller 187–192
13. Abstracts for a Book on Quantum Signal Processing Using
Quantum Field Theory 193–200
14. Quantum Neural Networks and Learning Using Mixed State
Dynamics for Open Quantum Systems 201–208
15. Quantum Gate Design, Magnet Motion, Quantum Filtering,
Wigner Distributions 209–238
16. Large Deviations for Markov Chains, Quantum Master Equation,
Wigner Distribution for the Belavkin Filter 239–248
17. String and Field Theory with Large Deviations, Stochastic
Algorithm Convergence 249–260

Detailed Contents
1. Stability Theory of Nonlinear Systems Based on Taylor
Approximations of the Nonlinearity 1–10
1.1 Stochastic Perturbation of Nonlinear Dynamical Systems:
Linearized Analysis 1
1.2 Large Deviation Analysis of the Stability Problem 3
1.3 LDP for Yang-Mills Gauge Fields and LDP for the Quantized
Metric Tensor of Spacetime 4
1.4 String in Background Fields 6
2. String Theory and Large Deviations 11–14
2.1 Evaluating Quantum String Amplitudes Using
Functional Integration 11
2.2 LDP for Poisson Fields 12
2.3 LDP for Strings Perturbed by Mixture of Gaussian and
Poisson Noise 13
3. Ordinary, Partial and Stochastic Differential Equation Models for
Biological Systems, Quantum Gravity, Hawking Radiation,
Quantum Fluids, Statistical Signal Processing,
Group Representations, Large Deviations in Statistical Physics 15–60
3.1 A lecture on Biological Aspects of Virus Attacking
the Immune System 15
3.2 More on Quantum Gravity 23
3.3 Large Deviation Properties of Virus Combat with Antibodies 25
3.4 Large Deviations Problems in Quantum Fluid Dynamics 31
3.5 Lecture Schedule for Statistical Signal Processing 34
3.6 Basic Concepts in Group Theory and Group Representation Theory 37
3.7 Some Remarks on Quantum Gravity Using the ADM Action 38
3.8 Diffusion Exit Problem 39
3.9 Large Deviations for the Boltzmann Kinetic Transport Equation 41
3.10 Orbital Integrals for SL(2,R) with Applications to
Statistical Image Processing 42
3.11 Loop Quantum Gravity (LQG) 44
3.12 Group Representations in Relativistic Image Processing 45
3.13 Approximate Solution to the Wave Equation in the Vicinity of
the Event Horizon of a Schwarzschild Black-hole 48
3.14 More on Group Theoretic Image Processing 51
3.15 A Problem in Quantum Information Theory 58
3.16 Hawking Radiation 59

4. Research Topics on Representations 61–74
4.1 Remarks on Induced Representations and Haar Integrals 61
4.2 On the Dynamics of Breathing in the Presence of a Mask 66
4.3 Plancherel Formula for SL(2,R) 69
4.4 Some Applications of the Plancherel Formula for a Lie Group to
Image Processing 71
5. Some Problems in Quantum Information Theory 75–86
5.1 Quantum Information Theory and Quantum Blackhole Physics 81
5.2 Harmonic Analysis on the Schwartz Space of G = SL(2,R) 83
5.3 Monotonicity of Quantum Entropy Under Pinching 84
6. Lectures on Statistical Signal Processing 87–106
6.1 Large Deviation Analysis of the RLS Algorithm 88
6.2 Problem Suggested
6.3 Fidelity Between Two Quantum States 90
6.4 Stinespring’s Representation of a TPCP Map, i.e,
of a Quantum Operation 92
6.5 RLS Lattice Algorithm, Statistical Performance Analysis 93
6.6 Approximate Computation of the Matrix Elements of
the Principal and Discrete Series for G = SL(2,R) 94
6.7 Proof of the Converse Part of the Cq Coding Theorem 95
6.8 Proof of the Direct part of the Cq Coding Theorem 100
6.9 Induced Representations 102
6.10 Estimating Non-uniform Transmission Line Parameters and
Line Voltage and Current Using the Extended Kalman Filter 104
6.11 Multipole Expansion of the Magnetic Field Produced
by a Time Varying Current Distribution 105
7. Some Aspects of Stochastic Processes 107–126
7.1 Problem Suggested by Prof. Dhananjay 121
7.2 Converse of the Shannon Coding Theorem 122
8. Problems in Electromagnetics and Gravitation 127–128
9. Some More Aspects of Quantum Information Theory 129–140
9.1 An Expression for State Detection Error 129
9.2 Operator Convex Functions and Operator Monotone Functions 131
9.3 A Selection of Matrix Inequalities 133
9.4 Matrix Holder Inequality 138
10. Lecture Plan for Statistical Signal Processing 141–182
10.1 End-Semester Exam, Duration: Four Hours, Pattern Recognition 143

10.2 Quantum Stein’s Theorem 153
10.3 Question Paper, Short Test, Statistical Signal Processing 154
10.4 Question Paper on Statistical Signal Processing, Long Test 157
10.5 Quantum Neural Networks for Estimating EEG pdf from
Speech pdf Based on Schrodinger’s Equation 166
10.6 Statistical Signal Processing 173
10.7 Monotonicity of Quantum Relative Renyi Entropy 178
11. My Commentary on the Ashtavakra Gita 183–186
12. LDP-UKF Based Controller 187–192
13. Abstracts for a Book on Quantum Signal Processing Using
Quantum Field Theory 193–200
14. Quantum Neural Networks and Learning Using Mixed State
Dynamics for Open Quantum Systems 201–208
15. Quantum Gate Design, Magnet Motion, Quantum Filtering,
Wigner Distributions 209–238
15.1 Designing a Quantum Gate Using QED with a Control
c-number Four Potential 209
15.2 On the Motion of a Magnet Through a Pipe with a Coil Under
Gravity and the Reaction Force Exerted by the Induced Current
Through the Coil on the Magnet 211
15.3 Quantum Filtering Theory Applied to Quantum Field Theory for
Tracking a Given Probability Density Function 219
15.4 Lecture on the Role of Statistical Signal Processing in General
Relativity and Quantum Mechanics 221
15.5 Problems in Classical and Quantum Stochastic Calculus 223
15.6 Some Remarks on the Equilibrium Fokker-Planck Equation for
the Gibbs Distribution 226
15.7 Wigner-distributions in Quantum Mechanics 228
15.8 Motion of An Element having Both an Electric and a Magnetic
Dipole Moment in an External Electromagnetic Field Plus a Coil 236
16. Large Deviations for Markov Chains, Quantum Master
Equation, Wigner Distribution for the Belavkin Filter 239–248
16.1 Large Deviation Rate Function for the Empirical
Distribution of a Markov Chain 239
16.2 Quantum Master Equation in General Relativity 241
16.3 Belavkin Filter for the Wigner Distribution Function 246

17. String and Field Theory with Large Deviations, Stochastic
Algorithm Convergence 249–260
17.1 String Theoretic Corrections to Field Theories 249
17.2 Large Deviations Problems in String Theory 250
17.3 Convergence Analysis of the LMS Algorithm 252
17.4 String Interacting with Scalar Field, Gauge Field and
a Gravitational Field 253

Chapter 1

Stability Theory of Nonlinear Systems Based on Taylor Approximations of the Nonlinearity

1.1 Stochastic perturbation of nonlinear dynamical systems: Linearized analysis
Consider the nonlinear system defined by the differential equation

dx(t)/dt = f (x(t)), t ≥ 0

where f : R^n → R^n is a continuously differentiable mapping. Let x0 be a fixed point of this system, ie,

f(x0) = 0
Denoting δx(t) to be a small fluctuation of the state x(t) around the stability
point x0 , we get on first order Taylor expansion, ie, linearization,

dδx(t)/dt = f′(x0)δx(t) + O(|δx(t)|²)

If we neglect the O(|δx|²) terms, then we obtain the approximate solution to this differential equation as

δx(t) = exp(tA)δx(0), t ≥ 0

where
A = f′(x0)
is the Jacobian matrix of f at the equilibrium point x0 and hence it follows that
this linearized system is asymptotically stable under small perturbations in all

directions iff all the eigenvalues of A have negative real parts. By asymptotically stable, we mean that lim_{t→∞} δx(t) = 0. The linearized system is critically stable at x0 if all the eigenvalues of A have non-positive real part, ie, an eigenvalue being purely imaginary is also allowed. By critically stable, we mean that δx(t), t ≥ 0 remains bounded in time.
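As a small illustration of this linearized stability test (a minimal numerical sketch; the vector field f and the fixed point used below are made-up examples, not taken from the text), one can form the Jacobian at a fixed point numerically and inspect the real parts of its eigenvalues:

import numpy as np

def jacobian(f, x0, eps=1e-6):
    """Numerically approximate the Jacobian f'(x0) by central differences."""
    n = x0.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = eps
        J[:, j] = (f(x0 + e) - f(x0 - e)) / (2 * eps)
    return J

# Hypothetical example system: a damped pendulum-like field with fixed point (0, 0).
f = lambda x: np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])
x0 = np.zeros(2)

A = jacobian(f, x0)
eigs = np.linalg.eigvals(A)
print("eigenvalues of A:", eigs)
print("asymptotically stable:", np.all(eigs.real < 0))
print("critically stable:", np.all(eigs.real <= 0))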

Stochastic perturbations: Consider now the stochastic dynamical system obtained by adding a source term on the rhs of the above dynamical system

dx(t)/dt = f (x(t)) + w(t)

where w(t) is stationary Gaussian noise with mean zero and autocorrelation

Rw (τ ) = E(w(t + τ )w(t))

Now it is meaningless to speak of a fixed point, so we choose a trajectory x0(t) that solves the noiseless problem

dx0 (t)/dt = f (x0 (t))

Let δx(t) be a small perturbation around x0 (t) so that x(t) = x0 (t) + δx(t)
solves the noisy problem. Then, assuming w(t) to be small and δx(t) to be of
the "same order of magnitude" as w(t), we get approximately, up to linear order
in w(t), the linearized equation

dδx(t)/dt = Aδx(t) + w(t), A = f ' (x0 )

which has the solution

δx(t) = Φ(t)δx(0) + ∫_0^t Φ(t − τ)w(τ)dτ, t ≥ 0,

where
Φ(t) = exp(tA)
Assuming δx(0) to be a nonrandom initial perturbation, we get

E(δx(t)) = Φ(t)δx(0)

which converges to zero as t → ∞ provided that all the eigenvalues of A have negative real part. Moreover, in this case,
E(|δx(t)|²) = Tr ∫_0^t ∫_0^t Φ(t − t1)Rw(t1 − t2)Φ(t − t2)^T dt1 dt2
= Tr ∫_0^t ∫_0^t Φ(t1)Rw(t1 − t2)Φ(t2)^T dt1 dt2

which converges as t → ∞ to

Px = Tr ∫_0^∞ ∫_0^∞ Φ(t1)Rw(t1 − t2)Φ(t2)^T dt1 dt2

Suppose now that A is diagonalizable with the spectral representation

A = − Σ_{k=1}^r λ(k)Ek

Then

Φ(t) = Σ_{k=1}^r exp(−tλ(k))Ek

and we get

Px = Σ_{k,m=1}^r ∫_0^∞ ∫_0^∞ Tr(Ek.Rw(t1 − t2).Em) exp(−λ(k)t1 − λ(m)t2)dt1 dt2

A sufficient condition for this to be finite is that |Rw(t)| remain bounded, in which case we get, when

sup_t |Rw(t)| ≤ B < ∞,

that

Px ≤ B Σ_{k,m=1}^r (Re(λ(k) + λ(m)))^{−1} ≤ 2Br²/λR(min)

where

λR(min) = min_k Re(λ(k))
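As a numerical cross-check of the formula for Px (a sketch under made-up assumptions: a 2x2 stable matrix A and the correlation Rw(τ) = e^{−|τ|}I, neither of which comes from the text), the double integral can be approximated by a Riemann sum after truncating the upper limits:

import numpy as np
from scipy.linalg import expm

# Hypothetical stable system matrix and noise correlation (illustrative choices only).
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
Rw = lambda tau: np.exp(-abs(tau)) * np.eye(2)   # assumed Rw(tau) = e^{-|tau|} I

T, n = 30.0, 300                  # truncate [0, inf) at T and discretize
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]
Phi = [expm(A * ti) for ti in t]  # Phi(t) = exp(tA)

# Px = Tr of the double integral of Phi(t1) Rw(t1 - t2) Phi(t2)^T, by a Riemann sum.
Px = 0.0
for i, P1 in enumerate(Phi):
    for j, P2 in enumerate(Phi):
        Px += np.trace(P1 @ Rw(t[i] - t[j]) @ P2.T) * dt * dt
print("approximate steady-state mean square perturbation Px:", Px)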

1.2 Large deviation analysis of the stability problem

Let x(t) be a diffusion process starting at some point x0 within an open set V with boundary S = ∂V. If the noise level is low, say parametrized by the parameter E, then the diffusion equation can be expressed as

dx(t) = f(x(t))dt + Eg(x(t))dB(t)

Let PE(x) denote the probability of a sample path of the process that starts at x0 and ends at the stopping time τE = t at which the process first hits the boundary S. Then, from basic LDP, we know that the approximate probability of such a path is given by

exp(−E^{−1} inf_{y∈S} V(x0, y, t))

where

V(x, t) = inf_{u: dx(s)/ds = f(x(s)) + g(x(s))u(s), 0 ≤ s ≤ t, x(0) = x0} ∫_0^t u(s)²ds/2

and

V(x0, y, t) = inf_{x: x(0) = x0, x(t) = y} V(x0, x, t)
The greater the probability of the process hitting S, the smaller the mean value of the hitting time; the two are inversely proportional. Thus, we can say that

τE ≈ exp(V/E), E → 0

where

V = inf_{y∈S, t≥0} V(x0, y, t)
More precisely, we have the LDP

limE→0 P (exp((V − δ)/E) < τE < exp((V + δ)/E)) = 1

Note that τE = exp(V/E) for a given path x implies that the probability of x is PE(x) = exp(−V/E). This follows from the relative frequency interpretation of probability given by von Mises, or equivalently from ergodic theory: if the path hits the boundary N(E) times in a total of N trials, with the time taken to hit being τE for each path, then the probability of hitting the boundary is N(E)/N. However, since it takes a time τE for each trial hit, the number of hits in time T is T/τE. Thus, if each trial run in the set of N trials is of duration T, then the probability of a hit is

PE = N(E)/N = T/(N.τE)

and if we assume that T and N are fixed, or equivalently that T, N → ∞ with N/T converging to a finite nonzero constant, then we deduce from the above that

τE ≈ exp(V /E)

in the sense that E.log(τE ) converges to V as E → 0.
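The scaling relation E.log(τE) → V can be explored numerically. Below is a rough Monte Carlo sketch for a one-dimensional example; the drift f(x) = −x, the domain (−1, 1) and the noise levels are illustrative choices (not from the text), and the noise is taken to enter as √E dB so that the exp(−V/E) normalization used above applies.

import numpy as np

rng = np.random.default_rng(0)

def mean_exit_time(eps, dt=1e-3, n_paths=400, boundary=1.0):
    """Euler-Maruyama estimate of the mean first exit time from (-1, 1) for
    dx = -x dt + sqrt(eps) dB, so that tau grows like exp(const/eps) as eps -> 0."""
    x = np.zeros(n_paths)
    exit_time = np.full(n_paths, np.nan)
    alive = np.ones(n_paths, dtype=bool)
    t = 0.0
    while alive.any():
        t += dt
        x[alive] += -x[alive] * dt + np.sqrt(eps * dt) * rng.standard_normal(alive.sum())
        just_exited = alive & (np.abs(x) >= boundary)
        exit_time[just_exited] = t
        alive &= ~just_exited
    return exit_time.mean()

for eps in [0.6, 0.5, 0.4]:
    tau = mean_exit_time(eps)
    # eps*log(tau) should stabilize near the quasi-potential barrier V as eps shrinks.
    print(f"eps={eps:.2f}  mean exit time={tau:7.2f}  eps*log(tau)={eps*np.log(tau):.3f}")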

1.3 LDP for Yang-Mills gauge fields and LDP for the quantized metric tensor of space-time

Let L(qab, q′ab, N, N^a) denote the ADM Lagrangian of space-time, q′ab denoting the time derivative of qab. Here, N^a defines the diffeomorphism constraint and N the Hamiltonian constraint. The canonical momenta are

P^ab = ∂L/∂q′ab

The Hamiltonian is then

H = P^ab q′ab − L
Note that the time derivatives N′, N^a′ of N and N^a do not appear in L. QGTR is therefore described by a constrained Hamiltonian, with the constraints being

P_N = ∂L/∂N′ = 0,

leading to the equation of motion

H = ∂L/∂N = 0

which is the Wheeler-DeWitt equation, the analog of Schrodinger's equation in non-relativistic particle mechanics. It should be interpreted as

H(N)ψ = 0 ∀ N

where

H(N) = ∫ N.H d³x

The large deviation problem: If there is a small random perturbation in the initial conditions of the Einstein field equations, then what will be the statistics of the perturbed spatial metric qab after time t? This is the classical LDP. The quantum LDP is that if there is a small random perturbation in the initial wave function of a system and also small random Hamiltonian perturbations as well as small random time varying perturbations in the potentials appearing in the Hamiltonian, then what will be the statistics of quantum averaged observables?

The Yang-Mills Lagrangian in curved space-time: Let Aµ = A^a_µ τa denote the Lie algebra valued Yang-Mills connection potential field. Let

Dµ = ∂µ + Γµ

denote the spinor gravitational covariant derivative. Then the Yang­Mills field
tensor is given by
Fµν = Dµ Aν − Dν Aµ + [Aµ , Aν ]

The action functional of the Yang-Mills field in the background gravitational field is given by

S_{YM}(A) = ∫ g^{µα}g^{νβ}Tr(F_{µν}F_{αβ})√(−g) d⁴x

Problem: Derive the canonical momentum fields for this curved space-time Yang-Mills Lagrangian and hence obtain a formula for the canonical Hamiltonian density.

1.4 String in background fields


The Lagrangian is

L(X, X_{,α}, α = 1, 2) = (1/2)g_{µν}(X)h^{αβ}X^µ_{,α}X^ν_{,β} + Φ(X) + H_{µν}(X)ε^{αβ}X^µ_{,α}X^ν_{,β}

where a conformal transformation has been applied to bring the string world sheet metric h^{αβ} to the canonical form diag[1, −1]. Here, Φ(X) is a scalar field. The string action functional is

S[X] = ∫ L dτ dσ
The string field equations

∂_α(∂L/∂X^µ_{,α}) − ∂L/∂X^µ = 0

are

∂_α(g_{µν}(X)h^{αβ}X^ν_{,β}) + ε^{αβ}(H_{µν}(X)X^ν_{,β})_{,α}
= (1/2)g_{ρσ,µ}(X)h^{ab}X^ρ_{,a}X^σ_{,b} + Φ_{,µ}(X) + H_{ρσ,µ}(X)ε^{ab}X^ρ_{,a}X^σ_{,b}

or equivalently,

g_{µν,ρ}(X)h^{ab}X^ρ_{,a}X^ν_{,b} + g_{µν}(X)h^{ab}X^ν_{,ab} + ε^{ab}H_{µν,ρ}(X)X^ρ_{,a}X^ν_{,b} + ε^{ab}H_{µν}(X)X^ν_{,ab}
= (1/2)g_{ρσ,µ}(X)h^{ab}X^ρ_{,a}X^σ_{,b} + Φ_{,µ}(X) + H_{ρσ,µ}(X)ε^{ab}X^ρ_{,a}X^σ_{,b}

These differential equations can be put in the form

η^{ab}X^µ_{,ab} = δ.F^µ(X^ρ, X^ρ_{,a}, X^ρ_{,ab})

where δ is a small perturbation parameter and ((η^{ab})) = diag[1, −1]. The problem is to derive the correction to the string propagator caused by the nonlinear perturbation term. The propagator is

D^{µν}(τ, σ|τ′, σ′) = <0|T(X^µ(τ, σ).X^ν(τ′, σ′))|0>

and we see that it satisfies the following fundamental propagator equations: let

x = (τ, σ), x′ = (τ′, σ′), ∂_0 = ∂_τ, ∂_1 = ∂_σ

Then,

∂_0 D^{µν}(x|x′) = δ(x⁰ − x′⁰)<[X^µ(x), X^ν(x′)]> + <T(X^µ_{,0}(x).X^ν(x′))>
= <T(X^µ_{,0}(x).X^ν(x′))>

and likewise

∂_0² D^{µν}(x|x′) = δ(x⁰ − x′⁰)<[X^µ_{,0}(x), X^ν(x′)]> + <T(X^µ_{,00}(x).X^ν(x′))>
= δ(x⁰ − x′⁰)<[X^µ_{,0}(x), X^ν(x′)]> + ∂_1² D^{µν}(x|x′) + δ.<T(F^µ(X, X_{,a}, X_{,ab})(x).X^ν(x′))>

or equivalently

η^{ab}∂_a∂_b D^{µν}(x|x′) = δ.<T(F^µ(X, X_{,a}, X_{,ab})(x).X^ν(x′))> + δ(x⁰ − x′⁰)<[X^µ_{,0}(x), X^ν(x′)]> −−−(1)
Now consider the canonical momenta

P_µ(x) = ∂L/∂X^µ_{,0} = g_{µν}(X(x))X^ν_{,0}(x) − 2H_{µν}(X(x))X^ν_{,1}(x)

Using the canonical equal time commutation relations

[X^µ(τ, σ), X^ν(τ, σ′)] = 0

so that

[X^µ(τ, σ), X^ν_{,1}(τ, σ′)] = 0

and

[X^µ(τ, σ), P_ν(τ, σ′)] = iδ^µ_ν δ(σ − σ′)

we therefore deduce that

[X^µ(τ, σ), g_{νρ}(X(τ, σ′))X^ρ_{,0}(τ, σ′)] = iδ^µ_ν δ(σ − σ′)

or equivalently,

[X^µ(τ, σ), X^ν_{,0}(τ, σ′)] = ig^{µν}(X(τ, σ))δ(σ − σ′)

and therefore (1) can be expressed as

□D^{µν}(x|x′) = η^{ab}∂_a∂_b D^{µν}(x|x′) =
δ.<T(F^µ(X, X_{,a}, X_{,ab})(x).X^ν(x′))> − i<g^{µν}(X(x))>δ²(x − x′) −−−(2)

Now suppose we interpret the quantum string field X^µ as being the sum of a large classical component X_0^µ(τ, σ) and a small quantum fluctuating component δX^µ(τ, σ). Then we can express the above exact relation as

D^{µν}(x|x′) = X_0^µ(x)X_0^ν(x′) + <T(δX^µ(x).δX^ν(x′))>

and further, we have approximately, up to first order of smallness,

F^µ(X, X′, X′′) = F^µ(X_0, X_{0,a}, X_{0,b}) + F^µ_{,1ρ}(X_0, X_0′, X_0′′)δX^ρ + F^µ_{,2ρa}(X, X′, X′′)δX^ρ_{,a} + F^µ_{,3ρab}(X, X′, X′′)δX^ρ_{,ab}
and hence, up to second order of smallness, (2) becomes

□(X_0^µ(x)X_0^ν(x′)) + □<T(δX^µ(x).δX^ν(x′))>
= δF^µ(X_0, X_{0,a}, X_{0,b})(x)X_0^ν(x′)
+ δ.F^µ_{,1ρ}(X_0, X_0′, X_0′′)(x)<T(δX^ρ(x)δX^ν(x′))>
+ δ.F^µ_{,2ρa}(X, X′, X′′)(x)<T(δX^ρ_{,a}(x).δX^ν(x′))>
+ δ.F^µ_{,3ρab}(X, X′, X′′)(x)<T(δX^ρ_{,ab}(x).δX^ν(x′))>
− i<g^{µν}(X(x))>δ²(x − x′)


To proceed further, we require the dynamics of the quantum fluctuation part δX^µ. To get this, we apply first order perturbation theory to the system

η^{ab}X^µ_{,ab} = δ.F^µ(X^ρ, X^ρ_{,a}, X^ρ_{,ab})

to get

η^{ab}X^µ_{0,ab} = 0,
η^{ab}δX^µ_{,ab} = δ.F^µ(X_0, X_0′, X_0′′)

and solve this to express δX^µ as the sum of a classical part (ie, a particular solution to the inhomogeneous equation) and a quantum part (ie, the general solution to the homogeneous equation). The quantum component is

δX_q^µ(τ, σ) = −i Σ_{n≠0} (a^µ(n)/n) exp(in(τ − σ))

where

[a^µ(n), a^ν(m)] = nη^{µν}δ[n + m]
Note that in particular □X_0^µ(x) = 0, and hence the approximate corrected propagator equation becomes

□<T(δX^µ(x).δX^ν(x′))>
= δF^µ(X_0, X_{0,a}, X_{0,b})(x)X_0^ν(x′)
+ δ.F^µ_{,1ρ}(X_0, X_0′, X_0′′)(x)<T(δX^ρ(x)δX^ν(x′))>
+ δ.F^µ_{,2ρa}(X, X′, X′′)(x)<T(δX^ρ_{,a}(x).δX^ν(x′))>
+ δ.F^µ_{,3ρab}(X, X′, X′′)(x)<T(δX^ρ_{,ab}(x).δX^ν(x′))>
− ig^{µν}(X_0(x))δ²(x − x′)

provided that we neglect the infinite term

g_{µν,αβ}(X_0)<δX^α(x)δX^β(x)>


If we do not attach the perturbation tag δ to the F-term, ie, we do not regard F as small, then the unperturbed Bosonic string X_0 satisfies

□X_0^µ = F^µ(X_0, X_0′, X_0′′)

and then the corrected propagator D^{µν} satisfies the approximate equation

□D^{µν}(x|x′) =
δ.F^µ_{,1ρ}(X_0, X_0′, X_0′′)(x)<T(δX^ρ(x)δX^ν(x′))>_0
+ δ.F^µ_{,2ρa}(X, X′, X′′)(x)<T(δX^ρ_{,a}(x).δX^ν(x′))>_0
+ δ.F^µ_{,3ρab}(X, X′, X′′)(x)<T(δX^ρ_{,ab}(x).δX^ν(x′))>_0
− iη^{µν}δ²(x − x′)

provided that we assume that the metric is a very small perturbation of that of Minkowski flat space-time. Here, <.>_0 denotes the unperturbed quantum average, ie, the quantum average in the absence of the external field H_{µν} and also in the absence of metric perturbations of flat space-time. We write

D_0^{µν}(x|x′) = <T(δX^µ(x).δX^ν(x′))>_0

Now observe that in the absence of such perturbations,

δX^µ(x) = −i Σ_{n≠0} (a^µ(n)/n) exp(in(τ − σ))

so

<T(δX^µ(x).δX^ν(x′))>_0 =
− Σ_{n>0} (θ(τ − τ′)/n) exp(−in(τ − τ′ − σ + σ′))
− Σ_{n>0} (θ(τ′ − τ)/n) exp(−in(τ′ − τ − σ′ + σ))

<T(δX^µ_{,0}(x).δX^ν(x′))>_0 =
− Σ_{n>0} (θ(τ − τ′)/n)(−in) exp(−in(τ − τ′ − σ + σ′))
− Σ_{n>0} (θ(τ′ − τ)/n)(in) exp(−in(τ′ − τ − σ′ + σ))
= i Σ_{n>0} θ(τ − τ′) exp(−in(τ − τ′ − σ + σ′))
− i Σ_{n>0} θ(τ′ − τ) exp(−in(τ′ − τ − σ′ + σ))

<T(δX^µ_{,1}(x).δX^ν(x′))>_0 =
− Σ_{n>0} (θ(τ − τ′)/n)(in) exp(−in(τ − τ′ − σ + σ′))
− Σ_{n>0} (θ(τ′ − τ)/n)(−in) exp(−in(τ′ − τ − σ′ + σ))
= −i Σ_{n>0} θ(τ − τ′) exp(−in(τ − τ′ − σ + σ′))
+ i Σ_{n>0} θ(τ′ − τ) exp(−in(τ′ − τ − σ′ + σ))
= −<T(δX^µ_{,0}(x).δX^ν(x′))>_0
Note that we have the alternate expression

<T(δX^µ_{,1}(x).δX^ν(x′))>_0 = ∂_1 D_0^{µν}(x|x′)

because differentiation w.r.t. the spatial variable σ commutes with the time ordering operation T. Further,

<T(δX^µ_{,00}(x).δX^ν(x′))>_0 = <T(δX^µ_{,11}(x).δX^ν(x′))>_0 = ∂_1² D_0^{µν}(x|x′)

<T(δX^µ_{,01}(x).δX^ν(x′))>_0 = ∂_1<T(δX^µ_{,0}(x).δX^ν(x′))>_0
= ∂_1 [ i Σ_{n>0} θ(τ − τ′) exp(−in(τ − τ′ − σ + σ′)) − i Σ_{n>0} θ(τ′ − τ) exp(−in(τ′ − τ − σ′ + σ)) ]
= i Σ_{n>0} θ(τ − τ′)(in) exp(−in(τ − τ′ − σ + σ′)) − i Σ_{n>0} θ(τ′ − τ)(−in) exp(−in(τ′ − τ − σ′ + σ))
= − Σ_{n>0} θ(τ − τ′) n exp(−in(τ − τ′ − σ + σ′)) − Σ_{n>0} θ(τ′ − τ) n exp(−in(τ′ − τ − σ′ + σ))
Chapter 2

String Theory and Large Deviations

2.1 Evaluating quantum string amplitudes using functional integration
Assume that external lines are attached to a given string at space­time points
(τr , σr ), r = 1, 2, ..., M . Note that the spatial points σr are fixed while the
interaction times τr vary. Let Pr (σ) denote the momentum of the rth external
line at σ. At the point σ = σr where it is attached, this momentum is specified
as Pr (σr ). Thus noting that
X(τ, σ) = −i Σ_{n≠0} a(n) exp(in(τ − σ))/n

we have

Pr(σr) = X_{,τ}(σr) = Σ_{n≠0} a(n) exp(−inσr)

This equation constrains the values of a(n). We thus define the numbers Prn by

Pr(σr) = Σ_n Prn exp(−inσr) −−− (1)

and the numbers {Prn} characterize the momentum of the external line attached at the point σr. Note that for reality of the momentum, we must have Prn = P̄_{r,−n}; if we assume that the Prn's are real, then

Pr(σr) = 2 Σ_{n>0} Prn cos(nσr)

Note that Pr(σ) is the momentum field corresponding to the rth external line and this is given to us. Thus, the Prn's are determined by the equation

Pr(σ) = 2 Σ_{n>0} Prn cos(nσ)

Equivalently, we fix the numbers {Prn}_{r,n} corresponding to the momenta carried by the external lines and then obtain the momentum Pr(σr) of the line attached at σr using (1).
The large deviation problem: Express the Hamiltonian density of the string
having Lagrangian density (1/2)∂α X µ ∂ α Xµ in terms of X µ , Pµ = ∂0 X µ . Show
that it is given by

H0 = (1/2)P µ Pµ + (1/2)∂1 X µ ∂1 Xµ

In the presence of a vector potential field Aµ (σ), the charged string field has the
Hamiltonian density H with Pµ replaced by Pµ + eAµ . Write down the string
field equations for this Hamiltonian and assuming that the vector potential is a
weak Gaussian field, evaluate the rate function of the string field X µ .
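A short way to see the stated form of H0 is the Legendre transform below (a sketch in LaTeX, using the flat worldsheet metric diag[1, −1] assumed above):

% Worldsheet Lagrangian density and Legendre transform (sketch).
\mathcal{L} = \tfrac{1}{2}\,\partial_\alpha X^\mu \partial^\alpha X_\mu
            = \tfrac{1}{2}\left[(\partial_0 X^\mu)(\partial_0 X_\mu) - (\partial_1 X^\mu)(\partial_1 X_\mu)\right],
\qquad
P_\mu = \frac{\partial \mathcal{L}}{\partial(\partial_0 X^\mu)} = \partial_0 X_\mu .
% Hamiltonian density:
\mathcal{H}_0 = P_\mu\,\partial_0 X^\mu - \mathcal{L}
             = P^\mu P_\mu - \tfrac{1}{2}\left[P^\mu P_\mu - \partial_1 X^\mu \partial_1 X_\mu\right]
             = \tfrac{1}{2}P^\mu P_\mu + \tfrac{1}{2}\,\partial_1 X^\mu \partial_1 X_\mu .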

LDP in general relativistic fluid dynamics.

2.2 LDP for Poisson fields


Small random perturbations in the metric field having a statistical model of
the sum of a Gaussian and a Poisson field affect the fluid velocity field. Let
X1 (x), x ∈ Rn be a zero mean Gaussian random field with correlation R(x, y) =
E(X1 (x)X1 (y)) and let N (x), x ∈ Rn be a Poisson random field with mean
E(N(dx)) = λ(x)dx. Consider the family of random Poisson measures N_ε(E) = ε^a N(ε^{−1}E) for any Borel subset E of R^n, where ε > 0 is a small scaling parameter. We have

E[exp(∫_{R^n} f(x)N(dx))] = exp(∫_{R^n} λ(x)(exp(f(x)) − 1)dx)

Thus,

M_ε(f) = E exp(∫ f(x)N_ε(dx)) = E exp(∫ ε^a f(x)N(dx/ε)) = E exp(∫ ε^a f(εx)N(dx))
= exp(∫ λ(x)(exp(ε^a f(εx)) − 1)dx)

Thus,

log(M_ε(ε^{−a} f)) = ∫ λ(x)(exp(f(εx)) − 1)dx
= ε^{−n} ∫ λ(ε^{−1}x)(exp(f(x)) − 1)dx

Then,

lim_{ε→0} ε^n log(M_ε(ε^{−a} f)) = λ(∞) ∫ (exp(f(x)) − 1)dx

assuming that

λ(∞) = lim_{ε→0} λ(x/ε)

exists. More generally, suppose we assume that this limit exists and depends only on the direction of the vector x ∈ R^n so that we can write

λ(∞, x̂) = lim_{ε→0} λ(x/ε)

then we get, for the Gartner-Ellis limiting logarithmic moment generating function,

lim_{ε→0} ε^n log(M_ε(ε^{−a} f)) = ∫ λ(∞, x̂)(exp(f(x)) − 1)dx = Λ̄(f)
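The identity E[exp(∫ f dN)] = exp(∫ λ(e^f − 1)dx) used above is easy to test numerically. The following sketch does so for a homogeneous Poisson field on an interval; the intensity λ = 2, the interval [0, 5] and the test function f(x) = 0.3 sin x are all made-up choices:

import numpy as np

rng = np.random.default_rng(1)

lam, a, b = 2.0, 0.0, 5.0
f = lambda x: 0.3 * np.sin(x)

n_trials = 100_000
vals = np.empty(n_trials)
for k in range(n_trials):
    npts = rng.poisson(lam * (b - a))          # number of Poisson points in [a, b]
    pts = rng.uniform(a, b, size=npts)          # their locations
    vals[k] = np.exp(f(pts).sum())              # exp(integral of f against N(dx))

empirical = vals.mean()
x = np.linspace(a, b, 2001)
theory = np.exp(np.trapz(lam * (np.exp(f(x)) - 1.0), x))
print("empirical  E[exp(int f dN)]      :", empirical)
print("theoretical exp(int lam(e^f-1)dx):", theory)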

2.3 LDP for strings perturbed by mixture of Gaussian and Poisson noise
Consider the string field equations

□X^µ(σ) = f^µ_k(σ) εW^k(τ) + g^µ_k ε dNk(τ)/dτ

where W^k are independent white Gaussian noise processes and Nk are independent Poisson processes. Assume that the rate of Nk is λk/ε. Calculate then the LDP rate functional of the string process X^µ. Consider now a superstring with Lagrangian density

L = (1/2)∂_α X^µ ∂^α X_µ + ψ̄ρ^α ∂_α ψ

Add to this an interaction Lagrangian

ε^{αβ}B_{µν}(X)X^µ_{,α}X^ν_{,β}

and another interaction Lagrangian

J_α(X)ψ̄ρ^α ψ

Assume that Bµν (X) and Jα (X) are small Gaussian fields. Calculate the joint
rate function of the Bosonic and Fermionic string fields X µ , ψ.
Chapter 3

Ordinary, Partial and Stochastic Differential Equation Models for Biological Systems, Quantum Gravity, Hawking Radiation, Quantum Fluids, Statistical Signal Processing, Group Representations, Large Deviations in Statistical Physics

3.1 A lecture on biological aspects of virus attacking the immune system
We discuss here a mathematical model, based on elementary fluid dynamics and quantum mechanics, of how a virus attacks the immune system and how antibodies generated by the immune system of the body and by vaccination can wage a counter attack against this virus, thereby saving the host's life. According to recent research, a virus consists of a spherical body covered with dots/rods/globs of protein. When the virus enters the body it is as such harmless, but
becomes dangerous when it lodges itself in the vicinity of a cell in the lung where
the protein blobs are released. The protein blobs thus penetrate into the cell
fluid where the ribosome gives the command to replicate in the same way that
normal cell replication occurs. Thus a single protein blob is replicated several
times and hence the entire cell is filled with protein and thus gets destroyed
completely. When this happens to every cell in the lung, breathing is hampered
and death results. However, if our immune system is strong enough, then the
blood generates antibodies the moment it recognizes that a stranger in the form
of protein blobs has entered into the cell. These antibodies fight against the
protein blobs and in fact the antibodies lodge themselves on the boundary of
the cell body at precisely those points where the protein from the virus has
penetrated thus preventing further protein blobs from entering into the cell.
Whenever this fight between the antibodies and the virus goes on, the body
develops fever because an increase in the body temperature disables the virus
and enables the antibodies to fight against the proteins generated by the virus.
If however the immune system is not powerful enough, sufficient concentration of
antibodies is not generated and the viral protein dominates over the antibodies
in their fight finally resulting in death of the body. The Covid vaccine is prepared
by generating a sample virus in the laboratory within a fluid like that present
within the body and hence antibodies are produced within the fluid by this
artificial process in the same way that the small-pox vaccine was invented by
Edward Jenner by injecting the small­pox virus into a cow to generate the
required antibodies. Now we discuss the mathematics of the virus, protein, cells
and antibodies.
Assume that the virus particles have masses m1 , ..., mN and charges e1 , ..., eN .
The blood fluid has a velocity field v(t, r) whose oxygenated component cir­
culates to all the organs of the body and whose de­oxygenated component is
directed toward the lungs. This circulation is caused by the pumping of the
heart. Assume that the heart pumps blood with a generation rate g(t, r) having
units of density per unit time or equivalently mass per unit volume per unit
time. Assume also that the force of this pumping is represented by a force
density field f (t, r). Also there may be present external electromagnetic fields
E(t, r), B(t, r) within the blood field which interact with the current field within
the blood caused by non­zero conductivity σ of the blood. This current density
is according to Ohm’s law given by
J(t, r) = σ(E(t, r) + v(t, r) × B(t, r))
The relevant MHD equations and the continuity equation are given by
ρ(∂_t v + (v, ∇)v) = −∇p + η∇²v + f + σ(E + v × B) × B
∂_t ρ + div(ρv) = g
Sometimes a part of the electromagnetic field within the blood is of internal
origin, ie, generated by the motion of this conducting blood fluid. Thus, if we
write Eext, Bext for the externally applied electromagnetic field and Eint, Bint for the internally generated field, then we have

Eint(t, r) = −∇φint − ∂_t Aint,  Bint = ∇ × Aint

where

Aint(t, r) = (µ/4π) ∫ J(t − |r − r′|/c, r′)d³r′/|r − r′|,

Φint(t, r) = −c² ∫_0^t div(Aint)dt

The total electromagnetic field within the blood fluid is the sum of these two
components:
E = Eext + Eint , B = Bext + Bint
We have control only over Eext, Bext and can use these control fields to manipulate the blood fluid velocity appropriately so that the virus does not enter
into the lung region. Let rk (t) denote the position of the k th virus particle. It
satisfies the obvious differential equation

drk (t)/dt = v(t, rk (t))

with initial conditions rk (0) depending on where on the surface of the body
the virus entered. If the virus is modeled as a quantum particle with nucleus
at the classical position rk (t) at time t, then the protein molecules that are
stuck to this virus surface have relative positions {rkj, j = 1, 2, ..., M} and their joint quantum mechanical wave function ψk(t, rkj, j = 1, 2, ..., M) will satisfy the Schrodinger equation

ih∂_t ψk(t, rkj, j = 1, 2, ..., M) = Σ_{j=1}^M (−h²/2mkj)∇²_{rkj} ψk(t, rkj, j = 1, 2, ..., M) + Σ_{j=1}^M ek Vk(t, rkj − rk(t))ψk(t, rkj, j = 1, 2, ..., M)

where Vk (t, r) is the binding electrostatic potential energy of the viral nucleus.
There may also be other vector potential and scalar potential terms in the above
Schrodinger equation, these potentials coming from the prevalent electromag­
netic field. Sometimes it is more accurate to model the entire swarm of viruses as a quantum field rather than as individual quantum particles, but the resulting dynamics becomes more complicated as it involves elaborate concepts from quantum field theory. Suppose we have solved for the joint wave functions ψk of the protein blobs on the k th virus. Then the average position of the lth blob on the k th viral surface is given by

<rkl>(t) = ∫ rkl |ψk(t, rkj, j = 1, 2, ..., M)|² Π_{j=1}^M d³rkj, 1 ≤ k ≤ M
We consider the time at which these average positions of the lth protein blob on the k th viral surface hit the boundary S = ∂V of a particular cell. This time
is given by
τ (k, l) = inf {t ≥ 0 :< rkl > (t) ∈ S}
At this time, replication of this protein blob due to cell ribosome action begins.
Let λ(n)dt denote the probability that in the time interval [t, t + dt], n protein
blobs will be generated by the ribosome action on a single blob. Let Nkl (t)
denote the number of blobs due to the (k, l)th one present at time t. Then for

Pkln (t) = P r(Nkl (t) = n), n = 1, 2, ...

we have the Chapman-Kolmogorov (Markov chain) equations

dPkln(t)/dt = Σ_{1≤r<n} Pklr(t)λ(kl, r, n) − Pkln(t) Σ_{m>n} λ(kl, n, m)

where λ(kl, r, n)dt is the probability that in time [t, t + dt], n blobs will be
generated from r blobs. Clearly

λ(kl, r, n) = rλ(n − r)

because each protein blob generates blobs independently of the others and hence the probability of two blobs generating more than one each is zero. In another discrete time version of blob replication we can make use of the theory of
branching processes.
Let X(n) denote the number of blobs at time n. At time n + 1, each blob
splits into several similar blobs, each splitting being independent of the others.
Let ξ(n, r) denote the number of blobs into which the rth blob splits. Then, we have the obvious relation

X(n + 1) = Σ_{r=1}^{X(n)} ξ(n, r)

Hence, if

Fn(z) = E(z^{X(n)})

is the moment generating function of X(n) and if

φ(z) = E(z^{ξ(n,r)})

is the moment generating function of the number of blobs into which each blob
splits, then we get the recursion

Fn+1(z) = E(φ(z)^{X(n)}) = Fn(φ(z))
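As a quick numerical illustration of this recursion (a sketch; the offspring distribution used below is a made-up example, not from the text), one can compare a Monte Carlo estimate of E(z^{X(n)}) with the iterated map F_{n+1} = F_n(φ(z)):

import numpy as np

rng = np.random.default_rng(2)

# Illustrative offspring law: each blob splits into 0, 1 or 2 blobs.
offspring_vals = np.array([0, 1, 2])
offspring_probs = np.array([0.2, 0.5, 0.3])
phi = lambda z: np.sum(offspring_probs * z ** offspring_vals)   # phi(z) = E[z^xi]

def simulate_generations(n_gen, n_runs=50_000):
    """Monte Carlo sample of X(n) for a Galton-Watson process started from X(0) = 1."""
    X = np.ones(n_runs, dtype=np.int64)
    for _ in range(n_gen):
        X = np.array([rng.choice(offspring_vals, size=x, p=offspring_probs).sum()
                      for x in X])
    return X

z, n = 0.7, 3
X = simulate_generations(n)
empirical = np.mean(z ** X)

# Iterate F_{n+1}(z) = F_n(phi(z)) starting from F_0(z) = z.
F = z
for _ in range(n):
    F = phi(F)
print("Monte Carlo E[z^X(3)]:", empirical, "  recursion F_3(z):", F)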

In order to mimic this dynamics in the continuous time situation, we assume that there are X(t) blobs at time t. Each blob, independently of the others, waits for a time T that has a probability density f(t) and then replicates into N blobs. Usually, we assume f(t) to be the exponential density. The aim is to calculate the moment generating function F(t, z) = E(z^{X(t)}) of the number of blobs at time t. Assuming f to be exponential with parameter λ, we find that X(t + dt) will be the

The effect of quantum gravity on the motion of the virus globules: Since the size of the virus is on the nano/quantum scale, the effects of gravity on its motion will correspondingly be quantum mechanical effects. Consider the ADM action for quantum gravity obtained as the integral

S[qab, N, N^a] = ∫ L(qab, q′ab, N, N^a)d⁴x

where qab are the spatial components of the metric tensor in the embedded
3­dimensional space at each time t and N, N a appear in the Lagrangian but
not their time derivatives. The qab are six canonical position fields and the
corresponding canonical momentum fields are

P^ab = ∂L/∂q′ab

The ADM Hamiltonian is

H = ∫ (P^ab q′ab − L)d³x

and these turn out to be fourth degree polynomials in the momenta but highly
nonlinear in the position fields. Let eia denote a tetrad basis for the three
dimensional embedded manifold. Thus,

qab = eia eib

The metric relative to this tetrad basis is therefore the identity δij and the group
of transformations of this tetrad that preserve the metric is therefore SO(3)
which can be regarded as SU (2) using the adjoint map. Note that SU (2) is in
fact the covering group of SO(3). The spin connection involves representing the
spatial connection in the SU (2)­spin Lie algebra representation. The form of
this connection in terms of the spatial connection Γabc can be derived from the fact that the corresponding covariant derivative Db should annihilate the tetrad e^i_a. Thus, if Γ^i_{jb} denotes the spin connection, then

0 = Db e^i_a = e^i_{a,b} − Γ^c_{ba}e^i_c + Γ^i_{jb}e^j_a

or equivalently,

Γ^i_{jb} = −e^a_j(e^i_{a,b} − Γ^c_{ba}e^i_c) = −e^a_j e^i_{a:b}
where a double dot denotes the usual spatial covariant derivative. Equivalently,
we can lower the index i using the identity SO(3) metric and write

Γ_{ijb} = −e^a_j e_{ia:b} = e^a_{j:b}e_{ia} = e_{ja:b}e^a_i = −Γ_{jib}



Thus, the spin connection Γija is antisymmetric w.r.t (ij) and hence can be
expressed as
Γija = Γka(τk)ij
where τi , i = 1, 2, 3 is the standard basis for the Lie algebra of SO(3) which
satisfy the usual angular momentum commutation relations:

[τi, τj] = ε(ijk)τk

Note that (τi)jk = ε(ijk) up to a proportionality constant. We observe that the
spatial connection coefficients of the embedded 3­D manifold are

Γabc = (1/2)(qab,c + qac,b − qbc,a )

and the spatial curvature tensor is constructed out of these in the usual way.
Our aim is to develop the calculus of general relativity in a form that resembles
the calculus of the Yang­Mills non­Abelian gauge theories. This means that we
must introduce a non­Abelian SO(3) connection Γ = Γi τi so that the curvature
tensor has the representation

R = dΓ + Γ ∧ Γ = [d + Γ, d + Γ]

where Γ is an SU (2)­Lie algebra valued one form. Note that

qab,c = (e^i_a e^i_b)_{,c} = e^i_{a,c}e^i_b + e^i_a e^i_{b,c}

In fact, Cartan's equations of structure precisely formulate the expression for the curvature tensor in the language of non-Abelian gauge theory. We write

Γ^k = Γ^k_a dx^a

or equivalently,

Γ = Γ^k τk = Γ^k_a τk dx^a

and hence

R = dΓ + [Γ, Γ] = Γ^k_{a,b} τk dx^b ∧ dx^a + Γ^k_a Γ^l_b [τk, τl]dx^a ∧ dx^b
= (Γ^m_{a,b} − Γ^k_a Γ^l_b ε(klm))τm dx^b ∧ dx^a
= R^m_{ba} τm dx^b ∧ dx^a

where

R^m_{ba} = (1/2)(Γ^m_{a,b} − Γ^m_{b,a}) − Γ^k_a Γ^l_b ε(klm)

is the Riemann curvature tensor in the spin representation.


When the metric fluctuation components δqab, δN^a, δN are quantum fields in a coherent state of the gravitational field, we can write the corresponding fluctuations in the space-time components of the metric tensor along the following
lines:
g µν = q µν + N 2 nµ nν

where

q^{µν} = q^{ab}X^µ_{,a}X^ν_{,b}

T^µ = X^µ_{,0} = N^µ + N.n^µ = N^a X^µ_{,a} + N.n^µ

where N^a, N satisfy the orthogonality relations

g_{µν}(T^µ − N^a X^µ_{,a})X^ν_{,b} = 0,

g_{µν}n^µn^ν = 1

We can equivalently write

g^{µν} = q^{ab}X^µ_{,a}X^ν_{,b} + N²(X^µ_{,0} − N^a X^µ_{,a}).(X^ν_{,0} − N^b X^ν_{,b})

When the metric g^{µν} fluctuates, it results correspondingly in fluctuations of qab and N, N^a (these are ten in total, corresponding to the fact that
the original metric tensor g µν has ten components). The coordinate mapping
functions X µ (x) are assumed to be fixed and non­fluctuating in this formalism.
Thus, the fluctuations in the metric tensor g µν can be expressed in terms of the
corresponding fluctuations in qab , N a , N as

δg^{µν} = δq^{ab}X^µ_{,a}X^ν_{,b} + 2N.δN.(X^µ_{,0} − N^a X^µ_{,a}).(X^ν_{,0} − N^b X^ν_{,b}) − N²δN^a X^µ_{,a}(X^ν_{,0} − N^b X^ν_{,b}) − N²(X^µ_{,0} − N^a X^µ_{,a})X^ν_{,b}δN^b
Given the metric quantum fluctuations, the corresponding quantum fluctuations in the equation of motion of the virus particle will satisfy

d²δX^µ/dτ² + 2Γ^µ_{αβ}(X)(dδX^α/dτ)(dX^β/dτ) + δΓ^µ_{αβ}(X)(dX^α/dτ).(dX^β/dτ) + Γ^µ_{αβ,ρ}(X)δX^ρ(dX^α/dτ).(dX^β/dτ) = 0
The fluctuations in the particle position are quantum observables evolving in
time. According to the Heisenberg picture, these observables evolve dynamically
in time according to the above equation with the state of the gravitational
field being prescribed. So solving the above equation for δX µ (τ ) in terms of
the metric fluctuation field δgµν (X), we can in principle evaluate the mean,
correlation and higher order moments of the particle position field in the given
state of the gravitational field. Once the mean and correlations of the virus
position are known, we can calculate the probability dispersion of this position
at the first time that the mean position of this virus first hits a given cell. This
dispersion gives us an estimate of what range relative to its mean position will be
affected by the virus. We could also give a classical stochastic description of the

dynamics of the virus, but since the virus size is also of atomic dimensions,
such a quantum probabilistic description is more accurate.

A cruder model for the virus motion would be to express its trajectory as the sum of a classical and a quantum trajectory in accordance with the Newton-Heisenberg equation of motion, with a force that is the sum of a classical component and a purely quantum fluctuating component:

d2 /dt2 (r(t)) = f0 (r(t)) + δf (r(t))

Solving this differential equation using first order perturbation theory,

r(t) = r0 (t) + δr(t)

with r0 (t) a purely classical trajectory and δr(t) a quantum, ie, operator valued
trajectory, we get
d2 r0 (t)/dt2 = f0 (r0 (t)),
d²δr(t)/dt² = f0′(r0(t))δr(t) + δf(r0(t))
Here δf (r) is an operator valued field and f0 (r) is a classical c­number field.
Expressing the first order perturbed trajectory in state variable form gives us

dδr(t)/dt = δv(t),  dδv(t)/dt = f0′(r0(t))δr(t) + δf(r0(t))

or equivalently,

(d/dt)[δr(t); δv(t)] = [0, I; f0′(r0(t)), 0][δr(t); δv(t)] + [0; δf(r0(t))]
Now we can also include damping terms in this model by allowing the force to
be both position and velocity dependent:

d²(r0(t) + δr(t))/dt² = f0(r0(t) + δr(t), r0′(t) + δr′(t)) + δf(r0(t) + δr(t), r0′(t) + δr′(t))

which gives, using first order perturbation theory,

d²r0(t)/dt² = f0(r0(t), r0′(t)),

d²δr(t)/dt² = f0,1(r0(t), r0′(t))δr(t) + f0,2(r0(t), r0′(t))δr′(t) + δf(r0(t), r0′(t))
After introducing the state vector as above and solving the linearized equations using the usual state transition matrix obtained in terms of the Dyson series expansion, we get

δr(t) = ∫_0^t Φ(t, s)δf(r0(s), r0′(s))ds

where Φ(t, s) is a classical c-number function and δf(r, v) is a quantum operator valued function of phase space variables, ie, a quantum field on R^6.
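For completeness, the state transition matrix Φ(t, s) of the linearized system can be generated numerically by integrating dΦ/du = A(u)Φ with Φ(s, s) = I. The sketch below does this for an illustrative one-dimensional classical force f0 and reference trajectory r0 (both made up, not from the text):

import numpy as np
from scipy.integrate import solve_ivp

f0 = lambda r: -np.sin(r)                 # assumed classical force
df0 = lambda r: -np.cos(r)                # its derivative f0'
r0 = lambda t: 0.3 * np.cos(t)            # assumed classical reference trajectory

def A(t):
    # A(t) = [[0, 1], [f0'(r0(t)), 0]] for the 1-D state vector (dr, dv).
    return np.array([[0.0, 1.0],
                     [df0(r0(t)), 0.0]])

def transition_matrix(t, s):
    """Phi(t, s): integrate dPhi/du = A(u) Phi from u = s to u = t with Phi(s, s) = I."""
    def rhs(u, phi_flat):
        return (A(u) @ phi_flat.reshape(2, 2)).ravel()
    sol = solve_ivp(rhs, (s, t), np.eye(2).ravel(), rtol=1e-8, atol=1e-10)
    return sol.y[:, -1].reshape(2, 2)

print("Phi(2.0, 0.5) =\n", transition_matrix(2.0, 0.5))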
MATLAB simulation studies:
[1] Simulation of branching processes in discrete time.
[2] Simulation of blood flow in an electromagnetic field.
Continuous time branching processes: Let X(t) denote the number of particles at time t. Then X(t + dt) will equal X(t) + ξ(dt), where, conditioned on X(t), ξ(dt) takes the value r with probability X(t)λ(r)dt for r = 1, 2, ... and the value 0 with probability 1 − X(t)Σ_r λ(r)dt. This is because each particle has the probability λ(r)dt of giving birth to r offspring. This gives us the recursion

E(z^{X(t+dt)}) = Σ_{r≥1} E(z^{X(t)} z^r X(t))λ(r)dt + E((1 − X(t)Σ_r λ(r)dt)z^{X(t)})
So writing

F(t, z) = E(z^{X(t)})

we get

F(t + dt, z) = φ(z)z(∂F(t, z)/∂z)dt + F(t, z) − z(∂F(t, z)/∂z)φ(1)dt

where

φ(z) = Σ_r λ(r)z^r

and hence we derive the following differential equation for the evolution of the
generating function of the number of particles at time t:

∂F (t, z)/∂t = z(φ(z) − φ(1))∂F (t, z)/∂z
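A quick consistency check of this evolution equation (a Gillespie-type simulation sketch; the rates λ(1), λ(2) used below are made-up values): differentiating the PDE in z at z = 1 gives dE(X(t))/dt = (Σ_r rλ(r))E(X(t)), so the mean population should grow like exp(tΣ_r rλ(r)).

import numpy as np

rng = np.random.default_rng(3)

lam = {1: 0.4, 2: 0.1}                        # assumed per-particle rates lambda(r)
total_rate = sum(lam.values())
growth = sum(r * l for r, l in lam.items())   # sum_r r*lambda(r)

def simulate(T):
    """One path: each particle triggers, at rate lambda(r), an event adding r particles."""
    X, t = 1, 0.0
    while t < T:
        t += rng.exponential(1.0 / (X * total_rate))   # time to next birth event
        if t >= T:
            break
        r = rng.choice(list(lam.keys()), p=[l / total_rate for l in lam.values()])
        X += r
    return X

T = 3.0
samples = [simulate(T) for _ in range(20_000)]
print("simulated E[X(T)]                  :", np.mean(samples))
print("predicted exp(T*sum_r r*lambda(r)) :", np.exp(T * growth))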

3.2 More on quantum gravity


Let

E^a_i = √det(q) e^a_i,  q = ((qab))

Then,

det(E) = det((E^a_i)) = det(q)^{3/2}det((e^a_i)) = det(q)^{3/2}det(e)^{−1} = det(q)

Note that

qab = e^i_a e^i_b,  det(q) = det(e)²,  √det(q) = det(e) = det((e^i_a))

Db(E^a_i) = 0

follows from

Db e^i_a = 0

and the expression

E^a_i = det(e)e^a_i = ε(a1 a2 a3)e^1_{a1}e^2_{a2}e^3_{a3}e^a_i

Note also that, defining

F^i_a = ε(a a2 a3)ε(i i2 i3)e^{a2}_{i2}e^{a3}_{i3}

we get

e^a_i F^i_a = ε(a1 a2 a3)ε(i1 i2 i3)e^{a1}_{i1}e^{a2}_{i2}e^{a3}_{i3} = 6.det((e^a_i)) = 6.det(e)^{−1}

The factor of 6 comes because of the identity

ε(i1 i2 i3)ε(i1 i2 i3) = 6

More generally,

e^b_i F^i_a = ε(a a2 a3)ε(i1 i2 i3)e^b_{i1}e^{a2}_{i2}e^{a3}_{i3} = ε(a a2 a3)ε(b a2 a3)det(e)^{−1} = δ^b_a.6.det(e)^{−1}

Thus,

F^i_a = 6.det(e)^{−1}e^i_a

Taking the inverse gives

F^a_i = 6^{−1}det(e)e^a_i = 6^{−1}.E^a_i

or equivalently,

E^a_i = 6.F^a_i

We can derive another alternate expression for F^a_i, or equivalently for E^a_i. Let

G^a_i = ε(a a2 a3)ε(i i2 i3)e^{i2}_{a2}e^{i3}_{a3}

Then,

e^i_b G^a_i = ε(a a2 a3)ε(b a2 a3)det(e) = 6δ^a_b.det(e)

Thus

G^a_i = 6.e^a_i det(e) = 6.E^a_i
In particular,

Da E^a_i = 0

or equivalently,

(√q e^a_i)_{,a} − Γ^j_{ia}E^a_j = 0

Note that this is the same as

√q(e^a_{i:a} − Γ^j_{ia}e^a_j) = 0

which is the same as

√q Da e^a_i = 0

ie,

Da e^a_i = 0

Substituting for the spin connection

Γ^j_{ia} = Γ_{jia} = Γ_{ak}(τk)_{ji} = Γ_{ka}ε(kji)

we get

E^a_{i,a} − Γ_{ak}ε(kji)E^a_j = 0 −−− (1)

In other words, the covariant derivative of E^a_i w.r.t. the spin connection Γ_{ka} vanishes. This result is of fundamental importance because it states that these position fields E^a_i, which are called the Ashtekar position fields, are covariantly flat w.r.t. the spin connection. We wish to express the SO(3) spin connection Γ_{ka} in terms of the Ashtekar canonical position fields E^a_i. To this end, we note that (1) can be expressed as

Γ_{ka}ε(kji) =

Γ_{abc} = (1/2)((e_{ia}e_{ib})_{,c} + (e_{ai}e_{ci})_{,b} − (e_{ci}e_{bi})_{,a})

3.3 Large deviation properties of virus combat with antibodies
A virus is a small spherical blob around which the blood fluid moves. Let n(t, r)
denote the concentration in terms of number of virus blobs per unit volume and
let g(t, r) denote the net rate of virus generation per unit volume. The virus
gradient current density is given by

Jv (t, r) = −D∇n(t, r)

and hence the conservation law for the virus number is given by

divJv + ∂t n = g

or equivalently,
∂t n(t, r) − D∇2 n(t, r) = g(t, r)

Now let us also take into account the drift current caused by the presence of charge on the virus body interacting with the external electric field. Let v(t, r) denote the velocity of the fluid in which the virus particles are immersed. Then,
if E(t, r) is the electric field and µ the virus mobility, we have that the total
drift plus diffusion current density is given by
J = nµE − D∇n
and its conservation law is given by
divJ + ∂t n = g
or equivalently,
∂t n − D∇2 n + µn·divE + µE·∇n = g
Let q denote the charge carried by a single virus particle. Then the virus charge
density is given by
ρ = qn
and by Gauss' law,
divE = ρ/ε
so, retaining the µn·divE term and neglecting the smaller µE·∇n term, our equation of continuity for the virus particles is given by
∂t n − D∇2 n + (µq/ε)n2 = g
This is the pde satisfied by the virus number density. If g(t, r) is the sum of a
random Gaussian field and a random Poisson field both of small amplitude, then
the problem is to calculate the rate functional of the field n(t, r). Using this
rate functional, we can predict the probability of the rare event that the virus
concentration in the fluid will exceed the critical concentration level defined by
the solution to the above pde without the generation term by an amount more
than a given threshold. Note that the mobility/drift term for the current is
obtained as follows: let v(t) denote the velocity of a given virus particle and m
its mass, γ the viscous damping coefficient. Then, the equation of motion of the
virus particle is given by
mv′(t) = −γv(t) + qE(t)
and at equilibrium, v′ = 0 so that
v(t) = qE(t)/γ = µE(t), µ = q/γ
Now let us consider the motion of the virus particle taking into account random
fluctuating forces. Its equation of motion is
mdv(t) = −γv(t)dt + qE(t)dt + σ·dB(t)
where B is standard three dimensional Brownian motion. The solution to this
stochastic differential equation is given by
v(t) = ∫_0^t exp(−γ(t − s)/m)(q/m)E(s)ds + (σ/m)∫_0^t exp(−γ(t − s)/m)dB(s)

The corresponding total drift current density is given by


Jd (t, r) = nv = n(t, r)(∫_0^t exp(−γ(t − s)/m)(q/m)E(s, r)ds + (σ/m)∫_0^t exp(−γ(t − s)/m)dB(s, r))
where now B(t, r) is a Gaussian random field, ie, a zero mean Gaussian field
having correlations
E(B(t, r).B(s, r' )) = min(t, s)RB (r, r' )

Note that the pdf p(t, v) of v(t), ie, of a single virus particle satisfies the Fokker­
Planck equation

∂t p(t, v) = (γ/m)divv (vp(t, v)) − (q/m)E(t)·∇v p(t, v) + (σ 2 /2m2 )∇2v p(t, v)

p(t, v)d3 v gives us the probability that the virus particle will have a velocity in
the range d3 v around v. n(t, r)p(t, v)d3 rd3 v is the number of particles having
position in the range d3 r around r and having a velocity in the range of d3 v
around v.
MATLAB simulations for virus dynamics based on a stochastic differential
equation model:
[a] Simulation of blood velocity flow: The heart pumping force field per unit
volume can be represented as
f (t, r) = f0 (t)K(r)
where f0 (t) is a periodic function and K(r) is a point spread function that is
a smeared version of the Dirac δ­function at the point r0 , the position of the
heart. The equation of motion of a virus particle in this force field is given by
dr(t) = v(t)dt, mdv(t) = −γv(t)dt + f (t, r(t))dt + σdB(t) − −(1)
The corresponding pdf is p(t, r, v) in position space. If n0 is the total number
of virus particles in the fluid, then n0 p(t, r, v)d3 rd3 v equals the total number of
virus particles in the position range d3 r and having velocity in the range d3 v
centred around (r, v). It satisfies the Fokker­Planck equation

∂t p(t, r, v) = −v·∇r p(t, r, v) − (f (t, r)/m)·∇v p(t, r, v) + (γ/m)divv (v·p(t, r, v)) + (σ 2 /2m2 )∇2v p(t, r, v) − − − (2)


Equations (1) and (2) can be discretized respectively in time and in space-time and simulated. For example, simulation of (1) will be based on the discrete time model
r(t + δ) = r(t) + δ·v(t),
v(t + δ) = v(t) − (γδ/m)v(t) + (f (t, r(t))/m)δ + (σ/m)√δ·W (t)
where W (t), t = 1, 2, ... are iid standard normal R3-valued random vectors.
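A minimal Python version of this discretization is sketched below; the pumping amplitude f0 (t), the smearing kernel K(r), the heart position and all numerical parameter values are assumptions made only for illustration.

import numpy as np

m, gamma, sigma, delta, nsteps = 1.0, 2.0, 0.5, 1e-3, 5000
r_heart = np.array([0.0, 0.0, 0.0])

def f(t, r):
    """Heart pumping force f(t,r) = f0(t) K(r): assumed periodic amplitude times a Gaussian smear."""
    f0 = np.sin(2 * np.pi * t)
    K = np.exp(-np.sum((r - r_heart) ** 2) / 0.1)
    return f0 * K * np.array([1.0, 0.0, 0.0])        # assumed pumping direction

rng = np.random.default_rng(1)
r = np.array([0.5, 0.0, 0.0])
v = np.zeros(3)
for k in range(nsteps):
    t = k * delta
    W = rng.standard_normal(3)                       # iid N(0, I) vectors W(t)
    r = r + delta * v
    v = v + (-(gamma / m) * v + f(t, r) / m) * delta + (sigma / m) * np.sqrt(delta) * W
print("final position:", r)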

Simulation of blood flow in the body: The force exerted by the heart is
localized around the position of the heart and as such this force density field
can be represented as
f (t, r) = f0 (t)K(r)
where f0 (t) is a periodic function of time and K(r) is a smeared version of
the Dirac δ­function centred around the heart’s position. The fluid­dynamical
equations to be simulated are

ρ(∂t v(t, r) + (v, ∇)v(t, r)) = −∇p + f + η∇2 v

div(ρ.v) + ∂t ρ = g
where ρ(t, r) is the virus density and g(t, r) is the rate of generation of virus
mass per unit volume. Assuming an equation of state

p = F (ρ)

we get a set of four pde's for the four fields ρ, v by replacing ∇p with F ′(ρ)∇ρ.
Standard CFD (Computational fluid dynamics) methods can be used to dis­
cretize this dynamics and solve it in MATLAB.
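As a rough illustration of such a discretization (in one space dimension rather than three, with an assumed equation of state p = F (ρ) = c2 ρ, an assumed localized periodic forcing and periodic boundary conditions), one can sketch the explicit finite difference scheme below in Python; a production CFD solver would of course use a more careful scheme.

import numpy as np

# 1-D toy version of: rho_t + (rho v)_x = g,  rho(v_t + v v_x) = -F'(rho) rho_x + f + eta v_xx
N, L, dt, nsteps = 200, 1.0, 2e-5, 5000
dx = L / N
x = np.arange(N) * dx
eta, c2 = 0.01, 1.0          # viscosity and assumed equation of state p = c2 * rho

rho = np.ones(N)
v = np.zeros(N)
def ddx(u):                  # centred difference, periodic boundary
    return (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
def d2dx2(u):
    return (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx ** 2

for k in range(nsteps):
    t = k * dt
    f = np.sin(2 * np.pi * t) * np.exp(-((x - 0.5) ** 2) / 0.01)  # assumed forcing f0(t)K(x)
    g = 0.0                                                       # no virus mass generation
    rho_new = rho + dt * (g - ddx(rho * v))
    v_new = v + dt * (-v * ddx(v) - c2 * ddx(rho) / rho + f / rho + eta * d2dx2(v) / rho)
    rho, v = rho_new, v_new
print("total mass:", rho.sum() * dx, " max |v|:", np.abs(v).max())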

The spread of virus from a body within a room: Let C(t) denote the concen­
tration of the virus at a certain point within the room. It increases due to the
presence of a living organism carrying the virus and decreases due to the presence of ventilation, medicine and other such effects. Thus, the differential equation
for its dynamics can be expressed in the form

dC(t)/dt = I(t) − β(X(t))C(t)

where I(t) denotes the rate of increase due to the external living organism and
X(t) denotes the other factors like medicine, ventilation etc. Formally, we can
solve this equation to give
C(t) = Φ(t, 0)C(0) + ∫_0^t Φ(t, s)I(s)ds

where
Φ(t, s) = exp(−∫_s^t β(X(τ ))dτ )
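The formula above is easy to check numerically. The sketch below (Python, with assumed forms for I(t) and β(X(t))) integrates dC/dt = I(t) − β(X(t))C(t) by the Euler method and compares the result with the closed form expression in terms of Φ(t, s).

import numpy as np

# Assumed inputs: I(t) a periodic source, betaX(t) standing for beta(X(t))
I = lambda t: 1.0 + 0.5 * np.sin(t)
betaX = lambda t: 0.8 + 0.3 * np.cos(0.5 * t)

T, n, C0 = 10.0, 200000, 2.0
dt = T / n
ts = np.linspace(0.0, T, n + 1)

# Direct Euler integration of dC/dt = I(t) - beta(X(t)) C(t)
C = C0
for t in ts[:-1]:
    C += dt * (I(t) - betaX(t) * C)

# Closed form: C(T) = Phi(T,0) C0 + int_0^T Phi(T,s) I(s) ds, Phi(t,s) = exp(-int_s^t beta)
B = np.concatenate(([0.0], np.cumsum(betaX(ts[:-1]) * dt)))   # B(t) = int_0^t beta
Phi_T_s = np.exp(-(B[-1] - B))
closed = Phi_T_s[0] * C0 + np.sum((Phi_T_s * I(ts))[:-1]) * dt
print("Euler:", C, " closed form:", closed)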
This dynamical equation can be generalized in two directions, one, by including
a spatial dependence of the virus concentration and describing the dynamics
using a partial differential equation, and two, by including stochastic terms.
Let C(t, r) denote the virus concentration in the room at the spatial location r
and at time t. The virus diffuses from a region of higher concentration to a region
of lower concentration in such a way that if D denotes the diffusion coefficient,
the virus diffusion current density, ie, number of virus particles flowing per unit
area per unit time at (t, r) is given by

J(t, r) = −D·∇C(t, r)

Let X(t, r) denote the external factors at the point r at time t. The effect of the external factors at a point r′ on the decrease in the virus concentration at the point r at time t can be described via an "influence function" F (r, r′) which usually decreases with increasing spatial distance |r − r′|. So we can, to a good degree of approximation, describe the concentration dynamics as
the concentration dynamics as

∂t C(t, r) = I(t, r) − divJ(t, r) − ∫ F (r, r′)β(X(t, r′))C(t, r′)d3 r′
= I(t, r) + D∇2 C(t, r) − ∫ F (r, r′)β(X(t, r′))C(t, r′)d3 r′

In the particular case, if


F (r, r′) = δ 3 (r − r′)
so that the external agents influence the decrease in virus only at the point
where they are located, the above equation simplifies to

∂t C(t, r) = I(t, r) + D∇2 C(t, r) − β(X(t, r))C(t, r)

This equation can be solved easily using Fourier transform methods. More
generally, we can consider a situation when the influence of external agents
located at different points r1 , ..., rn takes place via nonlinear interactions so
Volterra interaction terms are added to the dynamics to give
∂t C(t, r) = I(t, r) + D∇2 C(t, r) − Σn≥1 ∫ Fn (r, r1 , ..., rn )β(X(t, r1 ), ..., X(t, rn ))C(t, r1 )...C(t, rn )d3 r1 ...d3 rn

Of course, this equation does not describe memory in the action of external
agents. To take into account such memory effects, we further modify it to

∂t C(t, r) = I(t, r) + D∇2 C(t, r)
− Σn≥1 ∫_{τj ≥0, j=1,...,n} Fn (r, r1 , ..., rn )β(τ1 , ..., τn , X(t − τ1 , r1 ), ..., X(t − τn , rn ))·C(t − τ1 , r1 )...C(t − τn , rn )d3 r1 ...d3 rn dτ1 ...dτn


We could also consider a model which takes the delay in the effect of external
agents to be caused on account of the finite propagation velocity for the influence
of virus at the point r′ to take place at r. Virus at r′ has a concentration C(t, r′) at time t and its effect propagates at a speed v so that its effect at time t at the point r will be described by a term C(t − |r − r′|/v, r′). Our model then assumes the form
∂t C(t, r) = I(t, r) + D∇2 C(t, r)

− Σn≥1 ∫ Fn (r, r1 , ..., rn )β(r1 , ..., rn , X(t − |r − r1 |/v, r1 ), ..., X(t − |r − rn |/v, rn ))·C(t − |r − r1 |/v, r1 )...C(t − |r − rn |/v, rn )d3 r1 ...d3 rn

Remark: Consider the wave equation pde with source and nonlinear terms:
∂t2 F (t, r) − c2 ∇2 F (t, r) + δ·∫ H(τ, r, F (t − τ, r′))dτ d3 r′ = s(t, r)

Solving this using first order perturbation theory gives

F (t, r) = F0 (t, r) + δ.F1 (t, r) + O(δ 2 )

where
∂t2 F0 (t, r) − c2 ∇2 F0 (t, r) = s(t, r),
∂t2 F1 (t, r) − c2 ∇2 F1 (t, r) = −∫ H(τ, r, F0 (t − τ, r′))dτ d3 r′

The causal solution to this can be expressed in terms of retarded potentials:


F0 (t, r) = −∫ s(t − |r − r′|/c, r′)d3 r′/4π,
F1 (t, r) = (4π)−1 ∫ H(τ, r′, F0 (t − |r − r′|/c − τ, r′′))dτ d3 r′ d3 r′′

It is clear from this expression therefore that delay terms coming from the finite
speed of propagation of the virus effects can be described using second order
partial differential equations in space­time like the wave equation with source
terms.

Large deviation problems: Express the above pde as

∂t C(t, r) = L(C(t, r)) + s(t, r) + ε·w(t, r)

where L is a nonlinear spatio­temporal integro­differential operator and w is the


noise field. Note that the operator L depends upon the external medicine/ventilation
etc. fields X(t, r). So we can write L = L(X). The problem is to design these
external fields X so that the probability of the concentration field exceeding a
given threshold at some point in the room is a minimum. This probability can
for small ε be described using the LDP rate function.

3.4 Large deviations problems in quantum fluid dynamics
The Navier­Stokes equation for an incompressible fluid can be expressed as

∂t v(t, r) = L(v)(t, r) + f (t, r) − ρ−1 ∇p(t, r)

div(v) = 0
where L is the nonlinear Navier­Stokes vector operator:

L(v) = −(v, ∇)v + ηρ−1 ∇2 v

We can write
v = curlψ, divψ = 0, Ω = curlv = −∇2 ψ
and then get

∂t Ω + curl(Ω × v) = ηρ−1 ∇2 Ω + g(t, r), g = curlf

Let G denote the Green’s function of the Laplacian. Then

∂t ψ = −G·curl(∇2 ψ × curlψ) + ηρ−1 ∇2 ψ − G·g

Formally, we can express this as

∂t ψ = F (ψ) + h

where now F is a nonlinear functional of the entire spatial vector field ψ and
h = −G.g = −G.curlf is the modified force field. Here F (ψ) is a functional of
ψ at different spatial points but at the same time t which is why after spatial
discretization, it can be regarded as a nonlinear state variable system. Ideally
speaking, we should therefore write

∂t ψ(t, r) = F (r, ψ(t, r′), r′ ∈ D) + h(t, r)

where D is the region of three dimensional space over which the fluid flows.
Note that
F (ψ) = −G·curl(∇2 ψ × curlψ) + ηρ−1 ∇2 ψ
Quantization: Introduce an auxiliary field φ(t, r) and define an action functional
S[ψ, φ] = ∫ φ·∂t ψ d3 rdt − ∫ φ·F (ψ)d3 rdt − ∫ φ·h d3 rdt

Note that
∫ φ·∂t ψ d3 rdt = (1/2)∫ (φ·∂t ψ − ψ·∂t φ)d3 rdt
∫ φ·F (ψ)d3 rdt = −∫ (Gφ)·curl(∇2 ψ × curlψ)d3 rdt

= ∫ curl(Gφ)·(∇2 ψ × curlψ)d3 rdt

Quantization: The canonical momentum fields are

Pψ = δS/δ∂t ψ = φ/2,

Pφ = δS/δ∂t φ = −ψ/2
Then the Hamiltonian is given by
H = ∫ (Pψ ·∂t ψ + Pφ ·∂t φ)d3 r − L = ∫ φ·F (ψ)d3 r + ∫ φ·h d3 r
where L = ∫ (φ·∂t ψ − φ·F (ψ) − φ·h)d3 r is the Lagrangian.

Surprisingly, no kinetic term, ie terms involving the canonical momenta appear


here. The Hamiltonian equations are then

dPψ /dt = −δH/δψ

dψ/dt = δH/δPψ = 0
and likewise for φ. Note that these equations imply that ψ(t, r) = ψ(r) is time
independent and hence Pψ varies linearly with time.

Second quantization of the Schrodinger equation: Let

L = i(ψ ∗ ∂t ψ − ψ∂t ψ ∗ ) − a(∇ψ ∗ )(∇ψ)

be the Lagrangian density. Assume that ψ and ψ ∗ are independent canonical


position fields. Then, the corresponding canonical momentum fields are

Pψ = ∂L/∂∂t ψ = iψ ∗ ,

Pψ∗ = ∂L/∂∂t ψ ∗ = −iψ


The Hamiltonian density corresponding to L is then

H = Pψ ·∂t ψ + Pψ∗ ·∂t ψ ∗ − L = a∇ψ ∗ ·∇ψ
and hence the second quantized Hamiltonian density of the free Schrodinger
particle is
H = ∫ H d3 x = a ∫ ∇ψ ∗ ·∇ψ d3 x = −a ∫ ψ ∗ ∇2 ψ d3 x

where of course, to get agreement with experiment,

a = h2 /2m

where h is Planck’s constant divided by 2π. In the presence of an external


potential, this Hamiltonian becomes

H = ∫ (−aψ ∗ ∇2 ψ + V (t, r)ψ ∗ ψ)d3 r

The canonical commutation relations are

[ψ(t, r), Pψ (t, r′)] = ihδ 3 (r − r′)

or equivalently,
[ψ(t, r), ψ ∗ (t, r′)] = hδ 3 (r − r′)
We expand the wave operator field ψ(t, r) in terms of the stationary state energy
functions of the first quantized Schrodinger Hamiltonian operator

−a∇2 + V

to get
ψ(t, r) = Σn a(n)un (r)exp(−iω(n)t), ω(n) = E(n)/h

so that the above commutation relations give

Σn,m [a(n), a(m)∗ ]un (r)um (r′)∗ exp(−i(ω(n) − ω(m))t) = hδ 3 (r − r′)

This implies that


[a(n), a(m)∗ ] = hδn,m
since
Σn un (r)un (r′)∗ = δ 3 (r − r′)

Actually, since the electronic field is Fermionic, the above commutation relations
should be replaced with anticommutation relations and hence all the above
commutators should actually be anticommutators. This picture does not show
the existence of the positron which is because the Schrodinger equation is a
non­relativistic equation. If we use the Dirac relativistic wave equation, then
positrons automatically appear.

Large deviations in quantum fluid dynamics: Let ψ(t, r) be an operator


valued function of time with f and hence g, h being classical random fields.
Then solving
∂t ψ(t, r) = F (r, ψ(t, r′), r′ ∈ D) + h(t, r)
for ψ(t, r), we obtain ψ(t, r) in terms of its initial values ψ(0, r) and the clas­
sical source h. Note that ψ(0, r) is an operator valued function of the spatial
coordinates. This solution can be obtained perturbatively. Hence, formally, we
can write
ψ(t, r) = Q(ψ(0, .), h(s, .), s ≤ t)

Assuming h to be a ”small” classical random field, we can ask about the statistics
of ψ(t, r) in a given quantum state ρ, ie for things like the quantum moments
T r(ρ.ψ(t1 , r1 )...ψ(tN , rN ))
This will be a functional of h(s, .), s ≤ max(t1 , .., tN ) and hence this moment
will be a classical random variable and we can ask about the large deviation
properties of this moment after replacing h by εh with ε → 0.

3.5 Lecture schedule for statistical signal processing
[1] Revision of the basic concepts in linear algebra and probability theory.
[a] Probability spaces, random variables, probability distributions, charac­
teristic functions, independence, convergence of random variables, emphasis on
weak convergence of random variables in a metric space.
[2] Basic Bayesian detection and estimation theory.
[3] Various kinds of optimal estimators for parameters: Maximum likeli­
hood estimator for random and non­random parameters, maximum aposteriori
estimator for random parameters, nonlinear minimum mean square estimator
as a conditional expectation, best linear minimum mean square estimator for
random parameters, linear minimum unbiased minimum variance estimators of
parameters in linear models, expanding the nonlinear mmse as a Taylor se­
ries. Emphasis on the orthogonality principle for optimal minimum linear and
nonlinear mean square estimators of random parameters.
[4] Estimating parameters in linear and nonlinear time series models: AR,
MA, ARMA models.
[5] Recursive estimation of parameters in linear models, the RLS algorithm.
[6] Estimation of random signals in noise: The orthogonality principle, the
FIR Wiener filter, the optimal noncausal Wiener filter—derivation based on
spectral factorization method of Wiener and on the innovations method of Kol­
mogorov.
[7] Linear prediction theory. The infinite order Wiener predictor. Kol­
mogorov’s formula for the minimum prediction error energy in terms of the
power spectral density of the process. Finite order forward and backward linear
prediction of stationary processes. The Levinson­Durbin algorithm and lattice
filters. Significance of the reflection coefficients from the viewpoint of stability.
Lattice filter realizations. Lattice to ladder filter conversion and vice­versa.
[8] An introduction to Brownian motion and stochastic differential equations.
Application to real time nonlinear filtering theory. The Kalman­Bucy filter in
continuous and discrete time. The Kushner­Kallianpur nonlinear filter and the
extended Kalman filter. The disturbance observer in extended Kalman filtering.
Applications of the extended Kalman filter to robotics.
[9] Mean and variance propagation in stochastic differential equations for
nonlinear systems.

[10] Stochastic differential equations driven by Poisson fields.


[11] Estimating the frequencies of harmonic signals using time series methods
and using high resolution eigensubspace estimators like the Pisarenko harmonic
decomposition, MUSIC and ESPRIT algorithms.
[12] Estimating bispectra and polyspectra of harmonic processes using mul­
tidimensional versions of the MUSIC and ESPRIT algorithm.
[13] An introduction to large deviation theory for evaluating the effects of
low amplitude noise on the probabilities of deviation of the system from the
stability zone.
[a] Large deviations for iid normal random variables.
[b] Cramer’s theorem for empirical means of independent random variables.
[c] Sanov’s theorem for empirical distributions of independent random vari­
ables.
[d] Large deviations problems in binary hypothesis testing. Evaluation of
the asymptotic miss probability rate given that the false alarm probability rate
is arbitrarily close to zero using the optimal Neyman­Pearson test.
[e] The Gibbs conditioning principle for noninteracting and interacting par­
ticles:Given that the empirical distribution for iid random variables satisfies
certain energy constraints corresponding to noninteracting and interacting po­
tentials, evaluate the most probable value assumed by this empirical distribu­
tion using Sanov’s relative entropy formula for the probability distribution of
the empirical distribution.
[f] Diffusion exit from a domain. Calculate the mean exit time of a diffusion
process from a boundary given that the driving white Gaussian noise is weak.

[14] Large deviations for Poisson process driven stochastic differential equa­
tions. Let N (t), t ≥ 0 be a Poisson process with rate λ. Consider the scaled
process Nε (t) = ε·N (t/ε). Then Nε is a Poisson process whose jumps are in
units of ε and whose rate is λ/ε. Consider the stochastic differential equation
dX(t) = F (X(t))dt + dNε (t), t ≥ 0
The rate function of the process Nε , ε → 0 is evaluated as follows:
Λ(f ) = ε·log E[exp(ε−1 ∫_0^T f (t)dNε (t))] = λ·∫_0^T (exp(f (t)) − 1)dt
so the corresponding rate function is
IN (X) = supf (∫_0^T f (t)X′(t)dt − λ·∫_0^T (exp(f (t)) − 1)dt)

and this can be expressed in the form


IN (X) = ∫_0^T J(X′(t))dt

where
J(u) = supv (uv − λ.(exp(v) − 1))
and then the rate function of the solution X(t) to the sde over the time interval
[0, T ] is given by
IX (X) = ∫_0^T J(X′(t) − F (X(t)))dt
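For this particular J, the supremum can be computed explicitly: the maximizing v is log(u/λ), giving J(u) = u·log(u/λ) − u + λ for u > 0. The short Python sketch below checks this closed form against a brute force evaluation of the Legendre transform, for an assumed value of λ.

import numpy as np

lam = 2.0   # Poisson rate (assumed value)

def J_numeric(u, vgrid=np.linspace(-20, 20, 400001)):
    """J(u) = sup_v (u v - lam (e^v - 1)) by brute force over a grid of v values."""
    return np.max(u * vgrid - lam * (np.exp(vgrid) - 1.0))

def J_closed(u):
    """Closed form of the Legendre transform for u > 0."""
    return u * np.log(u / lam) - u + lam

for u in [0.5, 1.0, 2.0, 5.0]:
    print(u, J_numeric(u), J_closed(u))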
More generally, consider a spatial Poisson field N (t, dξ) with a rate measure of
λ.dF (ξ), ie
E(N (t, E)) = λ.tF (E)
where E takes values in the Borel sigma field of Rn . X(t) is assumed to satisfy
the sde
dX(t) = F (X(t))dt + ∫ g(X(t), u(t), ξ)N (dt, dξ)

where u(t) is an input signal field. Now scale the Poisson in both time and
amplitude to obtain the following sde satisfied by X(t):

dXε (t) = F (Xε (t))dt + ε ∫ g(Xε (t), u(t), ξ)·N (dt/ε, dξ), t ≥ 0

We require to calculate the rate function of the process Xε over the time interval
[0, T ]. To this end, we consider first a more general situation: Let N (dξ) be
a spatial Poisson field on Rn which we write formally as N (dξ1 , ..., dξn ). We
assume a constant rate field measure, ie,

E(N (dξ1 , ..., dξn )) = λ.dξ1 ...dξn

Consider a different scaling along the different axes:

Nδ1 ...δn (dξ1 , ..., dξn ) = δ1 ...δn N (dξ1 /δ1 , ..., dξn /δn )

Then,
Mδ1 ...δn (f ) = E exp(∫ f (ξ1 , ..., ξn )Nδ1 ...δn (dξ1 , ..., dξn ))
= exp(λ ∫ (exp(δ1 ...δn ·f (δ1 ξ1 , ..., δn ξn )) − 1)dξ1 ...dξn )
= exp(λ(δ1 ...δn )−1 ∫ (exp(δ1 ...δn ·f (ξ1 , ..., ξn )) − 1)dξ1 ...dξn )

from which we get for the scaled logarithmic moment generating function

δ1 ...δn ·log(Mδ1 ...δn ((δ1 ...δn )−1 f )) = λ ∫ (exp(f (ξ1 , ..., ξn )) − 1)dξ1 ...dξn

so the rate function of this scaled Poisson field is given by


I(ν) = supf (∫ f (ξ1 , ..., ξn )dν(ξ1 , ..., ξn ) − λ ∫ (exp(f (ξ1 , ..., ξn )) − 1)dξ1 ...dξn )

3.6 Basic concepts in group theory and group representation theory
[1] Show that SL(2, C) and SU (2, C) are simply connected.
Hint: SU (2, C) is topologically isomorphic to S 3 via the representation
g = [ α β ; −β̄ ᾱ ]

where α, β ∈ C are arbitrary subject to the only constraint |α|2 + |β|2 = 1. S 3 is


simply connected, ie, any closed loop in S 3 can be deformed continuously to any
single point. Note that S 2 is also simply connected but S 2 without the north
pole is not simply connected because a loop around the north pole cannot be
deformed into a point within this loop continuously without allowing the loop
to intersect the north pole at some stage.
To see that G = SL(2, C) is simply connected, simply use the Iwasawa de­
composition G = KAN with K Maximally compact = SU (2, C) and A Abelian
diagonal, N unipotent upper triangular and hence topologically, G is isomor­
phic to SU (2, C)xR4 whose fundamental group is the same as that of SU (2, C)
which is zero since SU (2, C) is simply connected.
[2] Show that the centre of the covering group of G = SL(2, R) is isomorphic
to Z the additive group of integers. For again by the Iwasawa decomposition,
G = KAN where now K = SO(2) and hence, G is topologically isomorphic to
SO(2)xR2 whose fundamental group is that of SO(2) which is Z.
(Problem taken from V.S.Varadarajan, ”Lie groups, Lie algebras and their
representations”, Springer).
Problem: Let G be a real analytic group and let π be a finite dimensional ir­
reducible representation of G in a vector space V . Let B = {v1 , ..., vn } be a basis
for V and let π denote the representation π in the basis B, ie, we use the same
notation for the representation and its matrix representation in the basis B. Let
F denote the vector space of all functions on G spanned by πij (g), 1 ≤ i, j ≤ n.
Show that dimF = n2 , ie, the πij are all n2 linear independent functions on G.
When G is compact, this result follows from the Schur orthogonality relations.
How does one do this problem for any real analytic group ?
Problem: Deduce from the above result that the set of all finite linear combinations Σg∈G f (g)π(g), with f a complex function on G vanishing at all but a finite number of points, equals the complete matrix algebra Mn (C).
Let A be the algebra of all finite linear combinations of the π(g), g ∈ G.
This is the group algebra corresponding to the representation π of G. Note
that we are assuming that π is an irreducible representation of G. Note that
A is a subalgebra of the full matrix algebra Mn (C). We wish to prove that
A = Mn (C). Suppose T commutes with A, ie
T ∈ Mn (C), T π(g) = π(g)T ∀g ∈ G
Then R(T ) and N (T ) are π(G) − invariant, ie, A invariant subspaces. By
irreducibility of π, it then follows that R(T ) = V and N (T ) = 0, ie, T is non­

singular. Now let c be an eigenvalue of T . Then N (T − c) is a π­invariant


subspace and hence by irreducibility of π, N (T − c) = Cn . Hence, T = cI.
This shows that the commutant of A consists precisely of all scalar multiples of the identity and hence, by the double commutant theorem, A is the full matrix algebra Mn (C). Now we are in a position to prove that the functions πij (g), 1 ≤ i, j ≤ n form a linearly independent set. Suppose not. Then Σi,j c(i, j)πij (g) = 0 ∀g ∈ G where some c(i, j) is non-zero.
This equation can be expressed as

T r(Cπ(g)) = 0∀g ∈ G

and hence since the set of linear combinations of π(g), g ∈ G equals Mn (C), it
follows that
T r(CX) = 0∀X ∈ Mn (C)
from which we deduce that C = 0. The claim is completely proved.
Note: This problem was taken from V.S.Varadarajan, ”Lie groups, Lie alge­
bras and their representations”, Springer.

3.7 Some remarks on quantum gravity using the ADM action
Express
K_{ab} = X^µ_{,a} X^ν_{,b} ∇_µ n_ν
in terms of q_{ab}, q_{ab,c}, q_{ab,0}.

∇_µ n_ν = n_{ν,µ} − Γ^σ_{µν} n_σ

X^µ_{,a} X^ν_{,b} n_{ν,µ} = X^ν_{,b} n_{ν,a} = −n_ν X^ν_{,ab}
since
n_ν X^ν_{,b} = 0

X^µ_{,a} X^ν_{,b} Γ^σ_{µν} n_σ = X^µ_{,a} X^ν_{,b} Γ_{σµν} n^σ = (1/2) X^µ_{,a} X^ν_{,b} (q_{σµ,ν} + q_{σν,µ} − q_{µν,σ}) n^σ

X^µ_{,a} X^ν_{,b} q_{σµ,ν} = X^µ_{,a} q_{σµ,b} = q_{σa,b} − X^µ_{,ab} q_{σµ}
Thus,
X^µ_{,a} X^ν_{,b} q_{σµ,ν} n^σ = q_{σa,b} n^σ
= q_{σa,b} (X^σ_{,0} − N^c X^σ_{,c})
= q_{σa,b} X^σ_{,0} − N^c (q_{ca,b} − q_{σa} X^σ_{,bc})

Likewise,
X^µ_{,a} X^ν_{,b} q_{σν,µ} n^σ = q_{σb,a} n^σ

X^µ_{,a} X^ν_{,b} q_{µν,σ} n^σ = X^µ_{,a} X^ν_{,b} q_{µν,σ} (X^σ_{,0} − N^c X^σ_{,c})
= X^µ_{,a} X^ν_{,b} q_{µν,0} − N^c X^µ_{,a} X^ν_{,b} q_{µν,c}
= q_{ab,0} − q_{µν} (X^µ_{,a} X^ν_{,b})_{,0} − N^c (q_{ab,c} − q_{µν} (X^µ_{,a} X^ν_{,b})_{,c})
= q_{ab,0} − q_{aν} X^ν_{,b0} − q_{µb} X^µ_{,a0} − N^c q_{ab,c} + N^c q_{aν} X^ν_{,bc} + N^c q_{µb} X^µ_{,ac}

Now
q_{aν} X^ν_{,b0} = q_{ab,0} − q_{aν,0} X^ν_{,b}
q_{µb} X^µ_{,a0} = q_{ab,0} − q_{µb,0} X^µ_{,a}

Alternately,
q_{aν} X^ν_{,b0} = q_{a0,b} − q_{aν,b} X^ν_{,0}
q_{µb} X^µ_{,a0} = q_{b0,a} − q_{µb,a} X^µ_{,0}

Now,
g_{µν} (X^µ_{,0} − N^a X^µ_{,a}) X^ν_{,b} = 0
gives
q_{0b} = N^a q_{ab}
So we can substitute this expression for qa0 in the above equations.

Hamilton­Jacobi equation and large deviations:

3.8 Diffusion exit problem


Consider the diffusion exit problem in which the process X(t) ∈ Rn satisfies the
sde
dX(t) = f (X(t))dt + √ε g(X(t))dB(t)
The process starts at x within a connected open set G with boundary ∂G and let τ (ε) denote the first time at which the process hits the boundary. Then, we know (Reference: A. Dembo and O. Zeitouni, "Large Deviations, Techniques and Applications") that
limε→0 ε·log(E(τ (ε))) = V (x)
where
V (x) = inf (V (t, x, y) : y ∈ ∂G, t ≥ 0)
where
V (t, x, y) = inf {∫_0^t |u(s)|2 /2 ds : y = x + ∫_0^t f (X(s))ds + ∫_0^t g(X(s))u(s)ds}

Suppose g(X(s)) is an invertible square matrix. Then,
V (T, x, y) = inf {(1/2)∫_0^T (X′(t) − f (X(t)))T (g(X(t))g(X(t))T )−1 (X′(t) − f (X(t)))dt : X(0) = x, X(T ) = y}
The problem of calculating V (T, x, y) therefore amounts, more generally, to computing the extremum of the functional
S[T, X] = ∫_0^T L(X(t), X′(t))dt

subject to the boundary conditions X(0) = x, X(T ) = y. Denoting this infi­


mum by S(T, x, y), we know from basic classical mechanics that S satisfies the
Hamilton­Jacobi equation

∂t S(t, x, y) + H(x, ∂y S(t, x, y)) = 0

where H(x, p), the Hamiltonian, is defined as the Legendre transform of the Lagrangian,
H(x, p) = supy (py − L(x, y)),
or equivalently,
H(x, p) = py − L(x, y), p = ∂y L(x, y)

It is now clear that in the context of our diffusion exit problem,

L(x, y) = (1/2)(y − f (x))T (g(x)g(x)T )−1 (y − f (x))

and
V (t, x, y) = S(t, x, y)

Then
V (x, y) = inft≥0 V (t, x, y) = inft≥0 S(t, x, y) = S(x, y)

where S(x, y) satisfies the stationary Hamilton­Jacobi equation

H(x, ∂y S(x, y)) = 0

We note that it is hard to compute the exact value of E(τ (ε)) for ε > 0 because this involves computing the Green's function of the diffusion generator. However, computing the limiting value of ε·log E(τ (ε)) as ε → 0 is simpler as it involves only computing the stationary solution of the Hamilton-Jacobi equation.
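For a scalar example with f (x) = −V ′(x) and g = 1 (an assumption made only for illustration), the finite horizon quantity V (T, x, y) can be approximated directly by discretizing the action functional and minimizing over the interior path points; for such gradient drifts the large T limit is known to be 2(V (y) − V (x)) when y lies uphill from x, which gives a check on the computation. A minimal Python sketch:

import numpy as np
from scipy.optimize import minimize

# Scalar example: f(x) = -V'(x) with V(x) = x^2/2, g = 1 (assumed for illustration)
Vpot = lambda x: 0.5 * x ** 2
f = lambda x: -x

x0, y, T, n = 0.0, 1.0, 10.0, 200       # start, exit target, horizon, grid size
dt = T / n

def action(interior):
    """Discretized (1/2) int_0^T (X'(t) - f(X(t)))^2 dt with X(0)=x0, X(T)=y fixed."""
    X = np.concatenate(([x0], interior, [y]))
    Xdot = np.diff(X) / dt
    Xmid = 0.5 * (X[:-1] + X[1:])
    return 0.5 * np.sum((Xdot - f(Xmid)) ** 2) * dt

guess = np.linspace(x0, y, n + 1)[1:-1]
res = minimize(action, guess, method="L-BFGS-B")
print("minimized action V(T,x,y) ~", res.fun)
print("2*(V(y)-V(x)) (large-T limit for gradient drift):", 2 * (Vpot(y) - Vpot(x0)))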

3.9 Large deviations for the Boltzmann kinetic transport equation
f,t (t, r, p) + (p, ∇r )f (t, r, p) + (F (t, r, p), ∇p )f (t, r, p) =
∫ (w(p + q, q)f (t, r, p + q) − w(p, q)f (t, r, p))d3 q
The rhs is interpreted as follows: w(p + q, q)d3 q is the probability of a particle having momentum centred around p + q in the momentum volume d3 q to get scattered to the momentum p after losing a momentum q, and likewise w(p, q)d3 q is the probability of a particle having momentum p to lose a momentum centred around q within a momentum volume of d3 q (ie, to get scattered to p − q). Making appropriate
approximations gives us

w(p + q, q)f (t, r, p + q) ≈ w(p, q)f (t, r, p) + (q, ∇p )(w(p, q)f (t, r, p))
+(1/2)T r(qq T ∇p ∇Tp )w(p, q)f (t, r, p)

It follows that

∫ (w(p + q, q)f (t, r, p + q) − w(p, q)f (t, r, p))d3 q

≈ ∫ ((q, ∇p )(w(p, q)f (t, r, p)) + (1/2)T r(qq T ∇p ∇Tp )(w(p, q)f (t, r, p)))d3 q
Now letting
∫ q·w(p, q)d3 q = A(p), ∫ qq T w(p, q)d3 q = B(p)

we can write down the approximated kinetic transport equation as

∂t f (t, r, p) + (p, ∇r )f (t, r, p) + (F (t, r, p), ∇p )f (t, r, p) =
divp (A(p)f (t, r, p)) + (1/2)T r(∇p ∇Tp B(p)f (t, r, p))
which is the Fokker­Planck equation that one encounters in diffusion processes
described in the form of stochastic differential equations driven by Brownian
motion.
The large deviation problem: If the scattering probability function w(p, q) has small random fluctuations, then we can ask what would be the deviation in the resulting solution for the particle distribution function f ? Of course, to answer this question, we must expand the solution f in powers of a perturbation parameter tag δ that is attached to the random vector and matrix valued functions A(p), B(p). Specifically, assuming w(p, q) to be of order δ, we have that A(p) and B(p) are random functions respectively of orders δ and δ 2 .

3.10 Orbital integrals for SL(2, R) with applications to statistical image processing
G = SL(2, R)
Iwasawa decomposition of G is

G = KAN

where
K = {u(θ) : 0 ≤ θ < 2π}, u(θ) = [ cos(θ) −sin(θ) ; sin(θ) cos(θ) ]
A = {diag(a, a−1 ) : a ∈ R − {0}}
N = { [ 1 n ; 0 1 ] : n ∈ R }
Define
H = diag(1, −1), X = [ 0 1 ; 0 0 ], Y = [ 0 0 ; 1 0 ]
Then,
[H, X] = 2X, [H, Y ] = −2Y, [X, Y ] = H
Consider the following element in the universal enveloping algebra of G:
Ω = 4XY + H 2 − 2H
Then,

[H 2 , X] = 2(XH + HX) = 2(2HX − 2X) = 4HX − 4X, [XY, X] = −XH = −(HX − 2X) = −HX + 2X
Thus,
[Ω, X] = 0
[H 2 , Y ] = −2(HY + Y H) = −2(2HY + 2Y ) = −4HY − 4Y, [XY, Y ] = HY
Thus,
[Ω, Y ] = 0
and hence
[Ω, H] = [Ω, [X, Y ]] = 0
Thus, Ω belongs to the centre of the universal enveloping algebra.
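These commutation identities are easy to verify numerically in the defining two dimensional representation (where Ω necessarily acts as a scalar); the following short Python check is included only as a sanity test.

import numpy as np

# Defining 2x2 representation of sl(2,R); check that Omega = 4XY + H^2 - 2H is central
H = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

def comm(A, B):
    return A @ B - B @ A

assert np.allclose(comm(H, X), 2 * X)
assert np.allclose(comm(H, Y), -2 * Y)
assert np.allclose(comm(X, Y), H)

Omega = 4 * X @ Y + H @ H - 2 * H
for Z in (H, X, Y):
    assert np.allclose(comm(Omega, Z), 0)
print("Omega =", Omega)   # equals 3*I in this representation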
Now consider the orbital integral
Ff (h) = Δ(h) ∫_{G/A} f (xhx−1 )dx, h ∈ A

Noting that
G = KN A
we can write
G/A = KN
and hence
∫_{G/A} f (xhx−1 )dx = ∫_{K×N} f (unhn−1 u−1 )dudn = ∫_N f̄(nhn−1 )dn
where
f̄(x) = ∫_K f (uxu−1 )du
Now let
nhn−1 − h = M, n ∈ N, h ∈ A
It is clear that M is strictly upper triangular since n is unipotent, ie, upper­
triangular with ones on the diagonal. Then, for fixed h,

dM = d(nhn−1 ) = |a − 1/a|dn = Δ(h)dn

(Simply write
n = [ 1 m ; 0 1 ], m ∈ R
and note that dn = dm and
n−1 = [ 1 −m ; 0 1 ]

h = diag(a, 1/a)
and observe that
nhn−1 − h = [ 0 (1/a − a)m ; 0 0 ]
) Note that for a = exp(t) and h = diag[a, 1/a], we have

Δ(h) = |et − e−t |

This gives
∫_N f (nhn−1 )dn = ∫_n f (M + h)(dn/dM )dM = Δ(h)−1 ∫_n f (M + h)dM

where n is the Lie algebra of N and consists of all strictly upper triangular real
matrices. Finally, using Weyl’s integration formula,
∫_G f (x)dx = ∫_A dh|Δ(h)|2 ∫_{G/A} f (xhx−1 )dx = ∫_{A×n} |Δ(h)|f (M + h)dhdM

= ∫_{R×R} |exp(t) − exp(−t)|f (a(t) + sX)dtds, a(t) = diag(exp(t), exp(−t)) = exp(tH)

Reference: V.S.Varadarajan, ”Harmonic Analysis on Semisimple Lie Groups”,


Cambridge University Press.

3.11 Loop quantum gravity (LQG)


Here, first we introduce the SO(3) connection coefficients that realize the co­
variant derivatives of mixed spinors and tensors in such a way that the mixed
covariant derivative of the tetrad becomes zero. In fact this equation defines the
spinor connection coefficients in terms of the spatial Christoffel connection coef­
ficients. We then introduce the quantities Kaj which are obtained by considering
the spatial components Kab of the covariant derivative of the unit normal four
vector to the three dimensional hypersurface at each time t that has been em­
bedded into R4 . The Ashtekar momentum fields Eja are obtained by scaling the
tetrad eaj with the square root of the determinant of the spatial metric qab where
qab = eja ejb . The tetrad basis matrix eaj or rather its inverse matrix eja is used
� �
to contract the Kab 's to obtain the Kaj . The fact that the Kab 's are symmetric is manifested in the requirement that the antisymmetric part in (a, b) of Kaj ejb , or equivalently of Kaj Ejb , vanishes, and this constraint is called the Gauss constraint. By combining this
with the fact that the spinor covariant derivative of Eaj or Eja vanishes, we are
able to obtain fundamental relation which states that the covariant derivative
of Eaj or Eja with respect to another connection, called the Immirzi-Sen-Barbero-Ashtekar connection, obtained by adding to the spin connection Γja any scalar multiple of Kaj , vanishes. This new connection is denoted by Aja . Fur-
ther, in accordance with the ADM action, the canonical position fields are the
spatial metric coefficients qab and the corresponding canonical momentum fields
P ab are obtained in terms of the Kab , namely the spatial components of the
covariant derivative of the unit normal to the embedded three dimensional hy­
persurface. Ashtekar proved that if we replace these canonical position fields
by the Aja and the canonical momentum fields by the Eja then because the
spin connection Γja are homogeneous functions of the Eja , it follows that these
new variables also satisfy the canonical commutation relations and hence can
serve as canonical position and momentum fields. Moreover, he showed that the
Hamiltonian constraint operator when expressed in term of these new canonical
variables is just polynomial functional in these fields and hence all the renor­
malizability problems are overcome. Further the Gauss constraint in these new
variables is simplified to the form that it states that the covariant derivative of
the Eja vanishes. This formalism means that the new connection Aja behaves
like a Yang­Mills connection for general relativity, the curvature of this con­
nection being immediately related after appropriate contraction with the Eja to
the Einstein­Hilbert action. This means that one can discretize space into a
graphical grid and replace the curvature terms in the Einstein­Hilbert action by

parallel displacements of the connection Aja around loops in the graph and the
connection terms Aja integrated over edges by SO(3) holonomy group elements.
In other words, the whole programme of quantizing the Einstein­Hilbert action
using such discretized graphs becomes simply that of quantizing a function of
SO(3) holonomy group elements, one element associated with each edge of a
graph and a holonomy group element corresponding to the curvature obtained
by parallely transporting the connection around a loop.

3.12 Group representations in relativistic image processing
For G = SL(2, R), we’ve seen that
∫_G f (x)dx = ∫ Δ(t)f (a(t) + sX)dtds, Δ(t) = |exp(t) − exp(−t)|

Now suppose that f, g are two functions on G. Let χ be the character of a


representation of G. Consider

I(f, g) = ∫_{G×G} f (x)g(y)χ(y −1 x)dxdy

Let u ∈ G and observe that



I(f ◦ u−1 , g ◦ u−1 ) = ∫_{G×G} f (u−1 x)g(u−1 y)χ(y −1 x)dxdy
= ∫_{G×G} f (x)g(y)χ(y −1 u−1 ux)dxdy = ∫_{G×G} f (x)g(y)χ(y −1 x)dxdy

= I(f, g)
Thus, I(f, g) is an invariant function of two image fields defined on G. I can be
evaluated using the above integration formula. Note that this formula is true
for any function χ on G, not necessarily a character. Now consider

J(f, g) = ∫_{G×G} f (x)g(y)χ(yx−1 )dxdy

We shall show that J is invariant whenever χ is a class function and, in particular, when it is a character. We have

when it is a character. We have

J(f ◦ u−1 , g ◦ u−1 ) = ∫_{G×G} f (u−1 x)g(u−1 y)χ(yx−1 )dxdy

= ∫_{G×G} f (x)g(y)χ(uyx−1 u−1 )dxdy = J(f, g)

because χ is a class function ie,

χ(uxu−1 ) = χ(x)∀u, x ∈ G

Some other formulae regarding orbital integrals, Fourier transforms and


Weyl’s character formula. Let G be a compact Lie group and consider Weyl’s
character formula for an irreducible representation of G:

χ = Σs∈W ε(s)exp(s·(λ + ρ)) / Σs∈W ε(s)exp(s·ρ)
where ρ = (1/2)Σα∈Δ+ α and λ is a dominant integral weight. Note that the
denominator can be expressed as
Δ1 (exp(H)) = Σs∈W ε(s)·exp(s·ρ(H)) = Πα∈Δ+ (exp(α(H)/2) − exp(−α(H)/2))


= exp(ρ(H))Πα∈Δ+ (1 − exp(−α(H))) = exp(ρ(H))Δ̄(exp(H))
where
Δ(H) = Πα∈Δ+ (exp(α(H)) − 1)
so that

Δ̄(H) = Πα∈Δ+ (exp(−α(H)) − 1) = c.Πα∈Δ+ (1 − exp(−α(H)))

with c a constant that equals ±1. Note also that H varies over the Cartan­
subalgebra a and since G is a compact group, it follows that α(H) is pure
imaginary for all α ∈ Δ and H ∈ a. Now, Weyl’s integration formula gives for
any function f on G,
χ(f ) = ∫ f (g)χ(g)dg = C·∫_{A×G/A} |Δ(h)|2 f (xhx−1 )χ(h)dhdx = C·∫_A Δ̄(h)Ff (h)χ(h)dh

where Ff , the orbital integral of f , is defined by



Ff (h) = Δ(h) ∫_{G/A} f (xhx−1 )dx

Substituting for χ(h) from Weyl’s character formula, we get


χ(f ) = C·∫_a Ff (exp(H))·Σs∈W ε(s)exp(s·λ(H))dH
= C|W |∫_a Ff (exp(H))exp(λ(H))dH = C|W |F̂f (λ)
where ĝ denotes Fourier transform of a function g defined on a. This formula
shows that the ordinary Euclidean Fourier transform of the orbital integral on
the Cartan subalgebra evaluated at a dominant integral weight equals the char­
acter of the corresponding irreducible representation when the group is compact.

Weyl’s dimension formula:


u(H) = χ(H)Δ(H) = Σs∈W ε(s)χs·(m1 ...mn ) (H)

where χ on the right is given by


χm1 ...mn (H) = exp(im1 θ1 + ... + imn θn )
with (θ1 , ..., θn ) denoting the standard toral coordinates of H ∈ a. Define the
differential operator
D = Πr>s i−1 (∂θr − ∂θs )
Then,
L
(Du)(0) = Πr>s (mr − ms ) χs (0) = |W |Πr>s (mr − ms )
s∈ W

on the one hand and on the other, using the fact that
Δ(H) = Σs∈W ε(s)·χs·(n,n−1,...,1) (H)

where
χn−1,...,0 (H) = exp(inθ1 + ... + i2θ2 + iθ1 )
we get
DΔ(0) = |W |Πn≥r>s≥1 (r − s)
and hence
(Du)(0) = χ(0)(DΔ(0)) = d(χ).|W |.Πr>s (r − s)
where
d(χ) = χ(0)
is the dimension of the irreducible representation. Equating these two expres­
sions, we obtain Weyl’s dimension formula
d(χ) = d(m1 , ..., mn ) = Πr>s (mr − ms ) / Πr>s (r − s)
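The formula is trivial to evaluate numerically. In the sketch below the mr are interpreted as the strictly decreasing shifted parameters λr + (n − r) (an interpretation of the parametrization used above); with this reading the familiar dimensions of the SU (2) and SU (3) representations are reproduced.

from itertools import combinations
from math import prod

def weyl_dim(m):
    """d(m1,...,mn) = prod_{r<s}(m_r - m_s) / prod_{r<s}(s - r), m strictly decreasing."""
    n = len(m)
    num = prod(m[r] - m[s] for r, s in combinations(range(n), 2))
    den = prod(s - r for r, s in combinations(range(n), 2))
    return num // den

# m_r read as lambda_r + (n - r), an assumed parametrization:
print(weyl_dim((2, 0)))        # SU(2), lambda = (1, 0): dimension 2
print(weyl_dim((3, 1, 0)))     # SU(3), lambda = (1, 0, 0): dimension 3
print(weyl_dim((4, 2, 0)))     # SU(3), lambda = (2, 1, 0) (adjoint): dimension 8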

This argument works for the compact group SU (n) where a dominant integral weight is parametrized by an n-tuple of non-negative integers (m1 ≥ m2 ≥ ... ≥ mn ≥ 0). In the general case of a compact semisimple Lie group G, we use
Weyl’s character in the form

χ(H) = Σs∈W ε(s)exp(s·(λ + ρ)(H)) / Δ(H)
Δ(H) = Σs∈W ε(s)·exp(s·ρ(H)) = Πα∈Δ+ (exp(α(H)/2) − exp(−α(H)/2))

Then define
D = Πα∈Δ+ ∂(Hα )
We get
DΔ(0) = |W |.Πα∈Δ+ ρ(Hα )
Thus, on the one hand,

D(χΔ)(0) = χ(0)DΔ(0) = d(χ).|W |.Πα∈Δ+ ρ(Hα )

and on the other,

D(Δ.χ)(0) = |W |Πα∈Δ+ (λ + ρ)(Hα )

Equating these two expressions gives us Weyl’s dimension formula for the irre­
ducible representation of any compact semisimple Lie group corresponding to
the dominant integral weight λ in terms of the roots:

d(χ) = d(λ) = Πα∈Δ+ (λ + ρ)(Hα ) / Πα∈Δ+ ρ(Hα ) = Πα∈Δ+ < λ + ρ, α > / Πα∈Δ+ < ρ, α >

3.13 Approximate solution to the wave equation in the vicinity of the event horizon of a Schwarzchild blackhole
This calculation would enable us to expand the solution to the wave equation
in the vicinity of a blackhole in terms of ”normal modes” or eigenfunctions
of the Helmholtz operator in Schwarzchild space­time. The coefficients in this
expansion will be identified with the particle creation and annihilation oper­
ators which would therefore enable us to obtain a formula for the blackhole
entropy assuming that it is in a thermal state. It would enable us to obtain the

blackhole temperature at which it starts radiating particles in the Rindler space­


time metric which is an approximate transformation of the Schwarzchild metric.
By using the relationship between the time variable in the Rindler space­time
and the time variable in the Schwarzchild space­time, we can thus obtain the
characteristic frequencies of oscillation in the solution of the wave equation in
Schwarzchild space-time in terms of the gravitational constant G and the
mass M of the blackhole. These characteristic frequencies can be used to calcu­
late the critical Hawking temperature at which the blackhole radiates in terms
of G, M using the formula for the entropy of a Gaussian state.
Let

g00 = α(r) = 1 − 2GM/c2 r, g11 = −α(r)−1 /c2 = −c−2 (1 − 2GM/rc2 )−1 ,

g22 = −r2 /c2 , g33 = −r2 sin2 (θ)/c2


The Schwarzchild metric is then

dτ 2 = gµν dxµ dxν , x0 = t, x1 = r, x2 = θ, x3 = φ

or equivalently,

dτ 2 = α(r)dt2 − α(r)−1 dr2 /c2 − r2 dθ2 /c2 − r2 sin2 (θ)dφ2 /c2

The Klein-Gordon wave equation in Schwarzchild space-time is
(g µν ψ,ν √−g),µ + µ2 √−g ψ = 0

where
µ = mc/h
with h denoting Planck’s constant divided by 2π. Now,

√−g = (−g00 g11 g22 g33 )1/2 = r2 sin(θ)/c3

so the above wave equation expands to give

α(r)−1 r2 sin(θ)ψ,tt − ((c2 /α(r))r2 sin(θ)ψ,r ),r

−(c2 /r2 )r2 (sin(θ)ψ,θ )),θ − (c2 /r2 sin2 (θ))(r2 sin(θ)ψ,φφ + µ2 r2 sin(θ)ψ = 0
This equation can be rearranged as

(1/c2 )∂t2 ψ = (α(r)/r2 )((r2 /α(r))ψ,r ),r

+(α(r)/r2 )((sin(θ))−1 (sin(θ)ψ,θ ),θ + sin(θ)−2 ψ,φφ ) − µ2 α(r)ψ


This wave equation can be expressed in quantum mechanical notation as

(1/c2 )∂t2 ψ = (α(r)/r2 )((r2 /α(r))ψ,r ),r

−(α(r)/r2 )L2 ψ − µ2 α(r)ψ



where
L2 = −((1/sin(θ)) ∂/∂θ (sin(θ) ∂/∂θ) + (1/sin2 (θ)) ∂ 2 /∂φ2 )
is the square of the angular momentum operator in non­relativistic quantum
mechanics. Thus, we obtain using separation of variables, the solution

ψ(t, r, θ, φ) = Σnlm fnl (t, r)Ylm (θ, φ)

where fnl (t, r) satisfies the radial wave equation

(1/c2 )∂t2 fnl (t, r) = (α(r)/r2 )((r2 /α(r))fnl, r ),r

−((α(r)/r2 )l(l + 1) + µ2 α(r))fnl (t, r)


where l = 0, 1, 2, .... We write

fnl (t, r) = exp(iωt)gnl (r)

Then the above equation becomes



(−ω 2 /c2 )gnl (r) = (α(r)/r2 )((r2 /α(r))gnl′ (r))′ − ((α(r)/r2 )l(l + 1) + µ2 α(r))gnl (r)


For each value of l, this defines an eigenvalue equation and on applying the boundary conditions, the possible frequencies ω assume discrete values, say ω(n, l), n = 1, 2, .... These frequencies will depend upon GM and µ since α(r) depends on GM . Let gnl (r) denote the corresponding set of orthonormal eigenfunctions w.r.t the volume measure
c3 √−g d3 x = r2 sin(θ)dr dθ dφ

The general solution can thus be expanded as

ψ(t, r, θ, φ) = Σnlm [a(n, l, m)gnl (r)Ylm (θ, φ)exp(iω(nl)t) + a(n, l, m)∗ ḡnl (r)Ȳlm (θ, φ)exp(−iω(nl)t)]

where n = 1, 2, ..., l = 0, 1, 2, ..., m = −l, −l + 1, ..., l − 1, l. The Hamiltonian


operator of the field can be expressed as

H = Σnlm hω(nl)a(n, l, m)∗ a(n, l, m)
and the bosonic commutation relations are
[a(nlm), a(n′l′m′)∗ ] = hω(nl)δ(nn′)δ(ll′)δ(mm′)

An alternate analysis of Hawking temperature.


The equation of motion of a photon, ie null radial geodesic is given by

α(r)dt2 − α(r)−1 dr2 = 0

or
dr/dt = ±α(r) = ±(1 − 2m/r), m = GM/c2
The solution is
∫ r dr/(r − 2m) = ±t + c
so that
r + 2m·ln(r − 2m) = ±t + c
The change in time t when the particle goes from 2m + 0 to 2m − 0 is purely imaginary and is given by δt = 2mπi since ln(r − 2m) = ln|r − 2m| + i·arg(r − 2m) with arg(r − 2m) = 0 for r > 2m and = π for r < 2m.
wave oscillating at a frequency ω has its amplitude changed from exp(iωt) to
exp(iω(t + 2mπi)) = exp(iωt − 2mπω). The corresponding ratio of the proba­
bility when it goes from just outside to just inside the critical radius is obtained
by taking the modulus square of the probability amplitude and is given by
P = exp(−4mπω). The energy of the photon within the blackhole is hω (where h
is Planck’s constant divided by 2π) and hence by Gibbsian theory in equilibrium
statistical mechanics, its temperature T must be given by P = exp(−hω/kT ).
Therefore
hω/kT = 4πmω
or equivalently,
T = h/(4πmk) = hc2 /(4πGM k)
This is known as the Hawking temperature and it gives us the critical temperature of the interior of the blackhole at which photons can tunnel through the critical radius, ie the temperature at which the blackhole can radiate out photons.

3.14 More on group theoretic image processing


Problem: Compute the Haar measure on SL(2, R) in terms of the Iwasawa
decomposition
x = u(θ)a(t)n(s)
where
u(θ) = [ cos(θ) sin(θ) ; −sin(θ) cos(θ) ] = exp(θ(X − Y )),
a(t) = exp(tH) = diag(et , e−t ), n(s) = exp(sX) = [ 1 s ; 0 1 ]

Solution: We first express the left invariant vector fields corresponding to H, X, Y as linear combinations of ∂/∂t, ∂/∂θ, ∂/∂s and then the reciprocal of the associated determinant gives the Haar density. Specifically, writing

H ∗ = f11 (θ, t, s)∂/∂θ + f12 (θ, t, s)∂/∂t + f13 (θ, t, s)∂/∂s,

X ∗ = f21 (θ, t, s)∂/∂θ + f22 (θ, t, s)∂/∂t + f23 (θ, t, s)∂/∂s,


Y ∗ = f31 (θ, t, s)∂/∂θ + f32 (θ, t, s)∂/∂t + f33 (θ, t, s)∂/∂s,
for the left invariant vector fields corresponding to H, X, Y respectively, it would
follow that the Haar measure on SL(2, R) is given by

F (θ, t, s)dθdtds, F = |det((fij ))|−1

or more precisely, the Haar integral of a function φ(x) on SL(2, R) would be


given by
∫_{SL(2,R)} φ(x)dx = ∫ φ(u(θ)a(t)n(s))F (θ, t, s)dθdtds

We have
d/ds(u(θ)a(t)n(s)) = u(θ)a(t)n(s)X
so
X ∗ = ∂/∂s
Then,

d/dt(u(θ)a(t)n(s)) = u(θ)a(t)Hn(s) = u(θ)a(t)n(s)n(s)−1 Hn(s)

Now,
n(s)−1 Hn(s) = exp(−s·ad(X))(H) = H − s[X, H] = H + 2sX

So
H ∗ + 2sX ∗ = ∂/∂t
Finally,

d/dθ(u(θ)a(t)n(s)) = u(θ)(X−Y )a(t)n(s)

= u(θ)a(t)n(s)(n(s)−1 a(t)−1 (X−Y )a(t)n(s))


and
a(t)−1 (X − Y )a(t) = exp(−t.ad(H))(X − Y ) = exp(−2t)X − exp(2t)Y

n(s)−1 a(t)−1 (X − Y )a(t)n(s) = n(s)−1 (exp(−2t)X − exp(2t)Y )n(s)


= exp(−2t)X − exp(2t).exp(−s.ad(X))(Y )
exp(−s.ad(X))(Y ) = Y − s[X, Y ] + (s2 /2)[X, [X, Y ]] = Y − sH − s2 X
so
exp(−2t)X ∗ − exp(2t)(Y ∗ − sH ∗ − s2 X ∗ ) = ∂/∂θ

Problem: Solve these three equations and thereby express X ∗ , Y ∗ , H ∗ as


linear combinations of ∂/∂s, ∂/∂t, ∂/∂θ.

Problem: Suppose I(f, g) is the invariant functional for a pair of image fields
f, g defined on SL(2, R). If f changes by a small random amount and so does
g, then what is the rate functional of the change in I(f, g) ? Suppose that for
each irreducible character χ of G, we have constructed the invariant functional
Iχ (f, g) for a pair of image fields f, g. Let Ĝ denote the set of all the irreducible
characters of G. Consider the Plancherel formula for G:

f (e) = ∫ χ(f )dµ(χ)
where µ is the Plancherel measure on Ĝ. In other words, we have



δe (g) = ∫ χ(g)dµ(χ)

Then we can write



Iχ (f, g) = ∫_{G×G} f (x)g(y)χ(yx−1 )dxdy, χ ∈ Ĝ

Let ψ(g) be any class function on G. Then, by Plancherel’s formula, we have



ψ(g) = ψ ∗ δe (g) = ∫_G δe (h−1 g)ψ(h)dh = ∫∫ χ(h−1 g)ψ(h)dµ(χ)dh = ∫ ψ ∗ χ(g)dµ(χ)

and hence the invariant Iψ associated with a pair of image fields can be expressed
as
Iψ (f, g) = ∫ f (x)g(y)ψ(yx−1 )dxdy = ∫_{G×G×Ĝ} ψ ∗ χ(yx−1 )f (x)g(y)dxdy dµ(χ)

= ∫ ψ(h)χ(h−1 yx−1 )f (x)g(y)dxdy dµ(χ)dh
= ∫_{G×Ĝ} ψ(h)Iχ (f, g, h)dh dµ(χ)
where we define

Iχ (f, g, h) = ∫_{G×G} f (x)g(y)χ(yx−1 h)dxdy, h ∈ G

Note that (f, g) → Iχ (f, g, h) is not an invariant. We define the Fourier trans­
form of the class function ψ by

ψ̂(χ) = ∫_G ψ(h)χ(h−1 )dh

Then, we have
δe (g) = ∫ χ(g)dµ(χ),

For any function f (g) on G, not necessarily a class function,


f (g) = ∫ δe (h−1 g)f (h)dh = ∫∫ χ(h−1 g)f (h)dh dµ(χ)

Formally, let πχ denote an irreducible representation having character χ. Then,


defining
f̂ (πχ ) = ∫_G f (g)π(g −1 )dg = ∫_G f (g −1 )π(g)dg

(Note that we are assuming G to be a unimodular group. In the case of SL(2, R), this is anyway true), we have
f (g) = ∫ T r(πχ (h−1 g))f (h)dh dµ(χ) = ∫ T r(f̂ (πχ )πχ (g))dµ(χ)

and hence,
Iχ (f, g, h) = ∫_{G×G} f (x)g(y)χ(yx−1 h)dxdy = ∫ f (x)g(y)T r(πχ (y)πχ (x−1 )πχ (h))dxdy = T r(ĝ0 (πχ )f̂ (πχ )πχ (h))


where
g0 (x) = g(x−1 ), x ∈ G
Then, for a class function ψ,

Iψ (f, g) = ∫ ψ(h)T r(ĝ0 (πχ )f̂ (πχ )πχ (h))dh dµ(χ)

but,
ψ̂(χ) = ∫ ψ(g)T r(πχ (g −1 ))dg = T r(ψ̂(πχ ))
We have,
∫ ψ(h)πχ (h)dh = ∫ ψ0 (h)πχ (h−1 )dh = ψ̂0 (πχ )

We have, on the one hand


ψ(x) = ∫∫ ψ(h)χ(h−1 x)dh dµ(χ)

and since ψ is a class function, ∀x, y ∈ G,


ψ(x) = ψ(yxy −1 ) = ∫∫ ψ(h)χ(h−1 yxy −1 )dh dµ(χ)
= ∫∫ ψ(h)χ(y −1 h−1 yx)dh dµ(χ)
= ∫∫ ψ(h)χ1 (h, x, a)dh dµ(χ)
where
χ1 (h, x, a) = ∫_G a(y)χ(y −1 h−1 yx)dy = ∫_G a(y)χ(h−1 yxy −1 )dy

where a(x) is any function on G satisfying,


∫_G a(x)dx = 1

Remark: SL(2, R), under the adjoint action, acts as a subgroup of the Lorentz transformations, ie, of the t − x − z space. Indeed, represent the space-time point (t, x, y, z) by
Φ(t, x, y, z) = [ t + z x + iy ; x − iy t − z ]

Then define (t' , x' , y ' , z ' ) by

Φ(t' , x' , y ' , z ' ) = A.Φ(t, x, y , z)A∗

where
A ∈ SL(2, R)
Clearly, by taking determinants, we get
t′2 − x′2 − y′2 − z′2 = t2 − x2 − y 2 − z 2

and then choosing A = u(θ) = exp(θ(X − Y )), a(v) = exp(vH), n(s) = exp(sX)
successively gives us the results that

t' + z ' = t + z.cos(2θ) + x.sin(2θ), t' − z ' = t − z.cos(2θ) − x.sin(2θ)

x′ + iy′ = x·cos(2θ) − z·sin(2θ) + iy



which means that

t′ = t, x′ = x·cos(2θ) − z·sin(2θ), z′ = x·sin(2θ) + z·cos(2θ), y′ = y

for the case A = u(θ), which corresponds to a rotation of the x − z plane around
the y axis by an angle 2θ. The case A = a(v) gives

x′ + iy′ = x + iy, t′ + z′ = e2v (t + z), t′ − z′ = e−2v (t − z)

or equivalently,

x′ = x, y′ = y, t′ = t·cosh(2v) + z·sinh(2v), z′ = t·sinh(2v) + z·cosh(2v)

which corresponds to a Lorentz boost along the z direction with a velocity of


u = −tanh(2v). Finally, the case A = n(s) gives

t′ = t + sx + s2 (t − z)/2, x′ = x + s(t − z), z′ = z + sx + s2 (t − z)/2, y′ = y

This corresponds to a Lorentz boost along a definite direction in the xz plane.


The direction and velocity of the boost is a function of s. To see what this one
parameter group of transformations corresponding to A = n(s) corresponds to,
we first evaluate it for infinitesimal s to get

t′ = t + sx, x′ = x + s(t − z), z′ = z + sx, y′ = y

This transformation can be viewed as a composite of two infinitesimal transfor­


mations, the first is
x → x − sz, z → z + sx, t → t
This transformation corresponds to an infinitesimal rotation of the xz plane
around the y axis by an angle s. The second is

x → x + st, t → t + sx

which corresponds to an infinitesimal Lorentz boost along the x axis by a velocity


of s. To see what the general non­infinitesimal case corresponds to, we assume
that it corresponds to a boost with speed of u along the unit vector (nx , 0, nz ).
Then the corresponding Lorentz transformation equations are

(x′, z′) = γ(u)(xnx + znz − ut)(nx , nz ) + (x, z) − (xnx + znz )(nx , nz )
t′ = γ(u)(t − xnx − znz )


The first pair of equations is, in component form,
x′ = γ(xnx + znz − ut)nx + x − (xnx + znz )nx ,
z′ = γ(xnx + znz − ut)nz + z − (xnx + znz )nz


To get the appropriate matching, we must therefore have

1 + s2 /2 = γ(u), s = −γ(u)nx , −s2 /2 = γ(u)nz ,



which is impossible. Therefore, this transformation cannot correspond to a


single Lorentz boost. However, it can be made to correspond to the composition
of a rotation in the xz plane followed by a Lorentz boost in the same plane. Let
Lx (u) denote the Lorentz boost along the x axis by a speed of u and let Ry (α)
denote a rotation around the y axis by an angle α counterclockwise. Then, we
have already seen that the finite transformation corresponding to A = n(s) is
given by
L(s) = limn→∞ (Lx (−s/n)Ry (−s/n))n = exp(−s(Kx + Xy ))

where Kx is the infinitesimal generator of Lorentz boosts along the x axis while
Xy is the infinitesimal generator of rotations around the y axis. Specifically,
Kx = L′x (0), Xy = R′y (0)

The characters of the discrete series of SL(2, R). Let m be a nonpositive integer.
Let em−1 be a highest weight vector of a representation πm of sl(2, R). Then

π(X)em−1 = 0, π(H)em−1 = (m − 1)em−1

Define
π(Y )r em−1 = em−1−2r , r = 0, 1, 2, ...
Then
π(H)em−1−2r = π(H)π(Y )r em−1 = ([π(H), π(Y )r ] + π(Y )r π(H))em−1

= (m − 1 − 2r)π(Y )r em−1 = (m − 1 − 2r)em−1−2r , r ≥ 0
Thus, considering the module V of SL(2, R) defined by the span of em−1−2r , r ≥ 0, we find that
T r(π(a(t))) = T r(π(exp(tH))) = Σr≥0 exp(t(m − 1 − 2r)) = exp((m − 1)t)/(1 − exp(−2t)) = exp(mt)/(exp(t) − exp(−t))
Now,
π(X)em−1−2r = π(X)π(Y )r em−1 = [π(X), π(Y )r ]em−1
= (π(H)π(Y )r−1 + .... + π(Y )r−1 π(H))em−1
= [(−2(r−1)π(Y )r−1 +π(Y )r−1 π(H)−2(r−2)π(Y )r−1
+π(Y )r−1 π(H)+...+π(Y )r−1 π(H)]em−1
= [−r(r − 1) + r(m − 1)]π(Y )r−1 em−1 = r(m − r)em+1−2r
Also note that
π(Y )em−1−2r = em−3−2r , r ≥ 0
Problem: From these expressions, calculate T r(π(u(θ)) where u(θ) = exp(θ(X−
Y )). Show that it equals exp(imθ)/(exp(iθ) − exp(−iθ)). In this way, we are
able to determine all the discrete series characters for SL(2, R).

3.15 A problem in quantum information theory


Let K be a quantum operation, ie, a CPTP map. Its action on an operator X
is given in terms of the Choi­Kraus representation by

K(X) = Σ_a E_a X E_a^*

Define the following inner product on the space of operators:


< X, Y >= T r(XY ∗ )
Then, calculate K ∗ relative to this inner product in terms of the Choi­Kraus
representation of K.
Answer:
< K(X), Y > = Tr(K(X)Y^*) = Σ_a Tr(E_a X E_a^* Y^*) = Σ_a Tr(X E_a^* Y^* E_a)
= Tr(X·(Σ_a E_a^* Y E_a)^*) = Tr(X·K^*(Y)^*) = < X, K^*(Y) >
where
K^*(X) = Σ_a E_a^* X E_a
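Remark (numerical sketch): the adjoint relation < K(X), Y > = < X, K^*(Y) > can be verified numerically. The sketch below is illustrative only; the dimension d = 4, the number of Kraus operators n = 3 and the random matrices are arbitrary choices, and the identity does not require the Kraus operators to be normalized.

import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 3
E = [rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d)) for _ in range(n)]
X = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
Y = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))

K  = lambda Z: sum(Ea @ Z @ Ea.conj().T for Ea in E)    # K(Z)  = sum_a E_a Z E_a^*
Ks = lambda Z: sum(Ea.conj().T @ Z @ Ea for Ea in E)    # K*(Z) = sum_a E_a^* Z E_a
inner = lambda A, B: np.trace(A @ B.conj().T)           # < A, B > = Tr(A B^*)

print(abs(inner(K(X), Y) - inner(X, Ks(Y))))            # ~ 0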
Problem: Let A, B be matrices, A square and B rectangular of the appropriate size, so that
X = [[A, B], [B^*, I]] ≥ 0
Then deduce that
A ≥ BB ∗
Solution: For any two column vectors u, v of appropriate size,
(u^*, v^*) X [u; v] ≥ 0
This gives
u∗ Au + u∗ Bv + v ∗ B ∗ u + v ∗ v ≥ 0
Take
v = Cu
where C is any matrix of appropriate size. Then, we get
u∗ (A + BC + C ∗ B ∗ + C ∗ C)u ≥ 0
this being true for all u, we can write
A + BC + C ∗ B ∗ + C ∗ C ≥ 0
for all matrices C of appropriate size. In particular, taking C = −B ∗ gives us
A − BB ∗ ≥ 0
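Remark (numerical sketch): the equivalence between X ≥ 0 and A − BB^* ≥ 0 for this block matrix (with identity lower-right block) is easy to test numerically; the block sizes and the construction A = BB^T + cI below are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(1)
def psd(M):
    return np.min(np.linalg.eigvalsh((M + M.T)/2)) > -1e-10

for _ in range(5):
    p, q = 3, 2
    B = rng.normal(size=(p, q))
    A = B @ B.T + rng.uniform(-0.2, 1.0)*np.eye(p)      # may or may not dominate B B^T
    X = np.block([[A, B], [B.T, np.eye(q)]])
    print(psd(X), psd(A - B @ B.T))                     # the two answers always agree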

3.16 Hawking radiation


Problem: For a spherically symmetric blackhole with a charge Q at its centre,
solve the static Einstein­Maxwell equations to obtain the metric. Hence, cal­
culate the equation of motion of a radially moving photon in this metric and
determine the complex time taken by the photon to travel from just outside
the critical blackhole radius to just inside and denote this imaginary time by
iΔ. Note that Δ will be a function of the parameters G, M, Q of the blackhole.
Deduce that if T is the Hawking temperature for quantum emission of photons
for such a charged blackhole, then at frequency ω,

exp(−hω/kT ) = exp(−ωΔ)

and hence that the Hawking temperature is given by

T = h/kΔ

Evaluate this quantity.

Let
D(ρ|σ) = Tr(ρ·(log(ρ) − log(σ)))
be the relative entropy between two states ρ, σ defined on the same Hilbert space. We know that it is non-negative and that if K is a quantum operation, ie, a CPTP map, then
D(ρ|σ) ≥ D(K(ρ)|K(σ))
Consider three subsystems A, B, C and let ρ(ABC) be a joint state on these
three subsystems. Let ρm = IB/d(B) be the completely mixed state on B, where IB is
the identity operator on HB and d(B) = dim HB. Let K denote the quantum
operation corresponding to partial trace on the B system. Then consider the
inequality

D(ρ(ABC)|ρm ⊗ ρ(AC)) ≥ D(K(ρ(ABC))|K(ρm ⊗ ρ(AC)))

= D(ρ(AC)|ρ(AC)) = 0
Evaluating gives us

T r(ρ(ABC)log(ρ(ABC)) − T r(ρ(ABC).(log(ρ(AC)) − log(d(B))) ≥ 0

or equivalently,
−H(ABC) + H(AC) + log(d(B)) ≥ 0
Now, let K denote partial tracing on the C­subsystem. Then,

D(ρ(ABC)|ρm ⊗ ρ(AC)) ≥ D(K(ρ(ABC))|K(ρm ⊗ ρ(AC)))

gives
−H(ABC) + H(AC) + log(d(B)) ≥ D(ρ(AB)|ρm ⊗ ρ(A))

= −H(AB) − Tr(ρ(AB)(log(ρ(A)) − log(d(B)))) = −H(AB) + H(A) + log(d(B))
giving
H(AB) + H(AC) − H(A) − H(ABC) ≥ 0
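Remark (numerical sketch): the strong subadditivity inequality just derived can be checked on randomly generated tripartite states. The sketch below uses three qubit factors and a random mixed state; these are arbitrary illustrative choices, and numpy is assumed to be available.

import numpy as np

rng = np.random.default_rng(2)
dA = dB = dC = 2
G = rng.normal(size=(8, 8)) + 1j*rng.normal(size=(8, 8))
rho = G @ G.conj().T
rho /= np.trace(rho)                                    # random mixed state on A x B x C

def entropy(r):
    w = np.linalg.eigvalsh(r)
    w = w[w > 1e-12]
    return float(-(w*np.log(w)).sum())

T = rho.reshape(dA, dB, dC, dA, dB, dC)
rho_AB = np.trace(T, axis1=2, axis2=5).reshape(dA*dB, dA*dB)   # trace out C
rho_AC = np.trace(T, axis1=1, axis2=4).reshape(dA*dC, dA*dC)   # trace out B
rho_A  = np.trace(rho_AB.reshape(dA, dB, dA, dB), axis1=1, axis2=3)
print(entropy(rho_AB) + entropy(rho_AC) - entropy(rho_A) - entropy(rho))   # >= 0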

Let the metric of space­time be given in spherical polar coordinates by

dτ 2 = A(r)dt2 − B(r)dr2 − C(r)(dθ2 + sin2 (θ)dφ2 )

The equation of the radial null geodesic is given by

A(r) − B(r)(dr/dt)2 = 0

or equivalently, for an incoming radial geodesic,
dr/dt = −√(A(r)/B(r)) = −F(r)
say, where
F(r) = √(A(r)/B(r))
Let F (r) have a zero at the critical radius r0 so that we can write

F (r) = G(r)(r − r0 )

where G(r0) ≠ 0. Then, the time taken for the photon to go from r0 + 0 to
r0 − 0 is given by
Δ = ∫_{r0−0}^{r0+0} dr/((r − r0)G(r)) = G(r0)^{-1} lim_{ε↓0}(log(ε) − log(−ε))
= −iπ·G(r0)^{-1}
so that the probability amplitude of a photon pulse of frequency ω changes by the factor exp(−ωπ·G(r0)^{-1}), and equating this to the Gibbsian
probability exp(−hω/kT) gives us the Hawking temperature as

T = h·G(r0)/(πk)
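Remark (symbolic sketch): the recipe T = h·G(r0)/(πk) can be carried out explicitly once A(r) is specified. The sympy sketch below assumes the Reissner–Nordström form A(r) = 1 − 2M/r + Q²/r² with B(r) = 1/A(r), and works in units h = k = G = c = 1; the overall numerical prefactor (for instance, the more familiar Schwarzschild value 1/(8πM)) depends on conventions not fixed here, so only the parameter dependence should be read off.

import sympy as sp

r, M, Q = sp.symbols('r M Q', positive=True)
A = 1 - 2*M/r + Q**2/r**2               # assumed Reissner-Nordstrom lapse; B = 1/A
F = A                                   # F(r) = sqrt(A/B) = A when B = 1/A
r0 = M + sp.sqrt(M**2 - Q**2)           # outer zero of A(r) (assuming M >= Q)
G0 = sp.simplify(sp.diff(F, r).subs(r, r0))   # G(r0) = F'(r0), since F(r) = G(r)(r - r0)
T = G0/sp.pi                            # T = h G(r0)/(pi k) with h = k = 1
print(sp.simplify(T.subs(Q, 0)))        # Schwarzschild limit: 1/(2*pi*M) in these conventions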
Chapter 4

Research Topics on Representations

Using the Schrodinger equation to track the time varying probability density
of a signal. The Schrodinger equation naturally generates a family of pdf’s as
the modulus square of the evolving wave function and by controlling the time
varying potential in which the Schrodinger evolution takes place, it is possible
to track a time varying pdf, ie, encode this time varying pdf into the weights
of a neural network. From the stored weights, we can synthesize the
time varying pdf of the signal.

4.1 Remarks on induced representations and Haar integrals
Let G be a compact group. Let χ be a character of an Abelian subgroup
N of G in a semidirect product G = HN . Let U = IndG N χ be the induced
representation. Suppose a function f on G is orthogonal to all the matrix
elements of U . Then we can express this condition as follows. First the
representation space V of U consists of all functions h : G → C for which
h(xn) = χ(n)−1 h(x)∀x ∈ G, n ∈ N and U acts on such a function to give
U (y)h(x) = h(y −1 x), x, y ∈ G. The condition

∫_G f(x)U(x)dx = 0
gives
∫_G f(x)h(x^{-1}y)dx = 0  ∀y ∈ G, ∀h ∈ V

Now for h ∈ V , we have

h(xn) = χ(n)−1 h(x), x ∈ G, n ∈ N


and hence for any representation π of G,
∫ h(xn)π(x^{-1})dx = χ(n)^{-1}ĥ(π)
or
∫ h(x)π(n·x^{-1})dx = χ(n)^{-1}ĥ(π)
or
π(n)ĥ(π) = χ(n^{-1})ĥ(π),  n ∈ N
Define
P_{π,N} = ∫_N π(n)dn
Assume without loss of generality that π is a unitary representation of G. Then,
Pπ,N is the orthogonal projection onto the space of all those vectors w in the
representation space of π for which π(n)w = w∀n ∈ N . Then the above equation
gives on integrating w.r.t n ∈ N ,
P_{π,N}ĥ(π) = 0,  h ∈ V,  χ ≠ 1
and
P_{π,N}ĥ(π) = ĥ(π),  h ∈ V,  χ = 1
on using the Schur orthogonality relations for the characters of N. We therefore have: the equation
∫_G f(x)h(x^{-1}y)dx = 0
gives us
∫ f(x)h(x^{-1}y)π(y^{-1})dxdy = 0
or
∫ f(x)h(z)π(z^{-1}x^{-1})dxdz = 0
or
ĥ(π)f̂(π) = 0,  h ∈ V
It also implies
∫ f(x)h(x^{-1}y)π(y)dxdy = 0
or
∫ f(x)h(z)π(xz)dxdz = 0
or
f̂(π)^*ĥ(π)^* = 0
which is the same as the above. Suppose on the other hand that f satisfies

∫_N f(xny)dn = 0,  x, y ∈ G − − − (1)

Then, it follows that
∫_{G×N} f(xny)π(y^{-1})dydn = 0
which implies
∫_{G×N} f(z)π(z^{-1}xn)dzdn = 0
This implies
f̂(π)π(x)P_{π,N} = 0,  x ∈ G
It also implies
∫ f(xny)π(x^{-1})dxdn = 0
which implies
∫ f(z)π(nyz^{-1})dzdn = 0
or equivalently,
P_{π,N}π(y)f̂(π) = 0,  y ∈ G
In particular, (1) implies
P_{π,N}f̂(π) = f̂(π)P_{π,N} = 0

for all representations π of G. The condition (1) implies that Pπ(y)f̂(π) = 0,
where P = P_{π,N}, for all representations π, and hence, for all functions h on
G, it also implies that Pĥ(π)f̂(π) = 0. More precisely, it implies that for all
representations π and all functions h on G,
P_{π,N}ĥ(π)f̂(π) = 0 − − − (2)

Now suppose that h1 belongs to the representation space of U = Ind_N^G χ0,
where χ0 is the identity character on N, ie, the one dimensional unitary representation of N equal to unity at all elements of N. Then, it is clear that h1 can be expressed as
h1(x) = ∫_N h(xn)dn,  x ∈ G

for some function h on G and conversely if h is an arbitrary function on G for


which the above integral exists, then h1 will belong to the representation space
of U . That is because h1 as defined above satisfies

h1 (xn) = h1 (x), x ∈ G, n ∈ N

The condition that f be orthogonal to the matrix elements of U is that



f (x)U (x−1 )dx = 0

and this condition is equivalent to the condition that for all h on G,


� �
f (x)h1 (xy)dx = 0, y ∈ G, h = h(xn)dn
G N

and this condition is in turn equivalent to



f (x)h1 (xy)π(y −1 )dxdy = 0

or �
f (x)h1 (z)π(z −1 x)dxdz = 0

or equivalently,
ˆ 1 (π)fˆ(π)∗ = 0
h
for all unitary representations π of G and for all functions h on G. Now,
� �
ˆ 1 (π) = h(xn)π(x−1 )dxdn = h(z)π(nz −1 )dzdn = P.h(π)
h ˆ

and hence the condition is equivalent to


ˆ ˆ(π)∗ = 0 − − − (3)
Pπ,N h(π)f

for all unitary representations π of G and for all functions h on G. This condition
is the same as (2) if we observe that f̂(π) can be replaced by f̂(π)^*, because
f(x) being orthogonal to the matrix elements of U(x) is equivalent to f(x^{-1})
being orthogonal to the same, since for compact groups a finite dimensional
representation is equivalent to a unitary representation, which means that U in
an appropriate basis is a unitary representation. Thus, we have proved that
a necessary and sufficient condition that a function f be orthogonal to all the
matrix elements of the representation of G induced by the identity character χ0
on N is that f be a cusp form, ie, it should satisfy
∫_N f(xny)dn = 0  ∀x, y ∈ G

Now suppose χ is an arbitrary character of N (ie, a one dimensional unitary representation of N). Then let U = Ind_N^G χ. If h1 belongs to the representation space of U, then h1(xn) = χ(n)^{-1}h1(x), x ∈ G, n ∈ N. Now let h(x) be any function on G. Define
h1(x) = ∫_N h(xn)χ(n)dn
Then,
h1(xn0) = ∫ h(xn0 n)χ(n)dn = ∫_N h(xn)χ(n0^{-1}n)dn
= χ(n0)^{-1}h1(x),  n0 ∈ N,  x ∈ G



It follows that h1 belongs to the representation space of U. Conversely, if h1 belongs to this representation space, then
h1(x) = ∫_N h1(xn)χ(n)dn
Thus we have shown that h1 belongs to the representation space of U = Ind_N^G χ iff there exists a function h on G such that
h1(x) = ∫ h(xn)χ(n)dn

Note that this condition fails for non­compact groups since the integral dn on
N cannot be normalized in general. Then, the condition for f to be orthogonal
to U is that J
f (x)U (x−1 )dx = 0 − − − (4)
G
and this condition is equivalent to
J
f (x)h1 (xy)dx = 0

for all h1 in the representation space of U or equivalently,


J
f (x)h(xyn)χ(n)dxdn = 0

for all functions h on G. This condition is equivalent to


J
f (x)h(xyn)χ(n)π(y −1 )dxdydn = 0

for all unitary representations π of G which is in turn equivalent to


J
f (x)h(z)χ(n)π(nz −1 x)dxdzdn = 0
G×G×N

or equivalently,
ˆ ˆ(π)∗ = 0 − − − (5)
Pχ,π,N .h(π).f
for all functions h on G and for all unitary representations π of G where
J
Pχ,π,N = P = χ(n)π(n)dn
N

Now consider the following condition on a function f defined on G:


J
f (xny)χ(n)dn = 0, x, y ∈ G
N

This condition is equivalent to


J
f (xny)χ(n)π(y −1 )dydn = 0
G×N

for all unitary representations π of G, or equivalently,
∫_{G×N} f(z)χ(n)π(z^{-1}xn)dzdn = 0
and this is equivalent to
f̂(π)π(x)P = 0,  x ∈ G − − − (6)
which is in turn equivalent to
f̂(π)ĥ(π)P = 0 − − − (7)
for all functions h on G, as follows by multiplying (6) with h(x^{-1}) and integrating over G. This condition is the same as
P_{χ,π,N}ĥ(π)^*f̂(π)^* = 0
or, since h(x) can be replaced by h(x^{-1}), the condition is equivalent to
P_{χ,π,N}ĥ(π)f̂(π)^* = 0 − − − (8)
for all functions h on G and all unitary representations π of G. This condition is the same as (5). Thus, we have established that a necessary and sufficient condition for f to be orthogonal to all the matrix elements of U = Ind_N^G χ is that f should satisfy
∫_N f(xny)χ(n)dn = 0,  x, y ∈ G

In other words, f should be a generalized cusp form corresponding to the char­


acter χ of N .

4.2 On the dynamics of breathing in the presence of a mask
Let V denote the volume between the face and the mask. Oxygen passes from
outside the mask to within which the person breathes in. While breathing out,
the person ejects CO2 along with of course some fraction of oxygen. Let no (t)
denote the number of oxygen molecules per unit volume at time t and nc (t)
denote the number of carbon­dioxide molecules per unit volume at time t in the
region between the face and the mask. Let nr (t) denote the number of molecules
of the other gases in the same region per unit volume. Let λc (t)dt denote the
probability of a CO2 molecule escaping away the volume between the mask and
the face in time [t, t + dt]. Then let gc (t)dt denote the number of CO2 molecules

being ejected from the nose and mouth of the person per unit time. Then, we
have a simple conservation equation

dnc (t)/dt = −nc (t)λc (t) + gc (t)/V

To this differential equation, we can add stochastic terms and express it as

dnc (t)/dt = −nc (t)λc (t) + gc (t)/V + wc (t)

We can ask the question of determining the statistics of the first time τc at which
the CO2 concentration nc (t) exceeds a certain threshold Nc . Sometimes there
can be a chemical reaction between the CO2 trapped in the volume with some
other gases. Likewise other gases within the volume can react and produce CO2 .
If we take into account these chemical reactions, then there will be additional
terms in the above equation describing the rate of change of concentration of
CO2 . For example, consider a reaction of the form

a.CO2 + b.X → c.Y + d.Z

where a, b, c, d are positive integers. X denotes the concentration of the other


gases, ie, gases other than CO2 within the volume. At time t = 0, there are
nc (0) molecules of CO2 and nX (0) molecules of X and zero molecules of Y and
Z. Let K denote the rate constant at fixed temperature T . Then, we have an
equation of the form

dnc (t)/dt = −K.nc (t)a nX (t)b − − − (1)

according to Arrhenius’ law of mass action. After time t, therefore, nc (0)−nc (t)
molecules of CO2 have reacted with (nc (0) − nc (t))b/a moles of X. Thus,

nX (t) = nX (0) − (nc (0) − nc (t))b/a

from which we deduce

dnX (t)/dt = (b/a)dnc (t)/dt = (−Kb/a)nc (t)a .nX (t)b − − − (2)

equations (1) and (2) have to be jointly solved for nc (t) and nX (t). This can be
reduced to a single differential equation for nc (t):

dnc (t)/dt = −K.nc (t)a (nX (0) − (nc (0) − nc (t))b/a)b − − − (3)

Likewise, we could also have a chemical reaction in which CO2 is produced by a


chemical reaction between two compounds U, V . We assume that such reaction
is of the form
pU + qV → r.CO2 + s.M
Let nU (0), nV (0) denote the number of molecules of U and V at time t = 0.
Then, the rate of at which their concentrations change with time is again given
by Arrhenius’ law of mass action

dnU (t)/dt = −K1 nU (t)p nV (t)q


68 Stochastic Processes in Classical and Quantum Physics and Engineering
76 CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

nV (t) = nV (0) − q(nU (0) − nU (t))/p


and then the number of CO2 molecules generated after time t due to this reaction
will be given by
nc (t) = (r/p)(nU (0) − nU (t))
and also

nV (t) = nV (0) − (q/p)(nU (0) − nU (t)) = nV (0) − (q/r)nc (t)

and hence

dnc (t)/dt = (−r/p)dnU (t)/dt = (r/p)K1 nU (t)p .nV (t)q

= (K1 r/p)(nU (0) − pnc (t)/r)p (nV (0) − (q/r)nc (t))q


Now consider the combination of two such reactions. Writing these reactions down explicitly, we
have
aCO2 + bX → c.Y + d.Z, p.U + q.V → r.CO2 + s.Z
At time t = 0, we have initial conditions on nc (0), nX (0), nU (0), nV (0). Then
the rate at which CO2 gets generated by the combination of these two reactions
and the other non­chemical­reaction processes of diffusion exit etc. is given by

dnc (t)/dt = −Knc (t)a nX (t)b + (K1 r/p)nU (t)p nV (t)q + f (nc (t), t)

where

dnX (t)/dt = (−Kb/a)nc (t)a nX (t)b , dnU (t)/dt = −K1 nU (t)p nV (t)q ,

dnV (t)/dt = (−K1 q/p)nU (t)p nV (t)q


Here, f (nc (t), t) denotes the rate at which CO2 concentration changes due to the
other effects described above. Of course, we could have more complex chemical
reactions in of the sort

a0 CO2 + a1 X1 + ... + an Xn → b1 Y1 + ... + bm Ym ,

for which the law of mass action would give

dnc (t)/dt = −Knc (t)a0 nX1 (t)a1 ...nXn (t)an ,

nXj (t) = nXj (0) − (aj /a0 )(nc (0) − nc (t)), j = 1, 2, ..., n,

nYj (t) = nYj (0) + (bj /a0 )(nc (0) − nc (t)), j = 1, 2, ..., m
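Remark (numerical sketch): the coupled mass-action equations above, together with the escape/inflow term from the conservation equation at the start of this section, can be integrated numerically. In the sketch below all rate constants, stoichiometric coefficients and initial concentrations are arbitrary illustrative values, and the extra term f(nc, t) is modelled as −λc·nc + gc/V; scipy is assumed to be available.

import numpy as np
from scipy.integrate import solve_ivp

lam, g, V = 0.5, 2.0, 1.0          # escape rate, CO2 generation rate, trapped volume (illustrative)
a, b, p, q, r = 1, 1, 1, 1, 1      # stoichiometric coefficients (illustrative)
K, K1 = 0.3, 0.2                   # rate constants (illustrative)

def rhs(t, y):
    nc, nX, nU, nV = y
    dnc = -K*nc**a*nX**b + (K1*r/p)*nU**p*nV**q - lam*nc + g/V
    dnX = -(K*b/a)*nc**a*nX**b
    dnU = -K1*nU**p*nV**q
    dnV = -(K1*q/p)*nU**p*nV**q
    return [dnc, dnX, dnU, dnV]

sol = solve_ivp(rhs, (0.0, 20.0), [1.0, 5.0, 4.0, 4.0])
print(sol.y[0, -1])                # CO2 concentration at the final time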

4.3 Plancherel formula for SL(2, R)


Let G = KAN be the usual Iwasawa decomposition and let B denote the com­
pact/elliptic Cartan subgroup B = {exp(θ(X − Y )), θ ∈ [0, 2π)}. Let L denote
the non­compact/hyperbolic Cartan subgroup L = {exp(tH), t ∈ R}. Let GB
denote the conjugacy class of B, ie, GB is the elliptic subgroup gBg−1, g ∈ G
and likewise GL will denote the conjugacy class of L ie, the hyperbolic subgroup
gLg −1 , g ∈ G. These two subgroups have only the identity element in common.
Now let Θm denote the discrete series characters. We know that
Θm(u(θ)) = −sgn(m)·exp(imθ)/(exp(iθ) − exp(−iθ)),  m ∈ Z,  u(θ) = exp(θ(X − Y))
Θm(a(t)) = exp(mt)/(exp(t) − exp(−t)),  a(t) = exp(tH)
Weyl's integration formula gives
Θm(f) = ∫_G Θm(x)f(x)dx =
∫_0^{2π} Θm(u(θ))Ff,B(θ)ΔB(θ)dθ + ∫_R Θm(a(t))Ff,L(t)ΔL(t)dt
= ∫ sgn(m)exp(imθ)Ff,B(θ)dθ + ∫_R exp(−|mt|)Ff,L(t)dt
= sgn(m)F̂f,B(m) + ∫_R exp(−|mt|)Ff,L(t)dt
where Ff,B and Ff,L are respectively the orbital integrals of f corresponding
to the elliptic and hyperbolic groups and if a(θ) is a function of θ, then â(m)
denotes its Fourier transform, ie, its Fourier series coefficients. Thus,

Ff,B (θ) = ΔB (θ) f (xu(θ)x−1 )dx
G/B

Ff,L (a(t)) = ΔL (t) f (xa(t)x−1 )dx
G/L

where
Δ(θ) = (exp(iθ) − exp(−iθ)), Δ(t) = exp(t) − exp(−t)
Now �
� 2π �
|m|Θm (f ) = ΔL (t)Ff,L (t) |m|Θm (a(t))dt
modd 0 modd
� �
+ ΔB (θ)Ff,B (θ). |m|Θm (u(θ))dθ
R modd

Now, � �
|m|Θm (a(t)) = |m|exp(−|mt|)/(exp(t) − exp(−t))
modd modd
� �
mexp(−mt) = (2k + 1)exp(−(2k + 1)t)
m≥1,modd k≥0

m.exp(−mt) = −d/dt(1 − exp(−t))−1 = exp(−t)(1 − exp(−t))−2 , t ≥ 0
m≥0

So

(2k+1)exp(−(2k+1)t) = exp(−t)(2exp(−2t)(1−exp(−2t))−2 +(1−exp(−2t))−1 )
k ≥0

= exp(−t)(1+exp(−2t))/(1−exp(−2t))2 = (exp(t)+exp(−t))/(exp(t)−exp(−t))2
Thus,
� �
|m|exp(−|mt|) = 2 m.exp(−m|t|) = tanh(t) = w1 (t)
modd m≥0

say.
� �
|m|Θm (u(θ)) = m.exp(imθ)/(exp(iθ) − exp(−iθ)) = w2 (θ)
modd modd

say. Then, the Fourier transform of Ff� ,B (θ) (ie Fourier series coefficients)

Fˆf� ,B (m) = Ff� ,B (θ)exp(imθ)dθ =

− Ff,B (θ)(im)exp(imθ)dθ = −imFˆf,B (m)

Now, we’ve already seen that



Θm (f ) = sgn(m)Fˆf,B (m) + exp(−|mt|)Ff,L (t)dt
R

So

−imΘm (f ) = sgn(m)(−imFˆf,B (m)) − im exp(−|mt|)Ff,L (t)dt

= sgn(m)Fˆf� ,B (m) − im exp(−|mt|)Ff,L (t)dt

Multiplying both sides by sgn(m) gives us



−i|m|Θm (f ) = Fˆf,B

(m) − i|m| exp(−|mt|)Ff,L (t)dt

Summing over odd m gives
−i Σ_{m odd} |m|Θm(f) = Σ_{m odd} F̂'_{f,B}(m) − i ∫ w1(t)F_{f,L}(t)dt
Now,
Σ_m F̂'_{f,B}(m) = F'_{f,B}(0)
Now,
F_{f,B}(θ) = (exp(iθ) − exp(−iθ)) ∫_{G/B} f(xu(θ)x^{-1})dx
which gives, on differentiating w.r.t. θ at θ = 0,
F'_{f,B}(0) = 2if(1)
Thus, we get the celebrated Plancherel formula for SL(2, R),
2f(1) = −Σ_m |m|Θm(f) + ∫ w1(t)F_{f,L}(t)dt

Now we know that the Fourier transform



F̂f,L (µ) = Ff,L (t)exp(itµ)dt
R

equals Tiµ (f ), the character of the Principal series corresponding to iµ, µ ∈ R.


Thus defining �
ŵ1 (µ) = w1 (t)exp(itµ)dt
R
the Plancherel formula can be expressed in explicit form:
� �
2f (1) = − |m|Θm (f ) + ŵ(µ)∗ .Tiµ (f )dµ
m R

This formula was generalized to all semisimple Lie groups and also to p­adic
groups by Harish­Chandra.

4.4 Some applications of the Plancherel formula for a Lie group to image processing
Let G be a group, let its irreducible characters be χ, χ ∈ Ĝ, and let dµ(χ) be its Plancherel measure:
δe(g) = ∫ χ(g)·dµ(χ)
or equivalently,
f(e) = ∫ χ(f)dµ(χ)


for all f in an appropriate Schwartz space. Consider an invariant for image


pairs of the form

Iχ (f1 , f2 ) = f1 (x)χ(xy −1 )f2 (y)dxdy
G×G

Then if ψ is any class function on Ĝ, we have



χ(ψ).Iχ (f1 , f2 )dµ(χ) =


f1 (x)χ(ψ 0 )χ(xy −1 )f2 (y)dxdydµ(χ)

= f1 (x)ψ1 (xy −1 )f2 (y)dxdy

where �
ψ1 (x) = χ(ψ 0 )χ(x)dµ(χ), ψ 0 (x) = ψ(x−1 )

so that �
χ(ψ ) = 0
ψ(x−1 )χ(x)dx

Remark: let πχ be an irreducible representation having character χ. Then, we


can write �
f (x) = T r(fˆ(πχ )πχ (x))dµ(χ)

where �
fˆ(πχ ) = f (x)πχ (x−1 ))dx

Thus, in particular, if f is a class function,



f (x) = f (y)T r(πχ (y −1 )πχ (x))dµ(χ)dy

= f (y)χ(y −1 x)dµ(χ)dy

as follows from the Plancherel formula.


Problem: What is the relationship between ψ1 (x) and ψ(x) when ψ is a class
function ?
Note that if the group is compact, then if ψ is a class function and π is an
irreducible unitary representation of dimension d, we have
� �
ψ̂(π) = ψ(y)π(y)∗ )dy = ψ(xyx−1 )π(y −1 )dydx

= ψ(y)π(x−1 y −1 x)dydx

Now, � �
−1
πab (x yx)dx = πac (x−1 )πcd (y)πdb (x)dx

= πcd (y) π̄ca (x).πdb (x)dx = d−1 πcd (y)δ(c, d)δ(a, b)

= d−1 πcc (y)δ(a, b) = d−1 χ(y)δ(a, b)


where χ = T r(π) is the character of π. This means that

ψ̂(π) = d−1 χ(ψ)I

Thus, for a class function on a compact group, we have



ψ(g) = d(χ)−1 T r(χ(ψ 0 )π(g))
χ∈Ĝ


= d(χ)−1 χ(ψ 0 )χ(g)
χ∈Ĝ
� �
= d(χ)−1 χ(g) ψ(x)χ̄(x)dx
ˆ
χ∈G

This is the same as saying that



δg (x) = d(χ)−1 χ(g)χ̄(x)
ˆ
χ∈G

Is this result true for an arbitrary semisimple Lie group when we replace d(χ) by
the Plancherel measure/formal dimension on the space of irreducible characters
? In other words, is the result

δg (x) = χ(g)χ̄(x)dµ(χ)

true ?
Chapter 5

Some Problems in Quantum Information Theory
A problem in quantum information theory used in the proof of the coding the­
orem for Cq channels : Let S, T be two non­negative operators with S ≤ 1, ie,
0 ≤ S ≤ 1, T ≥ 0. Then

1 − (S + T )−1/2 S(S + T )−1/2 ≤ 2 − 2S + 4T

or equivalently,

(S + T )−1/2 S(S + T )−1/2 ≥ 2S − 1 − 4T

To prove this, we define the operators


√ √
A = T ((T + S)−1/2 − 1), B = T

Then, the equation


(A − B)∗ (A − B) ≥ 0
or equivalently,
A∗ A + B ∗ B ≥ A∗ B + B ∗ A
gives

((T +S)−1/2 −1)T +T ((T +S)−1/2 −1) ≤ T +((T +S)−1/2 −1)T ((T +S)−1/2 −1)

This equation can be rearranged to give

2(T + S)−1/2 T + 2T (T + S)−1/2 − 4T ≤ (T + S)−1/2 T.(T + S)−1/2

= (T + S)−1/2 (T + S − S)(T + S)−1/2 = 1 − (T + S)−1/2 S(T + S)−1/2


A rearrangement of this equation gives

4T +1+(S+T )−1/2 S(S+T )−1/2 ≥ 2(S+T )−1/2 S(S+T )−1/2

+2(S+T )−1/2 T +2T (S+T )−1/2


So to prove the stated result, it suffices to prove that

(S + T )−1/2 S(S + T )−1/2 + (S + T )−1/2 T + T (S + T )−1/2 ≥ S

Let A, B be two states. The hypothesis testing problem amounts to choosing


an operator 0 ≤ T ≤ 1 such that

C(T ) = T r(A(1 − T )) + T r(BT )

is a minimum. Now,
C(T ) = 1 + T r((B − A)T )
It is clear that this is a minimum iff

T = {B − A ≤ 0} = {A − B ≥ 0}

ie, T is the projection onto the space spanned by those eigenvectors of B − A that
have non-positive eigenvalues, or equivalently the space spanned by those eigenvectors
of A − B that have non-negative eigenvalues. The minimum value of the cost
function is then

C(min) = T r(A{A < B}) + T r(B{A > B})

Note that the minimum is attained when T is a projection.


An inequality: Let 0 ≤ s ≤ 1. Then if A, B are any two positive operators,
we have
T r(A{A < B}) ≤ T r(A1−s B s )
To see this we note that

T r(A1−s B s ) ≤ T r((B+(A−B)+ )1−s B s ) ≤ T r(B+(A−B)+ )1−s (B+(A−B)+ )s )

= T r(B+(A−B)+ ) = T r(B)+T r((A−B)+ ) = T r(B)+T r((A−B){A−B > 0})


= T r(B{A − B < 0}) + T r(A.{A − B > 0})
Other related inequalities are as follows:

T r(A1−s B s ) ≥ T r((B − (B − A)+ )1−s B s )

≥ T r((B − (B − A)+ )1−s .(B − (B − A)+ )s ) = T r((B − (B − A)+ )


= T r(B) − T r((B − A)+ ) = T r(B) − T r((B − A){B > A})
= T r(B.{A > B}) + T r(A.{B > A})
In particular, when A, B are states, the minimum probability of discrimination
error has the following upper bound:

P (error) = T r(A.{B > A}) + T r(B.{A > B}) ≤ min0≤s≤1 T r(A1−s B s )



= min0≤s≤1 exp(φ(s|A|B))
where
φ(s|A|B) = log(T r(A1−s B s ))
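Remark (numerical sketch): for two concrete states the optimal test and the bound min_{0≤s≤1} Tr(A^{1−s}B^s) can be compared directly. The sketch below draws two random qubit states (an arbitrary illustrative choice) and checks that the minimum cost Tr(A{A < B}) + Tr(B{A > B}) does not exceed the bound; numpy and scipy are assumed available.

import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(3)
def rand_state(d=2):
    G = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho/np.trace(rho)

A, B = rand_state(), rand_state()
w, U = np.linalg.eigh(A - B)
P = U[:, w > 0] @ U[:, w > 0].conj().T                  # projection {A - B > 0}
err = (np.trace(A @ (np.eye(2) - P)) + np.trace(B @ P)).real
s_grid = np.linspace(1e-3, 1 - 1e-3, 200)
bound = min(np.trace(mpow(A, 1 - s) @ mpow(B, s)).real for s in s_grid)
print(err, bound)                                       # err <= bound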
To obtain the minimum in the upper bound, we set the derivative w.r.t s to
zero to get
d/ds(T r(A1−s B s ) = 0
or
T r(log(A)A1−s B s ) = T r(A1−s B s log(B))
It is not clear how to solve this equation for s. Now define

ψ1 (s) = log(T r(A1−s B s ))/s, ψ2 (s) = log(T r(A1−s B s )/(1 − s)

Then an elementary application of L'Hopital's rule gives us

ψ1 (0+) = lims↓0 ψ1 (s) = −D(A|B) = −T r(A.(log(A) − log(B)),

ψ2 (1−) = lims↑1 ψ2 (s) = −D(B|A) = −T r(B.(log(B) − log(A))


Consider now the problem of distinguishing between tensor product states A⊗n
and B ⊗n . Let Pn (e) denote the minimum error probability. Then, by the above,

Pn (e) ≤ min0≤s≤1 exp(logT r(A⊗n(1−s) B ⊗ns )

Now,
T r(A⊗n(1−s) B ⊗ns ) = (T r(A1−s B s ))n
Therefore, we can write
limsupn→∞ n−1 log(Pn (e)) ≤

min0≤s≤1 logT r(A1−s B s )


Define
F(s) = Tr(A^{1−s}B^s)
Then,
F'(s) = −Tr(A^{1−s}log(A)B^s) + Tr(A^{1−s}B^s log(B))
F''(s) = Tr(A^{1−s}(log A)^2 B^s) + Tr(A^{1−s}B^s(log B)^2) − 2Tr(A^{1−s}log(A)B^s log(B)) ≥ 0
where the non-negativity follows from the Cauchy–Schwarz inequality applied to the positive semidefinite sesquilinear form ⟨X, Y⟩ = Tr(A^{1−s}X B^s Y) on Hermitian matrices.
Hence F(s) is a convex function and hence F'(s) is increasing. Further,
F � (0) = −T r(A.log(A))+T r(A.log(B)) = −D(A|B) ≤ 0, F � (1)

= −T r(B.log(A))+T r(B.log(B)) = D(B|A) ≥ 0


Hence F � (s) changes sign in [0, 1] and hence it must be zero at some s ∈ [0, 1],
say at s0 and since F is convex, it attains its minimum at s0 .

Let ρ, σ be two states and let
ρ = Σ_i p(i)|ei >< ei|,  σ = Σ_i q(i)|fi >< fi|
be their respective spectral decompositions. Define
P(i, j) = p(i)| < ei|fj > |²,  Q(i, j) = q(j)| < ei|fj > |²

Then P, Q are both classical probability distributions. We have for 0 ≤ T ≤ 1,



T r(ρ(1 − T )) = < ei |ρ|fj >< fj |1 − T |ei >
i,j


= p(i) < ei |fj > (< fj |ei > − < fj |T |ei >)
i,j

=1− p(i) < ei |fj > t(j, i)
i,j

and likewise,
� �
T r(σT ) = < ei |σ|fj > t(j, i) = q(j) < ei |fj > t(j, i)
i,j i,j

where
t(j, i) =< fj |T |ei >
So,

T r(ρ(1 − T )) + T r(σT ) = 1 + t(j, i)(q(j) − p(i)) < ei |fj > t(j, i)
i,j

Now,
� � �
| p(i) < ei |fj > t(j, i)|2 ≤ p(i)|ei |fj > |2 . p(i)|t(j, i)|2
i,j i,j i,j


= p(i)|t(j, i)|2
i,j

Thus, �
T r(ρ(1 − T )) ≥ 1 − p(i)|t(j, i)|2
i,j

T r(σT ) = q(j) < ei |fj > t(j, i)
i,j

So �
T r(σT )2 ≤ q(j)|t(j, i)|2
i,j

Then, if c is any positive real number,


� �
T r(ρ(1 − T )) − c.T r(σT ) ≥ 1 − p(i)|t(j, i)|2 − c. q(j)|t(j, i)|2
i,j i,j

Suppose now that T is a projection. Then,


� �
T r(ρ(1 − T )) = p(i) < ei |1 − T |ei >= p(i) < ei |(1 − T )2 |ei >
i i

= p(i)| < ei |1 − T |fj > |2 ≥i,j a(i, j)| < ei |1 − T |fj > |2
i,j

and likewise,
� �
T r(σT ) = q(j)| < fj |T |ei > |2 ≥ a(i, j)| < fj |T |ei > |2
i,j i,j

where
a(i, j) = min(p(i), q(j))
Now,

| < ei |1 − T |fj > |2 + < ei |T |fj > |2 ≥ (1/2)| < ei |1 − T |fj > + < ei |T |fj > |2

= (1/2)| < ei |fj > |2


Thus, �
T r(ρ(1 − T )) + T r(σT ) ≥ (1/2) a(i, j)| < ei |fj > |2
i,j
� �
= (1/2) P (i, j)χp(i)≤q(j) + (1/2) Q(i, j)χp(i)>q(j)
i,j i,j
� �
= (1/2) P (i, j)χP (i,j)≤Q(i,j) + (1/2) Q(i, j)χP (i,j)>Q(i,j)
i,j i,j

Now, for tensor product states, this result gives

T r(ρ⊗n (1 − Tn )) + T r(σ ⊗n Tn ) ≥

(1/2)P n (P n ≤ Qn ) + (1/2)Qn (P n > Qn )


for any measurement Tn in the tensor product space, ie, 0 ≤ Tn ≤ 1. Now by
the large deviation principle as n → ∞, we have

n^{-1}·log(P^n(P^n ≤ Q^n)) ≈ −inf_{x≤0} I1(x)

where I1 is the rate function under P n of the random family n−1 log(P n /Qn ).
Evaluating the moment generating function of this r.v gives us

Mn (ns) = EP n exp(s.log(P n /Qn )) = EP n (P n /Qn )s



= (EP (P/Q)s )n = ( P 1−s Qs )n = Ms (P, Q)n
and hence the rate function is

I1 (x) = sups (sx − logMs (P, Q)) =



sups (sx − log( P (ω)1−s Q(ω)s ))
ω

Then, �
infx≤0 I1 (x) = sups≥0 − log( P (ω)1−s Q(ω)s ))
ω

= −infs≥0 log( P (ω)1−s Q(ω)s ))
ω

This asymptotic result can also be expressed as

liminfn→∞ inf0≤Tn ≤1 [n−1 log(T r(ρ⊗n (1 − Tn )) + T r(σ ⊗n Tn ))]


� �
≥ −inf0≤s≤1 [log( P 1−s Qs )] − inf0≤s≤1 [log Q1−s P s ]
Relationship between quantum relative entropy and classical relative entropy:
Tr(ρ·log(ρ)) = Σ_i p(i)log(p(i))
Tr(ρ·log(σ)) = Σ_{i,j} p(i)log(q(j))| < ei|fj > |²
D(ρ|σ) = Tr(ρ·log(ρ) − ρ·log(σ))
Therefore,
D(ρ|σ) = Σ_{i,j} p(i)| < ei|fj > |²(log(p(i)) + log(| < ei|fj > |²))
− Σ_{i,j} p(i)| < ei|fj > |²(log(q(j)) + log(| < ei|fj > |²))
(Note the cancellations)
= Σ_{i,j} P(i, j)log(P(i, j)) − Σ_{i,j} P(i, j)log(Q(i, j))
= D(P|Q)
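Remark (numerical sketch): the identity D(ρ|σ) = D(P|Q) is easy to confirm numerically for generic full-rank states. The sketch below (dimension 3 and the random-state construction are arbitrary illustrative choices) assumes numpy and scipy are available.

import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(4)
def rand_state(d):
    G = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho/np.trace(rho)

d = 3
rho, sigma = rand_state(d), rand_state(d)
p, e = np.linalg.eigh(rho)                     # rho   = sum_i p(i)|e_i><e_i|
q, f = np.linalg.eigh(sigma)                   # sigma = sum_j q(j)|f_j><f_j|
O = np.abs(e.conj().T @ f)**2                  # O[i, j] = |<e_i|f_j>|^2
P = p[:, None]*O
Q = q[None, :]*O
D_cl = np.sum(P*(np.log(P) - np.log(Q)))
D_q = np.trace(rho @ (logm(rho) - logm(sigma))).real
print(D_cl, D_q)                               # the two values agree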

Let X be a r.v. with moment generating function

M (λ) = E[exp(λX)]

Its logarithmic moment generating function is
Λ(λ) = log(M(λ))
Let µ = E(X) be its mean. Then,
µ = Λ'(0)
Further,
Λ''(λ) ≥ 0 ∀λ
ie, Λ is a convex function. Hence Λ'(λ) is increasing for all λ. Now suppose x ≥ µ. Consider
ψ(x, λ) = λ·x − Λ(λ)
Since
∂²_λψ(x, λ) = −Λ''(λ) ≤ 0
it follows that ∂_λψ(x, λ) is decreasing, or equivalently that ψ(x, λ) is concave in λ. Suppose λ0 is such that
∂_λψ(x, λ0) = 0
Then it follows that for a fixed value of x, ψ(x, λ) attains its maximum at λ0. Now ∂_λψ(x, λ0) = 0 and ∂_λψ(x, λ) is decreasing in λ. It follows that ∂_λψ(x, λ) ≤ 0 for λ ≥ λ0, and hence ψ(x, λ) is decreasing for λ ≥ λ0. Also, for λ < λ0, ∂_λψ(x, λ) ≥ 0 (since ∂_λψ(x, λ0) = 0 and ∂_λψ(x, λ) is decreasing in λ), which means that ψ(x, λ) is increasing for λ < λ0. Now
∂_λψ(x, 0) = x − µ ≥ 0
by hypothesis. Since ∂_λψ(x, λ) is decreasing in λ, it follows that ∂_λψ(x, λ) ≥ 0 for all λ ≤ 0. This means that ψ(x, λ) is increasing for all λ ≤ 0. In particular,
ψ(x, λ) ≤ ψ(x, 0) = 0,  ∀λ ≤ 0
Thus,
I(x) = sup_{λ∈R} ψ(x, λ) = sup_{λ≥0} ψ(x, λ)
in the case when x ≥ µ. Consider now the case when x ≤ µ.

5.1 Quantum information theory and quantum blackhole physics
Let ρ1 (0) be the state of the system of particles within the event horizon of a
Schwarzschild blackhole and ρ2(0) the state outside. At time t = 0, the state of
the total system to the interior and the exterior of the blackhole is

ρ(0) = ρ1 (0) ⊗ ρ2 (0)


82 Stochastic Processes in Classical and Quantum Physics and Engineering
90CHAPTER 5. SOME PROBLEMS IN QUANTUM INFORMATION THEORY

Here, as a prototype example, we are considering the Klein-Gordon field of particles in the blackhole as well as outside it. Let χ(x) denote this KG field. Its action functional is given by
S[χ] = (1/2)∫(g^{µν}(x)√(−g(x))χ,µ(x)χ,ν(x) − µ²√(−g(x))χ(x)²)d⁴x
= (1/2)∫[g^{µν}(x)χ,µ(x)χ,ν(x) − µ²χ(x)²]√(−g(x))d⁴x
which in the case of the Schwarzschild blackhole reads
S[χ] = (1/2)∫[α(r)^{-1}χ²,t − α(r)χ²,r − r^{-2}χ²,θ − r^{-2}sin(θ)^{-2}χ²,φ]r²sin(θ)dtdrdθdφ = ∫L d⁴x

The corresponding KG Hamiltonian can also be written down. The canonical position field is χ and the canonical momentum field is
π = δS/δχ,t = α(r)^{-1}χ,t
so that the Hamiltonian density is
H = π·χ,t − L = (1/2)[α(r)^{-1}χ²,t + α(r)χ²,r + r^{-2}χ²,θ + r^{-2}sin(θ)^{-2}χ²,φ + µ²χ²]
= (1/2)[α(r)π² + α(r)χ²,r + r^{-2}χ²,θ + r^{-2}sin(θ)^{-2}χ²,φ + µ²χ²]
The KG field Hamiltonian is then
H = ∫H d³x = ∫H r²sin(θ)drdθdφ

The radial Hamiltonian: Assume that the angular oscillations of the KG field are as per the classical equations of motion of the field corresponding to an angular momentum square of l(l + 1). Then, we can write
χ(t, r, θ, φ) = ψ(t, r)Ylm(θ, φ)
and substituting this into the action functional, performing integration by parts using
L²Ylm(θ, φ) = l(l + 1)Ylm(θ, φ),
L² = −[(sin θ)^{-1}(∂/∂θ)(sin(θ)∂/∂θ) + (sin θ)^{-2}∂²/∂φ²]
gives us the "radial action functional" as
Sl[ψ] = (1/2)∫(α(r)^{-1}ψ²,t − α(r)ψ²,r − l(l + 1)ψ²/r² − µ²ψ²)r²dtdr
and the corresponding radial Hamiltonian as
Hl(ψ, π) = (1/2)∫_0^∞ [α(r)(π² + ψ²,r) + (µ² + l(l + 1)/r²)ψ²]r²dr
where ψ = ψ(t, r), π = π(t, r), the canonical position and momentum fields are
functions of only time and the radial coordinate.

5.2 Harmonic analysis on the Schwartz space of G = SL(2, R)
Let fmn(x, µ) denote the matrix elements of the unitary principal series appearing in the Plancherel formula and let φkmn(x) denote the matrix elements of the discrete series appearing in the Plancherel formula. Then we have a splitting of the Schwartz space into the direct sum of the discrete series Schwartz space and the principal series Schwartz space. Thus, for any f ∈ L²(G), we have the Plancherel formula
∫_G |f(x)|²dx = Σ_{kmn} d(k)| < f|φkmn > |² + Σ_{mn}∫_R | < f|fmn(., µ) > |²c(mn, µ)dµ

where d(k)' s are the formal dimensions of the discrete series while c(mn, µ)
are determined from the wave packet theory for principal series, ie, from the
asymptotic behaviour of the principal series matrix elements. The asymptotic
behaviour of the principal and discrete matrix elements is obtained from the
differential equations

Ω.fmn (x, µ) = −µ2 fmn (x, µ)

where Ω is the Casimir element for SL(2, R). We can express x ∈ G as x =


u(θ1 )a(t)u(θ2 ) and Ω as a second order linear partial differential operator in
∂/∂t, ∂/∂θk , k = 1, 2. Then, observing that

fmn (u(θ1 )a(t)u(θ2 ), µ) = exp(i(mθ1 + nθ2 ))fmn (a(t), µ)

we get on noting that the restriction of the Casimir operator to R.H is given by

d2
(Ωf )(a(t)) = ΔL (t)−1 ΔL (t)f (a(t))
dt2
where
ΔL (t) = et − e−t
as follows by applying both sides to the finite dimensional irreducible characters
of SL(2, R) restricted to R.H on noting that the restriction of such a character is
given by (exp(mt)−exp(−mt))/(exp(t)−exp(−t)), we get a differential equation
of the form
d2
fmn (a(t), µ) − qmn (t)fmn (a(t), µ) = −µ2 fmn (a(t), µ)
dt2
where qmn (t) is expressed in terms of the hyperbolic Sinh functions. From
this equation, the asymptotic behaviour of the matrix elements fmn (a(t), µ) as
|t| → ∞ may be determined. It should be noted that the discrete series modules
can be realized within the principal series modules and hence the asymptotic be­
haviour of the matrix elements fmn (a(t), µ) will determine that of the principal
series when µ is non­integral and that of the discrete series when µ is integral.

Specifically, consider a principal series representation π(x). Let em denote its


highest weight vector and define

fm (x) =< π(x)em |em >

where now x is regarded as an element of the Lie algebra g of G. We have for


any a ∈ U (g), the universal enveloping algebra of g, the fact that

π(a).em = 0 =⇒ afm (x) = f (x; a) = 0

This means that if we consider the U (g)­module V = U (g)fm and the map
T : π(U (g))em → V defined by T (π(a)em ) = a.fm , then it is clear that T is a
well defined linear map with kernel

KT = {π(a)em : a ∈ U (g), a.fm = 0}

Note that π(a)em ∈ KT implies a.fm = 0 implies b.a.fm = 0 implies π(b)π(a)em =


π(b.a)em ∈ KT for any a, b ∈ U (g). Thus, KT is U (g) invariant. Then, the
U (g)­module V is isomorphic to the quotient module π(U (g))em /KT . However,
π(U (g))em is an irreducible (discrete series) module. Hence, the submodule
KT of it must be zero and hence, we obtain the fundamental conclusion that
V = U (g)fm is an irreducible discrete series module. Note that fm is a princi­
pal series (reducible since m is an integer) matrix element and this means that
we have realized the discrete series representations within the principal series.
This also means that in order to study the asymptotic properties of the matrix
elements of the discrete series, it suffices to study asymptotic properties of dif­
ferential operators in U (g) acting on the principal series matrix elements fm .

5.3 Monotonicity of quantum entropy under pinching
Let |ψ1 >, ..., |ψn > be normalized pure states, not necessarily orthogonal. Let
p(i), i = 1, 2, ..., n be a probability distribution and let |i >, i = 1, 2, ..., n be an
orthonormal set. Consider the pure state
|ψ > = Σ_{i=1}^n √p(i)|ψi > ⊗|i >
Note that this is a state because
< ψ|ψ > = Σ_{i,j} √(p(i)p(j)) < ψj|ψi >< j|i > = Σ_{i,j} √(p(i)p(j)) < ψj|ψi > δ(j, i) = Σ_i p(i) = 1
We have
ρ1 = Tr2(|ψ >< ψ|) = Σ_i p(i)|ψi >< ψi|
i
ρ2 = Tr1(|ψ >< ψ|) = Σ_{i,j} √(p(i)p(j)) < ψj|ψi > |i >< j|

Then by Schmidt's theorem,
H(ρ1) = H(ρ2) ≤ H(p)
Now let σ1, σ2 be two mixed states and let p, q > 0, p + q = 1. Consider the spectral decompositions
σ1 = Σ_i p(i)|ei >< ei|,  σ2 = Σ_i q(i)|fi >< fi|
Then,
pσ1 + qσ2 = Σ_i (pp(i)|ei >< ei| + qq(i)|fi >< fi|)
So by the above result,
H(pσ1 + qσ2) ≤ H({pp(i), qq(i)})
= −Σ_i (pp(i)log(pp(i)) + qq(i)log(qq(i)))
= −p·log(p) − q·log(q) + pH({p(i)}) + qH({q(i)}) = H(p, q) + p·H(σ1) + q·H(σ2)


where
H(p, q) = −p·log(p) − q·log(q)
Now consider several mixed states σ1, ..., σn and a probability distribution p(i), i = 1, 2, ..., n. Then
Σ_{i=1}^n p(i)σi = p(1)σ1 + (1 − p(1))Σ_{i=2}^n (p(i)/(1 − p(1)))σi
So the same argument and induction gives
H(Σ_{i=1}^n p(i)σi) ≤ H(p(1), 1 − p(1)) + p(1)H(σ1) + (1 − p(1))Σ_{i=2}^n (p(i)/(1 − p(1)))H(σi)
+ (1 − p(1))H(p(i)/(1 − p(1)), i = 2, 3, ..., n)
= H(p(i), i = 1, 2, ..., n) + Σ_{i=1}^n p(i)H(σi)

Remark: Let ρ be a state and {Pi} a PVM. Define the state
σ = Σ_i Pi ρ Pi = 1 − Σ_i Pi(1 − ρ)Pi

Then
0 ≤ D(ρ|σ) = T r(ρ.log(ρ)) − T r(ρ.log(σ))
= −H(ρ) − Σ_i Tr(ρ·Pi·log(Pi ρ Pi)·Pi) = −H(ρ) − Σ_i Tr(Pi ρ Pi·log(Pi ρ Pi))
= −H(ρ) − Σ_i Tr(Pi ρ Pi·log(Σ_j Pj ρ Pj)) = −H(ρ) + H(σ)

Thus,
H(σ) ≥ H(ρ)
which when stated in words, reads, pinching increases the entropy of a state.
Remark: We've made use of the following fact from matrix theory:
log(Σ_i Pi ρ Pi) = Σ_i Pi log(Pi ρ Pi) Pi
because the Pi's satisfy Pj Pi = Pi δ(i, j). In fact, we write
Σ_i Pi ρ Pi = 1 − Σ_i Pi(1 − ρ)Pi = 1 − A
and use
log(1 − z) = −z − z²/2 − z³/3 − ...
combined with Pj Pi = Pi δ(i, j) to get
log(Σ_i Pi ρ Pi) = −A − A²/2 − A³/3 − ...
where
A = Σ_i Pi(1 − ρ)Pi,  A² = Σ_i (Pi(1 − ρ)Pi)²,  ...,  Aⁿ = Σ_i (Pi(1 − ρ)Pi)ⁿ
so that
log(1 − A) = Σ_i Pi log(Pi ρ Pi) Pi

A particular application of this result has been used above in the form
��
H( p(i)p(j) < ψj |ψi > |i >< j|)
i
��
≤ H( p(i)p(j) < ψj |ψi >< k|i >< j|k > |k >< k|)
i,k
� �
= H( p(k)|k >< k|) = H({p(k)}) = − p(k).log(p(k))
k k
Chapter 6

Lectures on Statistical Signal Processing

[1] Large deviation principle applied to the least squares problem of estimating
vector parameters from sequential iid vector measurements in a linear model.

X(n) = H(n)θ + V (n), n = 1, 2, ...

The V(n)'s are iid and have the pdf pV. Then θ is estimated based on the
measurements X(n), n = 1, 2, ..., N using the maximum likelihood method as

θ̂(N) = argmax_θ N^{-1}Σ_{n=1}^N log(pV(X(n) − H(n)θ))

Consider now the case when H(n) is a matrix valued stationary process in­
dependent of V (.). Then, the above maximum likelihood estimate is replaced
by

θ̂(N) = argmax_θ ∫[Π_{n=1}^N pV(X(n) − H(n)θ)]pH(H(1), ..., H(N))Π_{j=1}^N dH(j)

In the special case when the H(n)� s are iid, the above reduces to

θ̂(N) = argmax_θ N^{-1}Σ_{n=1}^N log(∫ pV(X(n) − Hθ)pH(H)dH)

The problem is to calculate the large deviation properties of θ̂(N ), namely the
rate at which P (|θ̂(N ) − θ| > δ) converges to zero as N → ∞.


6.1 Large deviation analysis of the RLS algorithm
Consider the problem of estimating θ from the sequential linear model

X(n) = H(n)θ + V (n), n ≥ 1

sequentially. The estimate based on data collected up to time N is

θ(N) = [Σ_{n=1}^N H(n)^T H(n)]^{-1}[Σ_{n=1}^N H(n)^T X(n)]

To cast this algorithm in recursive form, we define

R(N) = Σ_{n=1}^N H(n)^T H(n),  r(N) = Σ_{n=1}^N H(n)^T X(n)

Then,

R(N +1) = R(N )+H(N +1)T H(N +1), r(N +1) = r(N )+H(N +1)T X(N +1)
Then,

R(N+1)^{-1} = R(N)^{-1} − R(N)^{-1}H(N+1)^T(I + H(N+1)R(N)^{-1}H(N+1)^T)^{-1}H(N+1)R(N)^{-1}
so that

θ(N + 1) = θ(N ) + K(N + 1)(X(N + 1) − H(N + 1)θ(N ))

where K(N +1) is a Kalman gain matrix expressible as a function of R(N ), H(N +
1). Our aim is to do a large deviation analysis of this recursive algorithm. Define

δθ(N ) = θ(N ) − θ

Then we have
δθ(N + 1) = δθ(N ) + K(N + 1)(X(N + 1) − H(N + 1)θ − H(N + 1)δθ(N ))

= δθ(N ) + K(N + 1)(V (N + 1) − H(N + 1)δθ(N ))

We assume that the noise process V (.) is small and perhaps even non­Gaussian.
This fact is emphasized by replacing V (n) with �.V (n). Then, the problem
is to calculate the relationship between the LDP rate function of δθ(N ) and
δθ(N + 1).
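Remark (numerical sketch): the recursion above is the standard recursive least squares update, and a minimal Python sketch on synthetic data is given below. The concrete gain K(N+1) = P(N+1)H(N+1)^T with P = R^{-1}, as well as the dimensions and the noise level, are illustrative assumptions consistent with the batch estimate, not a prescription from the text.

import numpy as np

rng = np.random.default_rng(7)
p, m, Nmax, eps = 3, 2, 200, 0.05              # parameter dim, output dim, samples, noise level
theta_true = rng.normal(size=p)

H0 = rng.normal(size=(p, p))                   # small initial batch so that R is invertible
X0 = H0 @ theta_true + eps*rng.normal(size=p)
P = np.linalg.inv(H0.T @ H0)                   # P(N) = R(N)^{-1}
theta = P @ H0.T @ X0

for _ in range(Nmax):
    H = rng.normal(size=(m, p))                # H(N+1)
    X = H @ theta_true + eps*rng.normal(size=m)
    P = P - P @ H.T @ np.linalg.inv(np.eye(m) + H @ P @ H.T) @ H @ P
    K = P @ H.T                                # Kalman gain K(N+1)
    theta = theta + K @ (X - H @ theta)

print(np.linalg.norm(theta - theta_true))      # small, of the order of eps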

6.2 Problem suggested by my colleague Prof. Dhananjay
Large deviation analysis of magnet with time varying magnetic moment falling
through a coil. Assume that the coil is wound around a tube with n0 turns per
unit length extending from a height of h to h + b. The tube is cylindrical and
extends from z = 0 to z = d. Taking relativity into account via the retarded
potentials, if m(t) denotes the magnetic moment of the magnet, then the vector
potential generated by it is approximately obtained from the retarded potential
formula as
A(t, r) = µm(t − r/c) × r/4πr3
where we assume that the magnet is located near the origin. This approximates
to
A(t, r) = µm(t) × r/4πr3 − µm� (t) × r/4πcr2
Now Assume that the magnet falls through the central axis with its axis always
being oriented along the −ẑ direction. Then, after time t, let its z coordinate
be ξ(t). We have the equation of motion

M ξ �� (t) = F (t) − M g

where F (t) is the z component of the force exerted by the magnetic field gener­
ated by the current induced in the coil upon the magnet. Now, the flux of the
magnetic field generated by the magnet through the coil is given by

Φ(t) = n0 Bz (t, X, Y, Z)dXdY dZ
2 ,h≤z≤h+b
X2 +Y 2 ≤RT

where
Bz (t, X, Y, Z) = ẑ.B(t, X, Y, Z),
B(t, X, Y, Z) = curlA(t, R − ξ(t)ẑ)
where
R = (X, Y, Z)
and where RT is the radius of the tube. The current through the coil is then

I(t) = Φ� (t)/R0

where R0 is the coil resistance. The force of the magnetic field generated by
this current upon the falling magnet can now be computed easily. To do this,
we first compute the magnetic field Bc (t, X, Y, Z) generated by the coil using
the usual retarded potential formula

Bc (t, X, Y, Z) = curlAc (t, X, Y, Z)


� 2n0 π
Ac (t, X, Y, Z) = (µ/4π) (I(t−|R−RT (x̂.cos(φ)+ŷ.sin(φ))−ẑ(b/2n0 π)φ|/c).
0

.(−x.sin
ˆ (φ) + ŷ.cos(φ))/|R − RT (x̂.cos(φ) + ŷ.sin(φ)) − ẑ(b/2n0 π)φ|)RT dφ
Note that Ac (t, X, Y, Z) is a function of ξ(t) and also of m(t) and m� (t). It follows
that I(t) is a function of ξ(t), ξ � (t), m(t), m� (t), m�� (t). Then we calculate the
interaction energy between the magnetic fields generated by the magnet and by
the coil as

� −1
Uint (ξ(t), ξ (t), t) = (2µ) (Bc (t, X, Y, Z), B(t, X, Y, Z))dXdY dZ

and then we get for the force exerted by the coil’s magnetic field upon the falling
magnet as
F (t) = −∂Uint (ξ(t), ξ � (t), t)/∂ξ(t)
Remark: Ideally since the� interaction energy between the two magnetic fields
appears in the Lagrangian (E 2 /c2 − B 2 )d3 R/2µ, we should write down the
Euler­Lagrange equations using the Lagrangian

L(t, ξ � (t), ξ � (t)) = M ξ � (t)2 /2 − Uint (ξ(t), ξ � (t), t) − M gξ(t)

in the form
d
∂L/∂ξ � − ∂L/∂ξ = 0
dt
which yields
d
M ξ �� (t) − ∂U/∂ξ � + ∂Uint /∂ξ + M g = 0
dt

6.3 Fidelity between two quantum states


Let ρ, σ be two quantum states in the same Hilbert space H1 . Let P (ρ), P (σ)
denote respectively the set of all purifications of ρ and σ w.r.t the same reference
Hilbert space H2 . Thus, if |u >∈ H1 ⊗ H2 is a purification of ρ, ie |u >∈ P(ρ),
then
ρ = T r2 |u >< u|
and likewise for σ. Then, the fidelity between ρ and σ is defined as

F (ρ, σ) = sup|u>∈P (ρ),|v>∈P (σ) | < u|v > |

Theorem:
√ √
F (ρ, σ) = T r(| ρ. σ|)
Proof: Let |u >∈ P (ρ). Then, we can write

|u >= |ui ⊗ wi >
i

where {|wi >} is an onb for H2 and
ρ = Σ_i |ui >< ui|
Likewise, if |v > is a purification of σ, we can write
|v > = Σ_i |vi ⊗ wi >
where
σ = Σ_i |vi >< vi|
But then
< u|v > = Σ_i < ui|vi >
Define the matrices
X = Σ_i |ui >< wi|,  Y = Σ_i |vi >< wi|
Then we have
ρ = XX^*,  σ = YY^*
Hence, by the polar decomposition theorem, we can write
X = √ρ U,  Y = √σ V
where U, V are unitary matrices. Then,
< u|v > = Σ_i < ui|vi > = Tr(Σ_i |vi >< ui|) = Tr(YX^*) = Tr(√σ V U^* √ρ)
= Tr(√ρ √σ V U^*) = Tr(√ρ·√σ W)

where W = VU^* is a unitary matrix. It is then clear from the singular value decomposition of √ρ·√σ that the maximum of | < u|v > |, which equals the maximum of |Tr(√ρ·√σ W)| as W varies over all unitary matrices, is simply the sum of the singular values of √ρ·√σ, ie,
F(ρ, σ) = Tr|√ρ √σ| = Tr[(√ρ·σ·√ρ)^{1/2}] = Tr[(√σ·ρ·√σ)^{1/2}]
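Remark (numerical sketch): the two closed forms for F(ρ, σ) can be compared directly. The sketch below (random 3-dimensional states, an arbitrary illustrative choice) assumes numpy and scipy are available.

import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(5)
def rand_state(d=3):
    G = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho/np.trace(rho)

rho, sigma = rand_state(), rand_state()
M = sqrtm(rho) @ sqrtm(sigma)
F1 = np.sum(np.linalg.svd(M, compute_uv=False))          # Tr|sqrt(rho) sqrt(sigma)|
F2 = np.trace(sqrtm(sqrtm(sigma) @ rho @ sqrtm(sigma))).real
print(F1, F2)                                            # the two expressions agree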

6.4 Stinespring's representation of a TPCP map, ie, of a quantum operation
Let K be a quantum operation. Thus, for any state ρ,
K(ρ) = Σ_{a=1}^N Ea ρ Ea^*,  Σ_{a=1}^N Ea^* Ea = 1

We claim that K can be represented as

K(ρ) = T r2 (U (ρ ⊗ |u >< u|)U ∗ )

where ρ acts in H1 , |u >∈ H2 and U is a unitary operator in H1 ⊗ H2 . To see


this, we define
V^* = [E1^*, E2^*, ..., EN^*]
Then,
V ∗ V = Ip
In other words, the columns of V are orthonormal. We assume that

ρ ∈ Cp×p

so
V ∈ CN p×p
Note that
V ρV ∗ = ((Ea ρEb∗ ))1≤a,b≤N
as an N p × N p block structured matrix, each block being of size p × p. Now
define
|u >= [1, 0, ..., 0]T ∈ CN
Then,
|u >< u| ⊗ ρ = [[ρ, 0], [0, 0]] ∈ C^{Np×Np}
Since the columns of V are orthonormal, we can add columns to it to make it a square Np × Np unitary matrix U:

U = [V, W ] ∈ CN p×N p , U ∗ U = IN p

Then,
U(|u >< u| ⊗ ρ)U^* = [V, W]·[[ρ, 0], [0, 0]]·[V, W]^* = VρV^*
from which it follows that

Tr1(U(|u >< u| ⊗ ρ)U^*) = Tr1(VρV^*) = Σ_a Ea ρ Ea^*
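Remark (numerical sketch): the construction above can be carried out explicitly. The sketch below builds V by stacking (normalized) Kraus operators, completes its columns to a unitary U, and checks that tracing out the N-dimensional factor of U(|u >< u| ⊗ ρ)U^* reproduces Σ_a Ea ρ Ea^*. The dimensions, the random Kraus operators and the ordering convention C^N ⊗ C^p with |u> = e1 are illustrative assumptions; numpy and scipy are assumed available.

import numpy as np
from scipy.linalg import sqrtm, null_space

rng = np.random.default_rng(6)
p, N = 2, 3
G = [rng.normal(size=(p, p)) + 1j*rng.normal(size=(p, p)) for _ in range(N)]
S = sum(g.conj().T @ g for g in G)
E = [g @ np.linalg.inv(sqrtm(S)) for g in G]            # Kraus operators with sum_a E_a^* E_a = I

V = np.vstack(E)                                        # Np x p, orthonormal columns
W = null_space(V.conj().T)                              # Np x (Np - p), completes V to a unitary
U = np.hstack([V, W])                                   # U^* U = I_{Np}

rho = rng.normal(size=(p, p)) + 1j*rng.normal(size=(p, p))
rho = rho @ rho.conj().T; rho /= np.trace(rho)

big = np.zeros((N*p, N*p), dtype=complex)
big[:p, :p] = rho                                       # |u><u| (x) rho with |u> = e_1 in C^N
out = U @ big @ U.conj().T                              # equals V rho V^*
traced = sum(out[a*p:(a+1)*p, a*p:(a+1)*p] for a in range(N))   # partial trace over C^N
print(np.linalg.norm(traced - sum(Ea @ rho @ Ea.conj().T for Ea in E)))   # ~ 0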

Remark: Consider the block structured matrix

X = ((Bij ))1≤i,j≤N

where each Bij is a p × p matrix. Then


L
T r1 (X) = Bii ∈ Cp×p
i

and
T r2 (X) = ((T r(Bij )))1≤i,j≤N ∈ CN ×N

6.5 RLS lattice algorithm, statistical performance analysis
The RLS lattice algorithm involves updating the forward and backward prediction parameters simultaneously in time and order. In order to carry out a statistical performance analysis, we first simply look at the block processing problem involving estimating the forward prediction filter parameter vector aN,p of order p based on data collected up to time N. Let
x(N) = [x(N), x(N−1), ..., x(0)]^T,  z^{-r}x(N) = [x(N−r), x(N−r−1), ..., x(0), 0, ..., 0]^T,  r = 1, 2, ...
where these vectors are all of size N + 1. Form the data matrix of order p at time N:
X_{N,p} = [z^{-1}x(N), ..., z^{-p}x(N)] ∈ R^{(N+1)×p}
Then the forward prediction filter coefficient estimate is
a_{N,p} = (X^T_{N,p}X_{N,p})^{-1}X^T_{N,p}x(N)

If x(n), n ∈ Z is a zero mean stationary Gaussian process with autocorrelation


R(τ ) = E(x(n + τ )x(n)), then the problem is to calculate the statistics of aN,p .
To this end, we observe that

N −1 (z −i x(N ))T (z −j x(N )) =

N
L
−1
N x(n − i)x(n − j) ≈ R(j − i)
n=max(i,j)

So we write

N −1 (z −i x(N ))T (z −j x(N )) = R(j − i) + δRN (j, i)



where now δRN (j, i) is a random variable that is a quadratic function of the
Gaussian process {x(n)} plus a constant scalar. Equivalently,

N −1 XTN,p XN,p = Rp + δRN,p

where
Rp = ((R(j − i)))1≤i,j≤p
is the statistical correlation matrix of size p and δRN,p is a random matrix.
Likewise,
N^{-1}(z^{-i}x(N))^T x(N) = N^{-1}Σ_{n=i}^N x(n − i)x(n) = R(i) + δRN(0, i)
Thus,
N^{-1}X^T_{N,p}x(N) = r(1 : p) + δrN(0, 1 : p)
where

r(1 : p) = [R(1), ..., R(p)]T , δrN (1 : p) = [δRN (0, 1), ..., δRN (0, p)]T
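Remark (numerical sketch): the block estimate a_{N,p} can be computed directly from a simulated realization and compared with the underlying model. The sketch below uses an AR(2) process with coefficients 0.6 and −0.3 (an arbitrary illustrative choice) and assumes numpy is available.

import numpy as np

rng = np.random.default_rng(8)
N, p = 5000, 2
x = np.zeros(N + 1)
for n in range(2, N + 1):                       # synthetic AR(2): x(n) = 0.6 x(n-1) - 0.3 x(n-2) + w(n)
    x[n] = 0.6*x[n-1] - 0.3*x[n-2] + rng.normal()

# Columns of the data matrix are the lagged vectors z^{-1}x(N), ..., z^{-p}x(N)
X = np.column_stack([np.concatenate([x[N-r::-1], np.zeros(r)]) for r in range(1, p + 1)])
target = x[N::-1]                               # x(N) = [x(N), x(N-1), ..., x(0)]^T
a = np.linalg.solve(X.T @ X, X.T @ target)      # forward prediction coefficients a_{N,p}
print(a)                                        # close to [0.6, -0.3]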

6.6 Approximate computation of the matrix elements of the principal and discrete series for G = SL(2, R)
These matrix elements fmn(x, µ) for µ non-integral are the irreducible principal series matrix elements, and fmm(x, m) = fm(x) for m integral generate the discrete series Dm-modules when acted upon by U(g), the universal enveloping algebra of G. These matrix elements satisfy the equations

fmn (u(θ1 )xu(θ2 ), µ) = exp(i(mθ1 + nθ2 ))fmn (x, µ),

Ωfmn (x, µ) = −µ2 fmn (x, µ)


the second of which, in view of the first and the expression of the Casimir opera­
tor Ω in terms of t, θ1 , θ2 in the singular value decomposition x = u(θ1 )a(t)u(θ2 )
simplifies to an equation of the form

d2 fmn (t, µ)/dt2 − qmn (t)fmn (t, µ) + µ2 fmn (t, µ) = 0

where fmn (t, µ) = fmn (a(t), µ). This determines the matrix elements of the
principal series. For the discrete series matrix elements, we observe that these
are
φkmn (x) =< πk (x)em |en >

where πk is the reducible principal series for integral k. Replacing k by −k


where k = 1, 2, ..., the discrete series matrix elements are

φ−kmn (x) =< π−k (x)em |en >, m, n = −k − 1, −k − 2, ...

with
π_{−k}(X)e_{−k−1} = 0
Note that, using the method of lowering operators, we can write

π−k (Y )r e−k−1 = c(k, r)e−k−1−r , r = 0, 1, 2, ...

where c(k, r) are some real constants. Thus, writing

fk+1 (x) =< π−k (x)e−k−1 |e−k−1 >, x ∈ G,

we have for the discrete series matrix elements,

φ−k,−k−1−r,−k−1−s (x) =< π−k (xY r )e−k−1 |π−k (Y s )e−k−1 >

=< π−k (Y ∗s xY r )e−k−1 |e−k−1 >= fk+1 (Y ∗s ; x; Y r )


In this way, all the discrete series matrix elements can be determined by applying
differential operators on fk+1 (x).
Now given an image field h(x) on SL(2, R), in order to extract invariant
features, or to estimate the group transformation element parameter from a
noisy version of it, we must first evaluate its matrix elements w.r.t the discrete
and principal series, ie,
� �
h(x)φ−k,−k−1−r,−k−1−s (x)dx, h(x)fmn (x, µ)dx
G G

Now,
� �
h(x)φ−k,−k−1−r,−k−1−s (x)dx = h(x)fk+1 (Y ∗s ; x; Y r )dx
G G

= h(Y s ; x; Y ∗r )fk+1 (x)dx
G

6.7 Proof of the converse part of the Cq coding theorem
Let φ(i), i = 1, 2, ..., N be a code. Note that the size of the code is N . Let s ≤ 0
and define
ps = argmaxp Is (p, W )

where
Is(p, W) = min_σ Ds(p ⊗ W|p ⊗ σ)
where
p ⊗ W = diag[p(x)W(x), x ∈ A],  p ⊗ σ = diag[p(x)σ, x ∈ A]
Note that
Ds(p ⊗ W|p ⊗ σ) = log(Tr((p ⊗ W)^{1−s}(p ⊗ σ)^s))/(−s)
= (1/−s)log(Σ_x Tr((p(x)W(x))^{1−s}(p(x)σ)^s))
= (1/−s)log(Σ_x p(x)Tr(W(x)^{1−s}σ^s))
so that
lim_{s→0} Ds(p ⊗ W|p ⊗ σ) = Σ_x p(x)Tr(W(x)log(W(x))) − Tr(Wp·log(σ))
where
Wp = Σ_x p(x)Wx
Note that this result can also be expressed as
lim_{s→0} Ds(p ⊗ W|p ⊗ σ) = Σ_x p(x)D(W(x)|σ)
x

We define

f (t) = T r(tWx1−s + (1 − t) ps (y)Wy1−s )1/(1−s) , t ≥ 0
y

Note that �
Is (p, W ) = (T r( p(y)Wy1−s )1/(1−s) )1−s
y

and since by definition of ps , we have

Is (ps , W ) ≥ Is (p, W )∀p,

it follows that
f(t) ≤ f(0)  ∀t ≥ 0
Thus
f'(0) ≤ 0
This gives
Tr[(Wx^{1−s} − Σ_y ps(y)Wy^{1−s})·(Σ_y ps(y)Wy^{1−s})^{s/(1−s)}] ≤ 0

or equivalently,
Tr(Wx^{1−s}·(Σ_y ps(y)Wy^{1−s})^{s/(1−s)}) ≤ Tr((Σ_y ps(y)Wy^{1−s})^{1/(1−s)})
Now define the state
σs = (Σ_y ps(y)Wy^{1−s})^{1/(1−s)}/Tr((Σ_y ps(y)Wy^{1−s})^{1/(1−s)})
Then we can express the above inequality as
Tr(Wx^{1−s}σs^s) ≤ [Tr((Σ_y ps(y)Wy^{1−s})^{1/(1−s)})]^{1−s}

Remark: Recall that



Ds (p ⊗ W |p ⊗ σ] = (1/ − s)log p(x)T r(Wx1−s σ s )
x

= (1/ − s)log(T r(Aσ s )


where �
A= p(x)Wx1−s
x

Note that we are throughout assuming s ≤ 0. Application of the reverse Hölder inequality gives, for all states σ,
Tr(Aσ^s) ≥ (Tr(A^{1/(1−s)}))^{1−s}·(Tr(σ))^s = (Tr(A^{1/(1−s)}))^{1−s}
with equality iff σ is proportional to A^{1/(1−s)}, ie, iff
σ = A^{1/(1−s)}/Tr(A^{1/(1−s)})
Thus,
min_σ Ds(p ⊗ W|p ⊗ σ) = (−1/s)log([Tr((Σ_x p(x)Wx^{1−s})^{1/(1−s)})]^{1−s})
The maximum of this over all p is attained when p = ps, and we denote the corresponding value of σ by σs. Thus,
σs = (Σ_x ps(x)Wx^{1−s})^{1/(1−s)}/Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)})
Now let φ(i), i = 1, 2, ..., N be a code with φ(i) ∈ Aⁿ. Let Yi, i = 1, 2, ..., N be detection operators, ie, Yi ≥ 0, Σ_i Yi = 1. Let ε(φ) be the error probability corresponding to this code and detection operators. Then
1 − ε(φ) = N^{-1}Σ_{i=1}^N Tr(W_{φ(i)}Yi)

= N −1 T r(S(φ)T )
where

S(φ) = N −1 diag[Wφ(i) , i = 1, 2, ..., N ], T = diag[Yi , i = 1, 2, ..., N ]

Also define for any state σ, the state

S(σ) = N −1 diag[σn , ..., σn ] = N −1 .IN ⊗ σn

We observe that
Tr(S(σn)T) = N^{-1}Σ_i Tr(σn Yi) = Tr(σn·N^{-1}Σ_{i=1}^N Yi) = N^{-1}Tr(σn) = 1/N
where σn = σ ⊗n . Then for s ≤ 0 we have by monotonicity of quantum Renyi
entropy, (The relative entropy between the pinched states cannot exceed the
relative entropy between the original states)

N −s (1 − �(φ))1−s = (T r(S(φ)T ))1−s (T r(S(σns )T ))s

≤ (T r(S(φ)T ))1−s (T r(S(σns )T ))s + (T r(S(φ)(1 − T )))1−s (T r(S(σns )(1 − T )))s


≤ T r(S(φ)1−s S(σns )s )
Note that the notation σns = σs^{⊗n} is being used. Now, we have
Tr(S(φ)^{1−s}S(σns)^s) = N^{-1}Σ_i Tr(W_{φ(i)}^{1−s}σns^s)
and
log(Tr(W_{φ(i)}^{1−s}σns^s)) = Σ_{l=1}^n log(Tr(W_{φl(i)}^{1−s}σs^s))
= n·n^{-1}Σ_{l=1}^n log(Tr(W_{φl(i)}^{1−s}σs^s))
≤ n·log(n^{-1}Σ_{l=1}^n Tr(W_{φl(i)}^{1−s}σs^s))
≤ n·log([Tr((Σ_y ps(y)Wy^{1−s})^{1/(1−s)})]^{1−s})
= n(1 − s)·log(Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)}))

Thus,
Tr(S(φ)^{1−s}S(σns)^s) ≤ (Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)}))^{n(1−s)}

and hence
N^{-s}(1 − ε(φ))^{1−s} ≤ (Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)}))^{n(1−s)}
or equivalently,
1 − ε(φ) ≤ exp((s/(1 − s))log(N) + n·log(Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)})))
= exp(ns((1 − s)^{-1}log(N)/n + s^{-1}log(Tr((Σ_x ps(x)Wx^{1−s})^{1/(1−s)})))) − − − (a)

where s ≤ 0. Now, for any probability distribution p(x) on A, we have
d/ds Tr[(Σ_x p(x)Wx^{1−s})^{1/(1−s)}] =
(1/(1 − s)²)Tr[(Σ_x p(x)Wx^{1−s})^{1/(1−s)}·log(Σ_x p(x)Wx^{1−s})]
− (1/(1 − s))Tr[(Σ_x p(x)Wx^{1−s}log(Wx))·(Σ_x p(x)Wx^{1−s})^{s/(1−s)}]
and as s → 0, this converges to
d/ds Tr[(Σ_x p(x)Wx^{1−s})^{1/(1−s)}]|_{s=0} =
Σ_x p(x)Tr(Wx·(log(Wp) − log(Wx))) = −Σ_x p(x)D(Wx|Wp)
= −(H(Wp) − Σ_x p(x)H(Wx)) = −(H(Y) − H(Y|X)) = −Ip(X, Y)
where
Wp = Σ_x p(x)Wx

Write N = N (n) and define

R = limsupn log(N (n))/n, φ = φn

Also define

Is (X, Y ) = −(1/s(1 − s))log(T r( ps (x)Wx1−s )1/(1−s) ))
x

Then, we’ve just seen that

lims→0 Is (X, Y ) = I0 (X, Y ) ≤ I(X, Y )



where I0(X, Y) = I_{p0}(X, Y) and I(X, Y) = sup_p Ip(X, Y), and
1 − ε(φn) ≤ exp((ns/(1 − s))(log(N(n))/n − Is(X, Y)))
It thus follows that if
R > I(X, Y)
then, by taking s < 0 and s sufficiently close to zero, we would have
R > Is(X, Y)
and then we would have
limsup_n(1 − ε(φn)) ≤ limsup_n exp((ns/(1 − s))(R − Is(X, Y))) = 0
or equivalently,
liminf_n ε(φn) = 1
This proves the converse of the Cq coding theorem, namely, that if the rate of information transmission via any coding scheme is more than the Cq capacity I(X, Y), then the asymptotic probability of error in decoding using any sequence of decoders will always converge to unity.

6.8 Proof of the direct part of the Cq coding theorem
Our proof is based on Shannon’s random coding method. Consider the same
scenario as above. Define

πi = {Wφ(i) ≥ 2N Wpn }, i = 1, 2, ..., N

where N = N (n), φ(i) ∈ An with φ(i), i = 1, 2, ..., N being iid with probability
distribution pn = p⊗n on An . Define the detection operators
Y(φ(i)) = (Σ_{j=1}^N π_j)^{−1/2}·π_i·(Σ_{j=1}^N π_j)^{−1/2}, i = 1, 2, ..., N

Then 0 ≤ Y_i ≤ 1 and Σ_{i=1}^N Y_i = 1. Write

S = π_i, T = Σ_{j≠i} π_j

Then,

Y(φ(i)) = (S + T)^{−1/2}S(S + T)^{−1/2}, 0 ≤ S ≤ 1, T ≥ 0

and hence we can use the inequality

1 − (S + T)^{−1/2}S(S + T)^{−1/2} ≤ 2 − 2S + 4T



Then,

1 − Y(φ(i)) ≤ 2 − 2π_i + 4 Σ_{j:j≠i} π_j

and hence

ε(φ) = N^{−1} Σ_{i=1}^N Tr(W_{φ(i)}(1 − Y(φ(i))))

≤ N^{−1} Σ_{i=1}^N Tr(W(φ(i))(2 − 2π_i)) + 4N^{−1} Σ_i Σ_{j≠i} Tr(W(φ(i))π_j)

Note that since for i ≠ j, φ(i) and φ(j) are independent random variables in A^n, it follows that W(φ(i)) and π_j are also independent operator valued random variables. Thus,

E(ε(φ)) ≤ (2/N) Σ_{i=1}^N E(Tr(W(φ(i)){W(φ(i)) ≤ 2N W_{p_n}}))

+(4/N) Σ_{i≠j, i,j=1,2,...,N} Tr(E(W(φ(i)))·E{W(φ(j)) > 2N W_{p_n}})

= 2 Σ_{x_n∈A^n} p_n(x_n)Tr(W(x_n){W(x_n) ≤ 2N·W(p_n)})

+4(N − 1) Σ_{x_n,y_n∈A^n} p_n(x_n)p_n(y_n)Tr(W(x_n){W(y_n) > 2N W(p_n)})

= 2 Σ_{x_n} p_n(x_n)Tr(W(x_n){W(x_n) ≤ 2N·W(p_n)})

+4(N − 1) Σ_{y_n∈A^n} p_n(y_n)Tr(W(p_n){W(y_n) > 2N W(p_n)})

≤ 2 Σ_{x_n} p_n(x_n)Tr(W(x_n)^{1−s}(2N W(p_n))^s)

+4N Σ_{y_n} p_n(y_n)Tr(W(y_n)^{1−s}W(p_n)^s(2N)^{s−1})

for 0 ≤ s ≤ 1. This gives

E(ε(φ_n)) ≤ 2^{s+2} N^s Σ_{x_n∈A^n} p_n(x_n)Tr(W(x_n)^{1−s}W(p_n)^s)

= 2^{s+2} N^s [Σ_{x∈A} p(x)Tr(W(x)^{1−s}W(p)^s)]^n

= 2^{s+2}·exp(sn(log(N(n))/n + s^{−1}·log(Σ_x p(x)Tr(W(x)^{1−s}W(p)^s))))

6.9 Induced representations


Let G be a group and H a subgroup. Let σ be a unitary representation of H in
a Hilbert space V . Let F be the space of all f : G → V for which

f(gh) = σ(h)^{−1}f(g), h ∈ H, g ∈ G

Then define

U (g)f (x) = ρ(g, g −1 xH)f (g −1 x), g, x ∈ G, f ∈ F

where
ρ(g, x) = (dµ(xH)/dµ(gxH))1/2 , g, x ∈ G
with µ a measure on G/H. U is called the representation of G induced by the
representation σ of H. We define

‖f‖^2 = ∫_G |f(g)|^2 dµ(g)

Then, unitarity of U(g) is easy to see:

∫_{G/H} ‖U(g)f(x)‖^2 dµ(x) =

∫_{G/H} (ρ(g, g^{−1}xH))^2 |f(g^{−1}x)|^2 dµ(xH) =

∫_{G/H} (dµ(g^{−1}xH)/dµ(xH))|f(g^{−1}x)|^2 dµ(xH) = ∫ |f(g^{−1}x)|^2 dµ(g^{−1}xH)

= ∫_{G/H} |f(x)|^2 dµ(xH)

proving the claim. Note that

(ρ(g_1, g_2xH)ρ(g_2, xH))^2 = (dµ(g_2xH)/dµ(g_1g_2xH))(dµ(xH)/dµ(g_2xH)) = dµ(xH)/dµ(g_1g_2xH) = ρ(g_1g_2, xH)^2

ie

ρ(g_1g_2, xH) = ρ(g_1, g_2xH)ρ(g_2, xH), g_1, g_2, x ∈ G

ie, ρ is a scalar cocycle.

Remark: |f(x)|^2 can be regarded as a function on G/H since it assumes the same value on xH for any x ∈ G. Indeed, for x ∈ G, h ∈ H, |f(xh)|^2 = |σ(h)^{−1}f(x)|^2 = |f(x)|^2 since σ(h) is a unitary operator.

There is yet another way to describe the induced representation, ie, in a way
that yields a unitary representation of G that is equivalent to U . The method
is as follows: Let X = G/H and let F1 be the space of all square integrable

functions f : X → V w.r.t the measure µX induced by µ on X. Let γ be any


cocycle of (G, X) with values in the space of unitary operators on V , ie,

γ : G × X → U(V )

satisfies
γ(g1 g2 , x) = γ(g1 , g2 x)γ(g2 , x), g1 , g2 ∈ G, x ∈ X
Then define a representation U1 of G in F1 by

U1 (g)f (x) = ρ(g, g −1 x)γ(g, g −1 x)f (g −1 x), g ∈ G, x ∈ X, f ∈ F1

Let s : X → G be a section of X = G/H such that s(H) = e. This means that for any x ∈ X, s(x)H = x, ie, s(x) belongs to the coset x. Define

b(g, x) = s(gx)−1 gs(x) ∈ G, g ∈ G, x ∈ X

Then,
b(g, x)H = s(gx)−1 gx = H
because
gs(x)H = gx, s(gx)H = gx
Now observe that

b(g1 , g2 x)b(g2 , x) = s(g1 g2 x)−1 g1 s(g2 x)s(g2 x)−1 g2 s(x) = s(g1 g2 x)−1 g1 g2 s(x)

= b(g1 g2 , x), g1 , g2 ∈ G, x ∈ X
implying that γ = σ(b) is a (G, X, V ) cocycle, ie,

γ(g1 g2 , x) = σ(b(g1 g2 , x)) = σ(b(g1 , g2 x)b(g2 , x))

= σ(b(g1 , g2 x))σ(b(g2 , x)) = γ(g1 , g2 x)γ(g2 , x)


with γ assuming values in the set of unitary operators in V . Now let

f (g) = γ(g, H)−1 f1 (gH) = T (f1 )(g) ∈ V, f1 ∈ F1 , g ∈ G

Then
U (g)T (f1 )(g1 ) = U (g)f (g1 ) = ρ(g, g −1 g1 H)f (g −1 g1 )
= ρ(g, g −1 g1 H)γ(g −1 g1 , H)−1 f1 (g −1 g1 H)
= ρ(g, g −1 g1 H)(γ(g −1 , g1 H)γ(g1 , H))−1 f1 (g −1 g1 H)
On the other hand,

T (U1 (g)f1 )(g1 ) = γ(g1 , H)−1 U1 (g)f1 (g1 H)

= γ(g1 , H)−1 ρ(g, g −1 g1 H)γ(g, g −1 g1 H)f1 (g −1 g1 H)


= ρ(g, g −1 g1 H)γ(g1 , H)−1 γ(g, g −1 g1 H)f1 (g −1 g1 H)

Now,
γ(g1 , H) = γ(gg −1 g1 , H) = γ(g, g −1 g1 H)γ(g −1 g1 , H)
or equivalently,

γ(g1 , H)−1 γ(g, g −1 g1 H) = γ(g −1 g1 , H)−1

Thus,
T (U1 (g)f1 )(g1 ) = ρ(g, g −1 g1 H)γ(g −1 g1 , H)−1 f1 (g −1 g1 H)
= ρ(g, g −1 g1 H)T (f1 )(g −1 g1 ) = U (g)T (f1 )(g1 )
In other words,
T U1 (g) = U (g)T
Thus, U1 and U are equivalent unitary representations.
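As a concrete finite illustration (not from the text), the following Python sketch verifies the cocycle identity b(g_1g_2, x) = b(g_1, g_2x)b(g_2, x) for b(g, x) = s(gx)^{−1}gs(x), taking G = S_3, H a two-element subgroup and an arbitrary section s with s(H) = e; all of these choices are illustrative.

# Minimal sketch: cocycle identity for b(g, x) = s(gx)^{-1} g s(x) on G/H with G = S3.
from itertools import permutations

def compose(g1, g2):              # (g1 g2)(i) = g1(g2(i)); permutations stored as tuples
    return tuple(g1[g2[i]] for i in range(len(g2)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (1, 0, 2)]                         # a subgroup of order 2 (illustrative choice)

def coset(g):                              # the left coset gH as a frozenset
    return frozenset(compose(g, h) for h in H)

X = list({coset(g) for g in G})            # X = G/H, three cosets
s = {x: sorted(x)[0] for x in X}           # a section s: X -> G with s(H) = e

def act(g, x):                             # g.(aH) = (ga)H
    return coset(compose(g, next(iter(x))))

def b(g, x):
    return compose(inverse(s[act(g, x)]), compose(g, s[x]))

# b(g,x) must lie in H, and the cocycle identity must hold for all g1, g2, x
assert all(b(g, x) in H for g in G for x in X)
assert all(b(compose(g1, g2), x) == compose(b(g1, act(g2, x)), b(g2, x))
           for g1 in G for g2 in G for x in X)
print("cocycle identity verified on S3 / H")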

6.10 Estimating non-uniform transmission line parameters and line voltage and current using the extended Kalman filter
Abstract: The distributed parameters of a non­uniform transmission line are
assumed to depend on a finite set of unknown parameters, for example their
spatial Fourier series coefficients. The line equations are setup taking random
voltage and current line loading into account with the random loading being
assumed to be white Gaussian noise in the time and spatial variables. By pass­
ing over to the spatial Fourier series domain, these line stochastic differential
equations are expressed as a truncated sequence of coupled stochastic differen­
tial equations for the spatial Fourier series components of the line voltage and
current. The coefficients that appear in these stochastic differential equations
are the spatial Fourier series coefficients of the distributed line parameters and
these coefficients in turn are linear functions of the unknown parameters. By
separating out these complex stochastic differential equations into their real and
imaginary part, the line equations become a finite set of sde’s for the extended
state defined to be the real and imaginary components of the line voltage and
current as well as the parameters on which the distributed parameters depend.
The measurement model is assumed to be white Gaussian noise corrupted ver­
sions of the line voltage and current sampled at a discrete set of spatial points
on the line and this measurement model can be expressed as a linear transfor­
mation of the state variables, ie, of the spatial Fourier series components of the
line voltage and current plus a white measurement noise vector. The extended
Kalman filter equations are derived for this state and measurement model and
our real time line voltage and current and distributed parameter estimates show
excellent agreement with the true values based on MATLAB simulations.
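The MATLAB simulations referred to above are not reproduced here; the following Python fragment is only a generic extended Kalman filter predict/update step, with the state map f, measurement map h and their Jacobians F, H left as placeholders that would have to be supplied by the discretized spatial-Fourier line dynamics and the sampled voltage/current measurement map described in the abstract.

# Generic EKF step sketch for x_{k+1} = f(x_k) + w_k, z_k = h(x_k) + v_k (placeholders f, F, h, H).
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    # predict
    x_pred = f(x)
    Fk = F(x)
    P_pred = Fk @ P @ Fk.T + Q
    # update
    Hk = H(x_pred)
    S = Hk @ P_pred @ Hk.T + R
    K = P_pred @ Hk.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ Hk) @ P_pred
    return x_new, P_new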

6.11 Multipole expansion of the magnetic field produced by a time varying current distribution
Let J(t, r) be the current density. The magnetic vector potential generated by it is given by

A(t, r) = (µ/4π) ∫ J(t − |r − r'|/c, r')d^3r'/|r − r'|

= (µ/4π) Σ_{n≥0} (1/(c^n n!)) ∫ (∂_t^n J(t, r'))|r − r'|^{n−1} d^3r'

For |r| greater than the source dimension and with θ denoting the angle between the vectors r and r', we have the expansion

|r − r'|^α = (r^2 + r'^2 − 2r·r')^{α/2}

= r^α(1 + (r'/r)^2 − 2r·r'/r^2)^{α/2} = r^α(1 + (r'/r)^2 − 2(r'/r)cos(θ))^{α/2}

= r^α Σ_{m≥0} P_{m,α}(cos(θ))(r'/r)^m

where the generating function of P_{m,α}(x), m = 0, 1, 2, ... is given by (1 + x^2 − 2x·cos(θ))^{α/2}. Then,

A(t, r) = (µ/4π) Σ_{n,m≥0} (1/(c^n n!)) r^{n−1−m} ∫ ∂_t^n J(t, r') r'^m P_{m,n−1}(r·r'/rr') d^3r'

A useful approximation to this exact expansion is obtained by taking

|r − r'|^α ≈ (r^2 − 2r·r')^{α/2} ≈ r^α(1 − αr·r'/r^2) = r^α − α·r^{α−2} r·r'

Further,

∂_t^n J(t, r')(r·r') = (r' × ∂_t^n J(t, r')) × r + (∂_t^n J(t, r')·r)r'

In the special case when the source is a wire carrying a time varying current I(t), we can write

J(t, r')d^3r' = I(t)dr'

and then the above equation gets replaced by
Chapter 7

Some Aspects of Stochastic Processes

[1] In a single server queue, let the arrivals take place at the times S_n = Σ_{k=1}^n X_k, n ≥ 1, let the service time of the nth customer be Y_n and let his waiting time in the queue be W_n. Derive the recursive relation

W_{n+1} = max(0, W_n + Y_n − X_{n+1}), n ≥ 1

where W_1 = 0. If the X_n's are iid and the Y_n's are also iid and independent of {X_n}, then define T_n = Y_n − X_{n+1} and prove that W_{n+1} has the same distribution as that of max_{0≤k≤n} F_k where F_n = Σ_{k=1}^n T_k, n ≥ 1 and F_0 = 0.
When Xn , Yn are respectively exponentially distributed with parameters λ, µ
respectively explain the sequence of steps based on ladder epochs and ladder
heights by which you will calculate the distribution of the equilibrium waiting
time W∞ = limn→∞ Wn in the case when µ > λ. What happens when µ ≤ λ ?

Solution: The (n + 1)th customer arrives at the time Sn+1 . The previous, ie,
the nth customer departs at time Sn + Wn + Yn . If this time of his departure is
smaller than Sn+1 , (ie, Sn + Wn + Yn − Sn+1 = Wn + Yn − Xn+1 < 0), then the
(n + 1)th customer finds the queue empty on arrival and hence his waiting time
Wn+1 = 0. If on the other hand, the time of departure Sn + Wn + Yn of the nth
customer is greater than Sn+1 , the arrival time of the (n + 1)th customer, then
the n + 1th customer has to wait for a time
Wn+1 = (Sn + Wn + Yn ) − Sn+1 = Wn + Yn − Xn+1
before his service commences. Therefore, we have the recursion relation
Wn+1 = max(0, Wn + Yn − Xn+1 ), n ≥ 0


Note that W_1 = 0. This can be expressed as

W_{n+1} = max(0, W_n + T_n), T_n = Y_n − X_{n+1}

By iteration, we find

W_2 = max(0, T_1), W_3 = max(0, T_2 + max(0, T_1))

= max(0, max(T_2, T_1 + T_2)) = max(0, T_2, T_1 + T_2)

W_4 = max(0, W_3 + T_3) = max(0, T_3 + max(0, max(T_2, T_1 + T_2)))

= max(0, max(T_3, max(T_2 + T_3, T_1 + T_2 + T_3)))

= max(0, T_3, T_2 + T_3, T_1 + T_2 + T_3)

Continuing, we find by induction that

W_{n+1} = max(0, Σ_{k=r}^n T_k, r = 1, 2, ..., n)

Now the T_k's are iid. Therefore, W_{n+1} has the same distribution as

max(0, Σ_{k=r}^n T_{n+1−k}, r = 1, 2, ..., n)

= max(0, Σ_{k=1}^r T_k, r = 1, 2, ..., n)

In particular, the equilibrium distribution of the waiting time, ie of W_∞ = lim_{n→∞} W_n, is the same as that of

max(0, Σ_{k=1}^r T_k, r = 1, 2, ...)

provided that this limiting r.v. exists, ie, provided that the maximum over r is finite w.p.1. Let F_n = Σ_{k=1}^n T_k, n ≥ 1, F_0 = 0.

Let τ_1 denote the first time k ≥ 1 at which F_k > F_0 = 0. For n ≥ 1, let τ_{n+1} denote the first time k ≥ 1 at which F_k > F_{τ_n}, ie, the (n + 1)th time k at which F_k exceeds all of F_j, j = 0, 1, ..., k − 1. Then, it is clear that W_∞ has the same distribution as that of lim F_{τ_n}, provided that τ_n exists for all n. In the general case, we have

P(W_∞ ≤ x) = lim_{n→∞} P(F_{τ_n} ≤ x)

where if for a given ω, τ_N(ω) exists for some positive integer N = N(ω) but τ_{N+1}(ω) does not exist, then we set τ_k(ω) = τ_N(ω) for all k > N. It is also clear, using stopping time theory, that the r.v's U_{n+1} = F_{τ_{n+1}} − F_{τ_n}, n = 1, 2, ... are iid and non-negative and we can write

W_∞ = Σ_{n=0}^∞ U_n, U_0 = 0

Thus, the distribution of W_∞ is formally the limit of F_U^{*n} where F_U is the distribution of U_1 = F_{τ_1} with τ_1 = min(k ≥ 1 : F_k > 0). Note also that for x > 0,

1 − F_U(x) = P(U > x) = Σ_{n=0}^∞ P(max_{k≤n} F_k ≤ x < F_{n+1})
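The recursion W_{n+1} = max(0, W_n + Y_n − X_{n+1}) is easy to simulate; the sketch below (with illustrative rates only) compares the empirical tail of the late waiting times with the standard M/M/1 formula P(W_∞ > x) = (λ/µ)exp(−(µ − λ)x) for the case µ > λ.

# Monte Carlo sketch of the Lindley recursion for exponential interarrival/service times.
import numpy as np

rng = np.random.default_rng(0)
lam, mu, n_customers = 1.0, 1.5, 200_000          # illustrative rates, mu > lam
X = rng.exponential(1.0 / lam, n_customers + 1)   # interarrival times X_2, X_3, ...
Y = rng.exponential(1.0 / mu, n_customers)        # service times Y_1, Y_2, ...

W = np.zeros(n_customers)
for n in range(n_customers - 1):
    W[n + 1] = max(0.0, W[n] + Y[n] - X[n + 1])

x = 2.0
print("empirical P(W > 2):", np.mean(W[n_customers // 2:] > x))
print("M/M/1 formula      :", (lam / mu) * np.exp(-(mu - lam) * x))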

[2] Consider a priority queue with two kinds of customers labeled C1 , C2 and
a single server. Customers of type C1 arrive at the rate λ1 and customers of type
C2 arrive at the rate λ2 . A customer of type C1 is served at the rate µ1 while a
customer of type C2 is served at the rate µ2 . If a customer of type C1 is present
in the queue, then his service takes immediate priority over a customer of type
C2 , otherwise the queue runs on a first come first served basis. Let X1 (t) denote
the number of type C1 customers and X2 (t) the number of type C2 customers
in the queue at time t. Set up the Chapman­Kolmogorov differential equations
for
P (t, n1 , n2 ) = P r(X1 (t) = n1 , X2 (t) = n2 ), n1 , n2 = 0, 1, ...
and also set up the equilibrium equations for

P (n1 , n2 ) = limt→∞ P (t, n1 , n2 )

Solution: Let θ(n) = 1 if n ≥ 0 and θ(n) = 0 if n < 0, and let δ(n, m) = 1 if n = m, else δ(n, m) = 0.

P (t+dt, n1 , n2 ) = P (t, n1 −1, n2 )λ1 dt.θ(n1 −1)+P (t, n1 , n2 −1)λ2 dt.θ(n2 −1)+

P (t, n1 + 1, n2 )µ1 dt + P (t, n1 , n2 + 1)µ2 .dt.δ(n1 , 0)


+P (t, n1 , n2 )(1 − (λ1 + λ2 )dt − µ1 .dt.θ(n1 − 1) − µ2 dt.θ(n2 − 1).δ(n1 , 0))
Essentially, the term δ(n1 , 0) has been inserted to guarantee that the C2 type of
customer is currently being served only when there is no C1 type of customer in
the queue and the term θ(n2 − 1) term has been inserted to guarantee that the
C2 type of customer can depart from the queue only if there is at least one such
in the queue. The product δ(n1 , 0)θ(n2 − 1) term corresponds to the situation
in which there is no C1 type of customer and at least one C2 type of customer
which means that a C2 type of customer is currently being served and hence
will depart immediately after his service is completed. From these equations
follow the Chapman­Kolmogorov equations:

∂_t P(t, n_1, n_2) =

P(t, n_1 − 1, n_2)λ_1·θ(n_1 − 1) + P(t, n_1, n_2 − 1)λ_2·θ(n_2 − 1)

+P(t, n_1 + 1, n_2)µ_1 + P(t, n_1, n_2 + 1)µ_2·δ(n_1, 0)

−P(t, n_1, n_2)(λ_1 + λ_2 + µ_1·θ(n_1 − 1) + µ_2·θ(n_2 − 1)·δ(n_1, 0))

The equilibrium equations are obtained by replacing P(t, n_1, n_2) with P(n_1, n_2) and setting the right hand side to zero, ie ∂_t P(t, n_1, n_2) = 0.
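A direct way to examine these equations numerically is to truncate the counts at some n_max and integrate the resulting finite ODE system; the sketch below does this in Python with illustrative rates (the truncation and the rate values are not part of the problem statement).

# Truncated Chapman-Kolmogorov ODEs for the priority queue; mass leaks slightly at the cap.
import numpy as np
from scipy.integrate import solve_ivp

lam1, lam2, mu1, mu2, nmax = 0.3, 0.2, 1.0, 0.8, 15     # illustrative values

def rhs(t, Pvec):
    P = Pvec.reshape(nmax + 1, nmax + 1)
    dP = np.zeros_like(P)
    for n1 in range(nmax + 1):
        for n2 in range(nmax + 1):
            inflow = 0.0
            if n1 >= 1: inflow += lam1 * P[n1 - 1, n2]
            if n2 >= 1: inflow += lam2 * P[n1, n2 - 1]
            if n1 + 1 <= nmax: inflow += mu1 * P[n1 + 1, n2]
            if n2 + 1 <= nmax and n1 == 0: inflow += mu2 * P[n1, n2 + 1]
            outflow = (lam1 + lam2
                       + mu1 * (n1 >= 1)
                       + mu2 * (n2 >= 1) * (n1 == 0))
            dP[n1, n2] = inflow - outflow * P[n1, n2]
    return dP.ravel()

P0 = np.zeros((nmax + 1, nmax + 1)); P0[0, 0] = 1.0
sol = solve_ivp(rhs, (0.0, 50.0), P0.ravel(), rtol=1e-8)
print("total mass at t=50 (close to 1 if the truncation is adequate):", sol.y[:, -1].sum())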

[3] Write short notes on any three of the following:


[a] Construction of Brownian motion on [0, 1] from the Haar wavelet func­
tions including the proof of almost sure continuity of the sample paths.
Solution: Define the Haar wavelet functions as follows: H_{n,k}(t) = 2^{(n−1)/2} for (k − 1)/2^n ≤ t ≤ k/2^n and H_{n,k}(t) = −2^{(n−1)/2} for k/2^n < t ≤ (k + 1)/2^n, whenever k is in the set I(n) = {k : k odd, k = 0, 1, ..., 2^n − 1}, for n = 1, 2, ..., and put H_0(t) = 1. Here t ∈ [0, 1]. It is then easily verified that the functions H_{nk}, k ∈ I(n), n ≥ 1, together with H_{00}, form an orthonormal basis for L^2[0, 1]. We set I(0) = {0}.

Define the Schauder functions

S_{nk}(t) = ∫_0^t H_{nk}(s)ds

Then, from basic properties of orthonormal bases, we have

Σ_{n,k} H_{nk}(t)H_{nk}(s) = δ(t − s)

and hence

Σ_{n,k} S_{nk}(t)S_{nk}(s) = min(t, s)

Let ξ(n, k), k ∈ I(n), n ≥ 1, together with ξ(0, 0), be iid N(0, 1) random variables and define the random functions

B_N(t) = Σ_{n=0}^N Σ_{k∈I(n)} ξ(n, k)S_{nk}(t), N ≥ 1

Then since the S_{nk}(t) are continuous functions, B_N(t) is also continuous for all N. Further, the S_{nk}'s for a fixed n are non-overlapping functions and

max_t |S_{nk}(t)| = 2^{−(n+1)/2}

Thus,

max_t Σ_{k∈I(n)} |S_{nk}(t)| = 2^{−(n+1)/2}

Let

b(n) = max{|ξ(n, k)| : k ∈ I(n)}

Then, by the union bound,

P(b(n) > n) ≤ 2^n·P(|ξ(n, k)| > n) = 2^{n+1} ∫_n^∞ exp(−x^2/2)dx/√(2π)

≤ K·2^{n+1} n·exp(−n^2/2)

Thus,

Σ_{n≥1} P(b(n) > n) < ∞

which means that by the Borel-Cantelli lemma, for a.e. ω, there is a finite N(ω) such that for all n > N(ω), b(n, ω) < n. This means that for a.e. ω,

Σ_{n=1}^N Σ_{k∈I(n)} |ξ(n, k, ω)||S_{nk}(t)| ≤ Σ_{n=1}^N n·2^{−(n+1)/2}

which converges as N → ∞. This result implies that the sequence of continuous processes B_N(t), t ∈ [0, 1] converges uniformly to a continuous process B(t), t ∈ [0, 1] a.e. Note that continuity of each term and uniform convergence imply continuity of the limit. That B(t), t ∈ [0, 1] is a Gaussian process with zero mean and covariance min(t, s) can be seen from the joint characteristic function of the time samples of B_N(.). Alternately, B(.) being an infinite linear superposition of Gaussian r.v's must be a Gaussian process; its mean is zero since the ξ(n, k) have mean zero, and the covariance follows from

E(B_N(t)B_N(s)) = Σ_{n,m=0}^N Σ_{k∈I(n), r∈I(m)} E(ξ(n, k)ξ(m, r))S_{nk}(t)S_{mr}(s)

= Σ_{n,m=0}^N Σ_{k∈I(n), r∈I(m)} δ(n, m)δ(k, r)S_{nk}(t)S_{mr}(s)

= Σ_{n=0}^N Σ_{k∈I(n)} S_{nk}(t)S_{nk}(s) → Σ_{n=0}^∞ Σ_{k∈I(n)} S_{nk}(t)S_{nk}(s) = min(t, s)
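The construction above can be reproduced numerically; the following Python sketch builds partial sums B_N(t) from the Schauder tents S_{nk} and checks that Var B(1) is close to 1 (the truncation level, grid and number of sample paths are arbitrary choices).

# Levy-Ciesielski style construction: S_{nk} is a tent of height 2^{-(n+1)/2} on [(k-1)/2^n,(k+1)/2^n].
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 513)

def schauder(n, k, t):
    left, mid, right = (k - 1) / 2**n, k / 2**n, (k + 1) / 2**n
    up = np.clip((t - left) / (mid - left), 0, 1)
    down = np.clip((right - t) / (right - mid), 0, 1)
    return 2.0**(-(n + 1) / 2) * np.minimum(up, down)

def sample_path(N):
    B = rng.standard_normal() * t              # n = 0 term: S_00(t) = t
    for n in range(1, N + 1):
        for k in range(1, 2**n, 2):            # odd k in I(n)
            B = B + rng.standard_normal() * schauder(n, k, t)
    return B

paths = np.array([sample_path(8) for _ in range(1000)])
print("sample Var B(1) (should be close to 1):", paths[:, -1].var())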

[b] L2 ­construction of Brownian motion from the Karhunen­Loeve expansion.


The autocorrelation of B(t) is

R(t, s) = E(B(t)B(s)) = min(t, s)

Its eigenfunctions φ(t) satisfy

∫_0^1 R(t, s)φ(s)ds = λφ(t), t ∈ [0, 1]

Thus,

∫_0^t sφ(s)ds + t ∫_t^1 φ(s)ds = λφ(t)

Differentiating,

tφ(t) + ∫_t^1 φ(s)ds − tφ(t) = λφ'(t)

Again differentiating,

−φ(t) = λφ''(t)

Let λ = 1/ω^2. Then, the solution is

φ(t) = A·cos(ωt) + B·sin(ωt)

∫_0^t sφ(s)ds = A((t/ω)sin(ωt) + (cos(ωt) − 1)/ω^2) + B(−t·cos(ωt)/ω + sin(ωt)/ω^2)

∫_t^1 φ(s)ds = A(sin(ω) − sin(ωt))/ω + (B/ω)(cos(ωt) − cos(ω))

Substituting all these expressions into the original integral equation gives

A((t/ω)sin(ωt) + (cos(ωt) − 1)/ω^2) + B(−t·cos(ωt)/ω + sin(ωt)/ω^2)

+tA(sin(ω) − sin(ωt))/ω + (Bt/ω)(cos(ωt) − cos(ω))

= (A·cos(ωt) + B·sin(ωt))/ω^2

On making cancellations and comparing coefficients,

A·sin(ω) = 0, B·cos(ω) = 0, A/ω^2 = 0

Therefore,

A = 0, ω = (n + 1/2)π, n ∈ Z

and for normalization

∫_0^1 φ^2(t)dt = 1

we get

B^2 ∫_0^1 sin^2(ωt)dt = 1

or

(B^2/2)(1 − sin(2ω)/2ω) = 1

or

B = √2
Thus, the KL expansion for Brownian motion is given by

B(t) = Σ_{n∈Z} √(λ_n) ξ(n)φ_n(t)
= Σ_{n∈Z} ((n + 1/2)π)^{−1} ξ(n)·√2·sin((n + 1/2)πt)

where the ξ(n) are iid N(0, 1) r.v's.
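The expansion is straightforward to sample; the sketch below truncates the series at n_max terms and compares a sample covariance with min(t, s) (truncation level and sample sizes are illustrative).

# Truncated Karhunen-Loeve sum B(t) ~ sum_n sqrt(2) xi_n sin((n+1/2) pi t)/((n+1/2) pi).
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 201)
n_max, n_paths = 200, 4000
omega = (np.arange(n_max) + 0.5) * np.pi                 # (n + 1/2) pi
phi = np.sqrt(2.0) * np.sin(np.outer(t, omega)) / omega  # shape (len(t), n_max)
xi = rng.standard_normal((n_paths, n_max))
B = xi @ phi.T                                           # sample paths, shape (n_paths, len(t))

i, j = 60, 150
print("sample cov:", np.cov(B[:, i], B[:, j])[0, 1], "  min(t,s):", min(t[i], t[j]))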

[c] Reflection principle for Brownian motion with application to the evalua­
tion of the distribution of the first hitting time of the process at a given level.
[d] The Borel­Cantelli lemmas with application to proving almost sure con­
vergence of a sequence of random variables.
[e] The Feynman­Kac formula for solving Schrodinger’s equation using Brow­
nian motion with complex time.
[f] Solving Poisson's equation ∇^2 u(x) = s(x) in an open connected subset of R^n with the Dirichlet boundary conditions using Brownian motion stopping times.

[4] A bulb in a bulb holder is changed when it gets fused. Let Xn denote
the time for which the nth bulb lasts. If n is even, then Xn has the exponential
density λ.exp(−λ.x)u(x) while if n is odd then Xn has the exponential density
µ.exp(−µ.x)u(x). Calculate (a) the probability distribution of the number of
bulbs N (t) = max(n : Sn ≤ t) changed in [0, t] where Sn = X1 + ... + Xn , n ≥ 1
and S0 = 0. Define

δ(t) = t − SN (t) , γ(t) = SN (t)+1 − t

Pictorially illustrate and explain the meaning of the r.v’s δ(t), γ(t). What hap­
pens to the distributions of these r.v’s as t → ∞ ?

Solution: Let n ≥ 1.

P(N(t) ≥ n) = P(S_n ≤ t) = P(Σ_{1≤k≤n, k even} X_k + Σ_{1≤k≤n, k odd} X_k ≤ t)

Specifically,

P(N(t) ≥ 2n) = P(Σ_{k=1}^n X_{2k} + Σ_{k=1}^n X_{2k−1} ≤ t) = F^{*n} * G^{*n}(t)

where

F(t) = 1 − exp(−λt), G(t) = 1 − exp(−µt)

Note that with f(t) = F'(t), g(t) = G'(t), we have

f^{*n}(t) = λ^n t^{n−1} exp(−λt)/(n − 1)!,

g^{*n}(t) = µ^n t^{n−1} exp(−µt)/(n − 1)!


Equivalently, in terms of Laplace transforms,

L(f (t)) = λ/(λ + s), L(g(t)) = µ/(µ + s)



so

L(f^{*n}(t)) = λ^n/(λ + s)^n, L(g^{*n}(t)) = µ^n/(µ + s)^n

and hence

f^{*n}(t) * g^{*n}(t) = L^{−1}((λµ)^n/((λ + s)^n(µ + s)^n)) = ψ_n(t)

say. Then,

P(N(t) ≥ 2n) = ∫_0^t ψ_n(x)dx

Likewise,

P(N(t) ≥ 2n − 1) = P(S_{2n−1} ≤ t) = ∫_0^t φ_n(x)dx

where

φ_n(t) = g^{*n}(t) * f^{*(n−1)}(t) = L^{−1}(λ^{n−1}µ^n/((λ + s)^{n−1}(µ + s)^n))

δ(t) = t − S_{N(t)} is the time elapsed since the last change of the bulb up to the current time. γ(t) = S_{N(t)+1} − t is the time from the current time up to the next change of the bulb. Thus, for 0 ≤ x ≤ t,

P(δ(t) ≤ x) = P(S_{N(t)} > t − x) = Σ_{n≥1} P(S_n > t − x, N(t) = n)

= Σ_{n≥1} P(t − x < S_n ≤ t < S_{n+1}) = Σ_{n≥1} P(t − x < S_n ≤ t, X_{n+1} > t − S_n)

= Σ_{n≥1} ∫_{t−x}^t P(X_{n+1} > t − s)P(S_n ∈ ds)

Likewise,

P(γ(t) ≤ x) = P(S_{N(t)+1} ≤ t + x) = Σ_{n≥0} P(S_{n+1} ≤ t + x, S_n ≤ t < S_{n+1})

= Σ_{n≥0} P(S_n ≤ t < S_{n+1} ≤ t + x) = Σ_{n≥0} ∫_0^t P(S_n ∈ ds)P(t − s < X_{n+1} ≤ t − s + x)
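The quantities N(t), δ(t) and γ(t) can also be examined by simulation; the following sketch generates the alternating exponential lifetimes directly (the rates, observation time and number of runs are illustrative choices, and odd-numbered bulbs are taken to be Exp(µ) as in the problem statement).

# Monte Carlo sketch of the alternating renewal process of problem [4].
import numpy as np

rng = np.random.default_rng(3)
lam, mu, t_obs, n_runs = 1.0, 2.0, 50.0, 20_000
N_t, delta_t, gamma_t = [], [], []
for _ in range(n_runs):
    S, n = 0.0, 0
    while True:
        # lifetime of bulb n+1: Exp(mu) if n+1 is odd, Exp(lam) if n+1 is even
        X = rng.exponential(1.0 / mu) if (n + 1) % 2 == 1 else rng.exponential(1.0 / lam)
        if S + X > t_obs:
            break
        S, n = S + X, n + 1
    N_t.append(n); delta_t.append(t_obs - S); gamma_t.append(S + X - t_obs)

print("E N(t)    :", np.mean(N_t))
print("E delta(t):", np.mean(delta_t), "  E gamma(t):", np.mean(gamma_t))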

[5] [a] Let {X(t) : t ∈ R} be a real stochastic process with autocorrelation


E(X(t1 ).X(t2 )) = R(t1 , t2 ). Assume that there exists a finite positive real num­
ber T such that R(t1 + T, t2 + T ) = R(t1 , t2 )∀t1 , t2 ∈ R. Prove that X(t) is a
mean square periodic process, ie,

E[(X(t + T ) − X(t))2 ] = 0∀t ∈ R



Hence show that X(t) has a mean square Fourier series:

lim_{N→∞} E[|X(t) − Σ_{n=−N}^N c(n)exp(jnω_0t)|^2] = 0

where ω_0 = 2π/T and

c(n) = T^{−1} ∫_0^T X(t)exp(−jnω_0t)dt

[b] Define a cyclostationary and wide sense cyclostationary process with an


example. Give an example of a wide sense cyclostationary process that is not
cyclostationary.
Solution: A process X(t), t ∈ R is said to be cyclostationary if its finite di­
mensional probability distribution functions are invariant under shifts by integer
multiples of some fixed positive T , ie, if

P (X(t1 ) ≤ x1 , ..., X(tN ) ≤ xN ) = P (X(t1 + T ) ≤ x1 , ..., X(tN + T ) ≤ xN )

for all x_1, ..., x_N, t_1, ..., t_N, N ≥ 1. For example, the process

X(t) = Σ_n c(n)exp(jnωt)

where the c(n) are any r.v's is cyclostationary with period 2π/ω. In fact, this process
is a periodic process. A wide sense cyclostationary process is a process whose
mean and autocorrelation function are periodic with period T for some fixed
T > 0, ie, if
µ(t) = E(X(t)), R(t, s) = E(X(t)X(s))
then
µ(t + T ) = µ(t), R(t + T, s + T ) = R(t, s)∀t, s ∈ R
An example of a WSS process that is not stationary is

X(t) = cos(ω1 t + φ1 ) + cos(ω2 t + φ2 ) + cos(ω3 t + φ1 + φ2 )

where ω1 + ω2 = ω3 and φ1 , φ2 are iid r.v’s uniformly distributed over [0, 2π).
This process has second moments that are invariant under time shifts but its

third moments are not invariant under time shifts. Likewise, an example of a
WS­cyclostationary process that is not cyclostationary is the process X(t)+Y (t)
with Y (t) having the same structure as X(t) but independent of it with frequen­
cies ω4 , ω5 , ω6 (and phases φ4 , φ5 , φ4 + φ5 with φ1 , φ2 , φ4 , φ5 being independent
and uniform over [0, 2π)) such that (ω1 + ω2 − ω3 )/(ω4 + ω5 − ω6 ) is irrational.
This process is also in fact WSS. To see why it is not cyclostationary, we compute
its third moments:

E(X(t)X(t+t1 )X(t+t2 ))
= E(cos(ω3 t+φ1 +φ2 ).cos(ω1 (t+t1 )+φ1 ).cos(ω2 (t+t2 )+φ2 ))

+E(cos(ω3 t + φ1 + φ2 ).cos(ω1 (t + t2 ) + φ1 ).cos(ω2 (t + t1 ) + φ2 ))


= (1/4)(cos((ω1 + ω2 − ω3 )t + ω1 t1 + ω2 t2 )
+cos((ω1 + ω2 − ω3 )t + ω2 t1 + ω1 t2 ))
and likewise
E(Y (t)Y (t + t1 )Y (t + t2 )) =
= (1/4)(cos((ω4 + ω5 − ω6 )t + ω4 t1 + ω5 t2 )
+cos((ω4 + ω5 − ω6 )t + ω5 t1 + ω4 t2 ))
Thus, if
Z(t) = X(t) + Y (t)
then

E(Z(t)Z(t+t1 )Z(t+t2 )) = E(X(t)X(t+t1 )X(t+t2 ))+E(Y (t)Y (t+t1 )Y (t+t2 ))

= (1/4)(cos((ω1 + ω2 − ω3 )t + ω1 t1 + ω2 t2 )
+cos((ω1 + ω2 − ω3 )t + ω2 t1 + ω1 t2 ))
+(1/4)(cos((ω4 + ω5 − ω6 )t + ω4 t1 + ω5 t2 )
+cos((ω4 + ω5 − ω6 )t + ω5 t1 + ω4 t2 ))
which is clearly not periodic in t since (ω1 + ω2 − ω3 )/(ω4 + ω5 − ω6 ) is irrational.

[6] Explain the construction of the Lebesgue measure as a countably additive


non­negative set function on the Borel subsets of R whose value on an interval
coincides with the interval’s length.

Solution: We first define the Lebesgue measure µ on the field F of finite


disjoint unions of intervals of the form (a, b] in (0, 1] as the sum of lengths.
Then, we define the outer measure µ* on the class of all subsets of (0, 1] as

µ*(E) = inf{Σ_n µ(E_n) : E_n ∈ F, E ⊂ ∪_n E_n}

We prove countable subadditivity of µ*. Then, we say that E is a measurable subset of (0, 1] if

µ*(E ∩ A) + µ*(E^c ∩ A) = µ*(A) ∀ A ⊂ (0, 1]

Then, if M denotes the class of all measurable subsets, we prove that M is a


field and that µ∗ is finitely additive on M. We then prove that M is a σ field
and that µ∗ is countably additive on M. Finally, we prove that M contains
the field F and hence the σ­field generated by F, namely, the Borel σ­field on
(0, 1].

[7] [a] What is a compact subset of R^n? Give an example and characterize it by means of (a) the Heine-Borel property and (b) the Bolzano-Weierstrass property. Prove that the countable intersection of a sequence of decreasing non-empty compact subsets of R^n is non-empty.

A compact subset E of R^n is a closed and bounded subset. By bounded, we mean that

sup{|x| : x ∈ E} < ∞

By closed, we mean that if x_n ∈ E is a sequence in E such that lim x_n = x ∈ R^n exists, then x ∈ E. For example,

E = {x ∈ R^n : |x| ≤ 1}

It can be shown that E is compact iff every open cover of it has a finite subcover, ie, if O_α, α ∈ I are open sets in R^n such that

E ⊂ ∪_{α∈I} O_α

then there is a finite subset

F = {α_1, ..., α_N} ⊂ I

such that

E ⊂ ∪_{k=1}^N O_{α_k}

(b) What is a regular measure on (Rn , B(Rn ))?. Show that every probability
measure on (Rn , B(Rn )) is regular.
Solution: A measure µ on (Rn , B(Rn )) is said to be regular if given any
B ∈ B(Rn ) and � > 0, there exists an open set F and a closed set C in Rn such
that C ⊂ B ⊂ F and µ(F − C) < �. Let us now show that any probability
measure (and hence any finite measure) on (Rn , B(Rn )) is regular. Let P be a
probability measure on the above space. Define C to be the family of all sets
B ∈ B(Rn ) such that ∀� > 0, there exists a closed set C and an open set F such
that C ⊂ B ⊂ F and P (F −C) < �. We prove that C is a σ­field and then since it
contains all closed sets (If C is a closed set and Fn = {x ∈ Rn : d(x, C) < 1/n},
then Fn is open, C ⊂ C ⊂ Fn and P (Fn − C) → 0 because Fn ↓ C), it
will follow that C = B(Rn ) which will complete the proof of the result. Let
B ∈ C and let � > 0. Then, there exists a closed C and an open F such that
C ⊂ B ⊂ F and P (F − C) < �. Then F c is closed, C c is open, F c ⊂ B c ⊂ C c
and P (C c − F c ) = P (F − C) < �, thereby proving that B c ∈ C, ie, C is closed
under complementation. Now, let B_n ∈ C, n = 1, 2, ..., and let ε > 0. Then, for each n we can choose a closed C_n and an open F_n such that C_n ⊂ B_n ⊂ F_n with P(F_n − C_n) < ε/2^n. Then ∪_{n=1}^N C_n is closed for any finite integer N, ∪_{n=1}^∞ F_n is open, and for B = ∪_n B_n, we have that

∪_{n=1}^N C_n ⊂ B ⊂ ∪_n F_n, N ≥ 1

and further

P(∪_n F_n − ∪_{n=1}^N C_n) ≤ P((∪_{n=1}^N (F_n − C_n)) ∪ (∪_{n>N} F_n))

≤ Σ_{n=1}^N P(F_n − C_n) + Σ_{n>N} P(F_n) ≤ Σ_{n=1}^N ε/2^n + Σ_{n>N} P(F_n)

which converges as N → ∞ to ε and hence for sufficiently large N, it must be smaller than 2ε, proving thereby that B ∈ C, ie, C is closed under countable unions. This completes the proof that C is a σ-field and hence the proof of the result.
(c) What is a consistent family of finite dimensional probability distributions
on (RZ+ , B(RZ+ ))?
Solution: Such a family is a family of probability distributions
F (x1 , ..., xN , n1 , ..., nN )
on RN for each set of positive integers n1 , ..., nN and N = 1, 2, ... satisfying
(a)
F (xσ1 , ..., xσN , nσ1 , ..., nσN ) = F (x1 , ..., xN , n1 , ..., nN )
for all permutations σ of (1, 2, ..., N ) and
(b)
limxN →∞ F (x1 , ..., xN , n1 , ..., nN ) = F (x1 , ..., xN −1 , n1 , ..., nN −1 )
(d) Prove Kolmogorov’s consistency theorem, that given any consistent fam­
ily of finite dimensional probability distributions on (RZ+ , B(RZ+ )), there exists
a probability space and a real valued stochastic process defined on this space
whose finite dimensional distributions coincide with the given ones.
Let F (x1 , ..., xN ; n1 , ..., nN ), n1 , ..., nN ≥ 0, x1 , ..., xN ∈ R, N ≥ 1 be a con­
sistent family of probability distributions.
Thus F (x1 , ..., xN ; 1, 2, ..., N ) = FN (x1 , ..., xN )
is a probability distribution on RN with the property that
limxN →∞ FN (x1 , ..., xN ) = FN −1 (x1 , ..., xN −1 )
The aim is to construct a (countably additive)
probability measure P on (RZ+ , B(RZ+ ) such that
P(B × R^{Z_+}) = P_N(B), B ∈ B(R^N)

where

P_N(B) = ∫_B F_N(x_1, ..., x_N)dx_1...dx_N

Define

Q(B × R^{Z_+}) = P_N(B), B ∈ B(R^N)

Then by the consistency of F_N, N ≥ 1, Q is a well defined finitely additive probability measure on the field

F = ∪_{N≥1}(B(R^N) × R^{Z_+})

Note that

B_N = B(R^N) × R^{Z_+}

is a field on R^{Z_+} for every N ≥ 1 and that

B_N ⊂ B_{N+1}, N ≥ 1

It suffices to prove that Q is countably additive on F, or equivalently, that if B̃_N ∈ B_N is a sequence such that B̃_N ↓ φ, then Q(B̃_N) ↓ 0. Equivalently, it suffices to show that if B_N ∈ B(R^N) is a sequence such that

B̃_N = B_N × R^{Z_+} ↓ φ

then

P_N(B_N) ↓ 0
Suppose that this is false. Then, without loss of generality, we may assume that there is a δ > 0 such that P_N(B_N) ≥ δ ∀ N ≥ 1. By regularity of probability measures on R^N for N finite, we can find for each N ≥ 1 a compact set K_N ∈ B(R^N) such that K_N ⊂ B_N and P_N(B_N − K_N) < δ/2^N. Now consider

K̃_N = (K_1 × R^{N−1}) ∩ (K_2 × R^{N−2}) ∩ ... ∩ (K_{N−1} × R) ∩ K_N

Then

K̃_N ⊂ K_N ⊂ R^N

and K̃_N, being a closed subset of the compact set K_N, is also compact. K̃_N is closed because each K_j, j = 1, 2, ..., N is compact and hence closed. Define

K̂_N = K̃_N × R^{Z_+}, N ≥ 1

Then, it is clear that K̂_N is decreasing and that

K̂_N ⊂ B̃_N = B_N × R^{Z_+}

Further,

Q(K̂_N) = Q(B̃_N) − Q(B̃_N − K̂_N)

Now,

Q(B̃_N) = P_N(B_N) ≥ δ

and

Q(B̃_N − K̂_N) = P_N(B_N − ∩_{k=1}^N (K_k × R^{N−k}))

= P_N(∪_{k=1}^N (B_N − (K_k × R^{N−k})))

≤ P_N(∪_{k=1}^N ((B_k − K_k) × R^{N−k}))

≤ Σ_{k=1}^N P_k(B_k − K_k) = Σ_{k=1}^N δ/2^k ≤ δ/2

Thus,

Q(K̂_N) ≥ δ/2

and hence K̂_N is non-empty for each N. Hence, so is K̃_N for each N. Now K̂_N ⊂ B̃_N ↓ φ. Hence K̂_N ↓ φ and we shall obtain a contradiction to this as follows: Since K̃_N is non-empty, we may choose

(x_{N1}, ..., x_{NN}) ∈ K̃_N, N = 1, 2, ...

Then,

x_{N1} ∈ K_1, (x_{N1}, x_{N2}) ∈ K_2, ..., (x_{N1}, ..., x_{NN}) ∈ K_N

Since K_1 is compact, x_{N1}, N ≥ 1 has a convergent subsequence, say {x_{N_1,1}}. Now, (x_{N_1,1}, x_{N_1,2}) ∈ K_2 and since K_2 is compact, this has a convergent subsequence, say (x_{N_2,1}, x_{N_2,2}). Then, {x_{N_2,1}} is a subsequence of {x_{N_1,1}} and hence also converges. Continuing this way, we finally obtain a convergent sequence (y_{n,1}, ..., y_{n,N}), n = 1, 2, ... in K_N such that (y_{n,1}, ..., y_{n,j}) ∈ K_j, j = 1, 2, ..., n and of course (y_{n,1}, ..., y_{n,j}), n ≥ 1 converges. Let

y_j = lim_n y_{n,j}, j = 1, 2, ...

Then, it is clear that since K_j is closed,

(y_1, ..., y_j) ∈ K_j, j = 1, 2, ...

Hence

(y_1, y_2, ..., y_N) ∈ K̃_N, N ≥ 1

and hence,

(y_1, y_2, ...) ∈ K̂_N, N ≥ 1

which means that

(y_1, y_2, ...) ∈ ∩_{N≥1} K̂_N

which contradicts the fact that the rhs is an empty set. This proves the Kolmogorov consistency theorem.

7.1 Problem suggested by Prof. Dhananjay
When the magnet falls within the tube of radius R0 with the coil having n0
turns per unit length and extending from z = h to z = h + a and with the
tube extending from z = 0 to z = d, we write down the equations of motion
of the magnet. We usually may assume that at time t, the magnetic moment
vector of the magnet is m and that its centre does not deviate from the central
axis of the tube. This may be safely assumed provided that the tube’s radius is
small. However suppose we do not make this assumption so that we aim a the
most general possible non­relativistic situation in which the magnetic fields are
strong which may cause the position of the CM of the magnet to deviate from
the central axis of the cylinder. At time t, assume that the CM of the magnet
is located at r0 (t) = (ξ1 (t), ξ2 (t), ξ3 (t)). We require a differential equation for
ξ(t). The magnet at time t is assumed to have experienced a rotation R(t) (a
3 × 3 rotation matrix). Thus, it moment at time t is given by
m(t) = m0 R(t)ẑ
assuming that at time t = 0, the moment pointed along the positive z direction.
The magnetic field produced by the magnet at time t is given by

B_m(t, r) = curl[(µ/4π)(m(t) × (r − r_0(t))/|r − r_0(t)|^3)]

and the flux of this magnetic field through the coil is

Φ(t) = n_0 ∫_h^{h+a} dz ∫_{x^2+y^2≤R_0^2} B_{mz}(t, x, y, z)dxdy

The angular momentum vector of the magnet is proportional to its magnetic


moment. So by the torque equation, we have

dm(t)/dt = γ.m(t) × Bc (t, r0 (t)) − − − (1)

where Bc (t, r) is the magnetic field produced by the coil. The coil current is

I(t) = Φ'(t)/R_L

where R_L is the coil resistance. Then, the magnetic field produced by the coil is

B_c(t, r) = (µ/4π)I(t) ∫_0^{2n_0π} [(−x̂·sin(φ) + ŷ·cos(φ)) × (r − R_0(cos(φ)x̂ + sin(φ)ŷ) − (a/2πn_0)φẑ)]

/ |r − R_0(cos(φ)x̂ + sin(φ)ŷ) − (a/2πn_0)φẑ|^3 R_0 dφ


This expression can be used in (1). Note that I(t) depends on Φ(t) which in
turn is a function of m(t). Since in fact I(t) is proportional to Φ' (t), it follows
that I(t) is function of m(t) and m' (t). Note that Bc (t, r) is proportional to
I(t) and is therefore also a function of m(t) and m' (t). However, we also note

that Bm (t, r) is also a function of r0 (t). Hence, Φ(t) is also a function of r0 (t)
and hence I(t) is also a function of r0 (t) and r�0 (t). Thus, Bc (t, r) is a function
of r0 (t), r�0 (t), m(t), m� (t), ie, we can express it as

Bc (t, r|r0 (t), r�0 (t), m(t), m� (t))

Thus, we write (1) as

dm(t)/dt = γ.m(t) × Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t)) − − − (2)

Next, the force exerted on the magnet by the magnetic field produced by the
coil is given by

F(t) = �r0 (t) (m(t).Bc ((t, r0 (t)|r0 (t), r0� (t), m(t), m� (t))

which means that the rectilinear equation of motion of the magnet must be
given by
M r0�� (t) = F(t) − M gẑ
However to be more precise, the term V = −m(t).Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t))
represents a potential energy of interaction between the magnet and the reac­
tion field produced by the coil. So its recilinear equation of motion should be
given by
d/dt∂L/∂r0� (t) − ∂L/∂r0 (t) = 0
where L, the Lagrangian, is given by

L = M|r_0'(t)|^2/2 − V − M g z_0(t)

Thus the rectilinear equation of motion of the magnet is given by

M r_0''(t) + d/dt(∇_{r_0'(t)} m(t)·B_c(t, r_0(t)|r_0(t), r_0'(t), m(t), m'(t)))

−∇_{r_0(t)}(m(t)·B_c(t, r_0(t)|r_0(t), r_0'(t), m(t), m'(t))) + M gẑ = 0 − − − (3)

(2) and (3) constitute the complete set of six coupled ordinary differential equations for the six components of m(t), r_0(t).
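A heavily simplified integration sketch of (2)-(3) is given below; the coil field B_c is left as a stub to be replaced by the φ-quadrature above, the dependences on m'(t) and r_0'(t) are dropped, and all numerical values are placeholders, so this is only a scaffold for a simulation rather than the full model.

# Simplified scaffold for integrating the magnet's translational and moment equations.
import numpy as np
from scipy.integrate import solve_ivp

M, gamma_g, g = 0.05, 1.0, 9.81               # illustrative mass and gyromagnetic constant

def coil_field(t, r, r0, v0, m):
    # placeholder for Bc(t, r | r0, r0', m, m'); replace with the phi-quadrature above
    return np.zeros(3)

def grad_interaction(t, r0, v0, m, h=1e-6):
    # numerical gradient of m . Bc with respect to the magnet position r0
    def U(r):
        return float(m @ coil_field(t, r, r0, v0, m))
    return np.array([(U(r0 + h * e) - U(r0 - h * e)) / (2 * h) for e in np.eye(3)])

def rhs(t, y):
    r0, v0, m = y[0:3], y[3:6], y[6:9]
    mdot = gamma_g * np.cross(m, coil_field(t, r0, r0, v0, m))            # eq. (2)
    a = grad_interaction(t, r0, v0, m) / M - np.array([0.0, 0.0, g])      # eq. (3), velocity terms dropped
    return np.concatenate([v0, a, mdot])

y0 = np.concatenate([[0, 0, 1.0], [0, 0, 0], [0, 0, 1e-2]])
sol = solve_ivp(rhs, (0.0, 0.5), y0, max_step=1e-3)
print("final height of the CM:", sol.y[2, -1])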

7.2 Converse of the Shannon coding theorem


Let A be an alphabet. Let u_1, ..., u_M ∈ A^n with

P_{u_i} = P

where for u ∈ A^n, P_u(x) = N(x|u)/n, x ∈ A, with N(x|u) denoting the number of times that x occurs in u. Let ν_x(y), x ∈ A, y ∈ B be the discrete memoryless channel transition probability from the alphabet A to the alphabet B. Define

Q(y) = Σ_x P(x)ν_x(y), y ∈ B

Let D_1, ..., D_M be disjoint sets in B^n such that ν_{u_k}(D_k) > 1 − ε, k = 1, 2, ..., M, where ν_u(v) = Π_{k=1}^n ν_{u(k)}(v(k)) for u = (u(1), ..., u(n)), v = (v(1), ..., v(n)). Let

E_k = D_k ∩ T(n, ν_{u_k}, δ), k = 1, 2, ..., M

where T(n, f, δ) is the set of δ-Bernoulli typical sequences of length n having probability distribution f. For u ∈ A^n with P_u = P, define T(n, ν_u, δ) to consist of all v ∈ B^n for which

v_x ∈ T(N(x|u), ν_x, δ) ∀ x ∈ A

Then we claim that if v ∈ T(n, ν_u, δ), then

v ∈ T(n, Q, δ√a)

The notation used is as follows. For x ∈ A, and u ∈ A^n, v ∈ B^n, define

v_x = (v(k) : u(k) = x, k = 1, 2, ..., n)

Note that v_x ∈ B^{N(x|u)}. To prove this claim, we note that for y ∈ B,

|N(y|v) − nQ(y)| = |Σ_x N(y|v_x) − n Σ_x P(x)ν_x(y)|

= |Σ_x N(y|v_x) − Σ_x N(x|u)ν_x(y)| ≤ Σ_x |N(y|v_x) − N(x|u)ν_x(y)|

≤ δ Σ_x √(N(x|u)ν_x(y)(1 − ν_x(y)))

≤ δ√a·√(Σ_x N(x|u)ν_x(y)(1 − ν_x(y)))

= δ√(an)·√(Σ_x P(x)ν_x(y)(1 − ν_x(y)))

Now,

Σ_x P(x)ν_x(y) = Q(y),

Σ_x P(x)ν_x(y)^2 ≥ (Σ_x P(x)ν_x(y))^2 = Q(y)^2

and therefore,

|N(y|v) − nQ(y)| ≤ δ√(an)·√(Q(y)(1 − Q(y)))

proving the claim. For v ∈ E_k, we have v ∈ T(n, ν_{u_k}, δ) and hence, by the result just proved, v ∈ T(n, Q, δ√a).

Now let v ∈ T (n, Q, δ).



It follows that since

Q_n(v) = Π_{y∈B} Q(y)^{N(y|v)}

and since

|N(y|v) − nQ(y)| ≤ δ√(nQ(y)(1 − Q(y))), y ∈ B

we have

nQ(y) − δ√(nQ(y)(1 − Q(y))) ≤ N(y|v) ≤ nQ(y) + δ√(nQ(y)(1 − Q(y))), y ∈ B

and therefore,

Π_y Q(y)^{nQ(y)+δ√(nQ(y)(1−Q(y)))} ≤ Q_n(v) ≤ Π_y Q(y)^{nQ(y)−δ√(nQ(y)(1−Q(y)))}

Thus, summing over all v ∈ T(n, Q, δ), we get

2^{−nH(Q)−c√n} µ(T(n, Q, δ)) ≤ Q_n(T(n, Q, δ)) ≤ 2^{−nH(Q)+c√n} µ(T(n, Q, δ))

where

c = −δ Σ_y √(Q(y)(1 − Q(y)))·log_2(Q(y))

and where µ(E) denotes the cardinality of a set E. In particular,

µ(T(n, Q, δ)) ≤ 2^{nH(Q)+c√n}

It follows that

Σ_{k=1}^M µ(E_k) = µ(∪_k E_k) ≤ µ(T(n, Q, δ√a)) ≤ 2^{nH(Q)+c√(an)}

On the other hand,

ν_{u_k}(D_k) > 1 − ε

and by the Chebyshev inequality combined with the union bound,

ν_{x,N(x|u)}(T(N(x|u), ν_x, δ)) ≥ 1 − b/δ^2

and hence

ν_u(T(n, ν_u, δ)) ≥ 1 − ab/δ^2

for any u ∈ A^n. In particular,

ν_{u_k}(E_k) ≥ 1 − ab/δ^2 − ε, k = 1, 2, ..., M

Further, by the above argument, we have for v ∈ T(n, ν_u, δ),

ν_{x,N(x|u)}(v_x) ≤ 2^{−N(x|u)H(ν_x)+c√n}

and hence,

ν_u(v) ≤ 2^{−Σ_x N(x|u)H(ν_x)+c√n}

= 2^{−nI(P,ν)+c√n}

where

I(P, ν) = Σ_x P(x)H(ν_x), P = P_u

From this we easily deduce that

ν_{u_k}(E_k) ≤ 2^{−nI(P,ν)+c√n} µ(E_k), k = 1, 2, ..., M

Thus,

(1 − ab/δ^2 − ε)2^{nI(P,ν)−c√n} ≤ µ(E_k), k = 1, 2, ..., M

so that

M(1 − ab/δ^2 − ε)2^{nI(P,ν)−c√n} ≤ Σ_k µ(E_k) ≤ 2^{nH(Q)+c√n}

or equivalently,

log(M)/n ≤ H(Q) − I(P, ν) + O(1/√n)

which proves the converse of Shannon’s coding theorem.
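The typicality estimate used above can be checked numerically; the sketch below draws channel outputs for an input word of type P through a random channel ν and records how often |N(y|v) − nQ(y)| ≤ δ√(anQ(y)(1 − Q(y))) holds for every y (the alphabet sizes, δ and the distribution P are illustrative).

# Empirical check of the claim v in T(n, nu_u, delta) implies v in T(n, Q, delta*sqrt(a)).
import numpy as np

rng = np.random.default_rng(4)
a_sz, b_sz, n, delta, trials = 3, 4, 3000, 3.0, 2000
nu = rng.dirichlet(np.ones(b_sz), size=a_sz)      # channel nu_x(y), x in A, y in B
P = np.array([0.5, 0.3, 0.2])                     # the common type P of the codewords
N_xu = (n * P).astype(int)                        # letter counts N(x|u) of a word u of type P
Q = P @ nu
bound = delta * np.sqrt(a_sz * n * Q * (1 - Q))

inside = 0
for _ in range(trials):
    counts = np.zeros(b_sz)
    for x in range(a_sz):                         # outputs v_x for the positions where u(k) = x
        counts += rng.multinomial(N_xu[x], nu[x])
    inside += np.all(np.abs(counts - n * Q) <= bound)
print("fraction of outputs v in T(n, Q, delta*sqrt(a)):", inside / trials)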


Chapter 8

Problems in Electromagnetics and Gravitation

We consider the magnetic field produced by a magnet of given magnetic dipole moment in a background gravitational field, taking space-time curvature into account. The Maxwell equations in curved space-time are

(F^{µν}√(−g))_{,ν} = J^µ√(−g)

or equivalently

(g^{µα}g^{νβ}√(−g)F_{αβ})_{,ν} = J^µ√(−g)

Writing

g_{µν}(x) = η_{µν} + h_{µν}(x)

where η is the Minkowski metric and h is a small perturbation, we have approximately

g^{µν} = η^{µν} − h^{µν}, h^{µν} = η^{µα}η^{νβ}h_{αβ}

and

√(−g) = 1 − h/2, h = η^{µν}h_{µν}

We have

F_{µν} = A_{ν,µ} − A_{µ,ν}

and writing

A_µ = A_µ^{(0)} + A_µ^{(1)}

where A_µ^{(0)} is the zeroth order solution, ie, without gravity, and A_µ^{(1)} is the first order correction to the solution, ie, a linear functional of h_{µν}, we obtain using first order perturbation theory

F_{µν} = F_{µν}^{(0)} + F_{µν}^{(1)}


where

F_{µν}^{(k)} = A_{ν,µ}^{(k)} − A_{µ,ν}^{(k)}, k = 0, 1

g^{µα}g^{νβ}√(−g) = (η^{µα} − h^{µα})(η^{νβ} − h^{νβ})(1 − h/2) = η^{µα}η^{νβ} − f^{µναβ}

where

f^{µναβ} = η^{µα}η^{νβ}h/2 + η^{µα}h^{νβ} + η^{νβ}h^{µα}

Then

g^{µα}g^{νβ}√(−g)F_{αβ} = η^{µα}η^{νβ}F_{αβ}^{(0)} + η^{µα}η^{νβ}F_{αβ}^{(1)} − f^{µναβ}F_{αβ}^{(0)} = F^{0µν} + F^{1µν}

where

F^{0µν} = η^{µα}η^{νβ}F_{αβ}^{(0)},

F^{1µν} = η^{µα}η^{νβ}F_{αβ}^{(1)} − f^{µναβ}F_{αβ}^{(0)}

Also

J^µ√(−g) = J^{0µ} + J^{1µ}, J^{0µ} = J^µ, J^{1µ} = −J^µ h/2

The gauge condition

(A^µ√(−g))_{,µ} = 0

becomes

0 = (g^{µν}√(−g)A_ν)_{,µ} = [(η^{µν} − h^{µν})(1 − h/2)A_ν]_{,µ} = [(η^{µν} − k^{µν})(A_ν^{(0)} + A_ν^{(1)})]_{,µ}

where

k^{µν} = h^{µν} − hη^{µν}/2

and hence equating coefficients of zeroth and first orders of smallness on both sides gives

η^{µν}A_{ν,µ}^{(0)} = 0,

(k^{µν}A_ν^{(0)})_{,µ} = η^{µν}A_{ν,µ}^{(1)}
Chapter 9

Some More Aspects of Quantum Information Theory

9.1 An expression for state detection error


Let ρ, σ be two states and let their spectral decompositions be

ρ = Σ_a |e_a⟩p(a)⟨e_a|, σ = Σ_a |f_a⟩q(a)⟨f_a|

so that ⟨e_a|e_b⟩ = δ(a, b), ⟨f_a|f_b⟩ = δ(a, b). The relative entropy between these two states is

D(ρ|σ) = Tr(ρ·(log(ρ) − log(σ)))

= Σ_a p(a)log(p(a)) − Σ_{a,b} p(a)log(q(b))|⟨e_a|f_b⟩|^2

= Σ_{a,b} p(a)|⟨e_a|f_b⟩|^2 (log(p(a)|⟨e_a|f_b⟩|^2) − log(q(b)|⟨e_a|f_b⟩|^2))

= Σ_{a,b} (P(a, b)log(P(a, b)) − P(a, b)log(Q(a, b))) = D(P|Q)

where P, Q are two bivariate probability distributions defined by

P(a, b) = p(a)|⟨e_a|f_b⟩|^2, Q(a, b) = q(b)|⟨e_a|f_b⟩|^2

Now let 0 ≤ T ≤ 1, ie, T is a detection operator. Consider the detection error probability

2P(ε) = Tr(ρ(1 − T)) + Tr(σT)

= Σ_a (p(a)⟨e_a|1 − T|e_a⟩ − q(a)⟨f_a|T|f_a⟩)

= Σ_a p(a)(1 − ⟨e_a|T|e_a⟩) − Σ_{a,b} q(a)|⟨f_a|e_b⟩|^2⟨e_b|T|f_a⟩
Suppose T is a projection, ie, T = T^2 = T^*. Then,

Tr(ρ(1 − T)) = Σ_a p(a)⟨e_a|(1 − T)|e_a⟩ = Σ_a p(a)⟨e_a|(1 − T)^2|e_a⟩

= Σ_{a,b} p(a)|⟨e_a|1 − T|f_b⟩|^2

Likewise,

Tr(σT) = Σ_{a,b} q(b)|⟨e_a|T|f_b⟩|^2

Hence,

2P(ε) = Σ_{a,b} (p(a)|⟨e_a|1 − T|f_b⟩|^2 + q(b)|⟨e_a|T|f_b⟩|^2)

≥ Σ_{a,b} min(p(a), q(b))(|⟨e_a|1 − T|f_b⟩|^2 + |⟨e_a|T|f_b⟩|^2)

≥ (1/2) Σ_{a,b} min(p(a), q(b))|⟨e_a|f_b⟩|^2

where we have used

2(|z|^2 + |w|^2) ≥ |z + w|^2

for any two complex numbers z, w. Note that

(1/2) Σ_{a,b} min(p(a), q(b))|⟨e_a|f_b⟩|^2 =

(1/2) min_{0≤t(a,b)≤1, a,b=1,2,...,n} Σ_{a,b} [p(a)|⟨e_a|f_b⟩|^2(1 − t(a, b)) + q(b)|⟨e_a|f_b⟩|^2 t(a, b)]

because if u, v are positive real numbers, then as t varies over [0, 1], tu + (1 − t)v attains its minimum value, which is v when t = 0 if u ≥ v and u when t = 1 if u < v. In other words,

min_{t∈[0,1]}(tu + (1 − t)v) = min(u, v)

In other words, we have proved that

2P(ε) ≥ (1/2) min_{0≤t(a,b)≤1, a,b=1,2,...,n} Σ_{a,b} [P(a, b)(1 − t(a, b)) + Q(a, b)t(a, b)]

The quantity on the rhs is the minimum error probability for the classical hypothesis testing problem between the two bivariate probability distributions P, Q.
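The bound 2P(ε) ≥ (1/2)Σ_{a,b} min(P(a, b), Q(a, b)) can be spot-checked numerically; in the sketch below T is taken to be the projection onto the positive part of ρ − σ (any projection would do), and the dimension and random states are arbitrary.

# Compare 2 P(err) for a projective test with the classical lower bound above.
import numpy as np

rng = np.random.default_rng(5)
d = 4
def rand_state(d):
    A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

rho, sigma = rand_state(d), rand_state(d)
p, E = np.linalg.eigh(rho)                       # rho = sum_a p(a) |e_a><e_a|
q, F = np.linalg.eigh(sigma)
w, V = np.linalg.eigh(rho - sigma)
T = V[:, w > 0] @ V[:, w > 0].conj().T           # projection onto the positive part of rho - sigma

lhs = np.trace(rho @ (np.eye(d) - T)).real + np.trace(sigma @ T).real   # = 2 P(err)
overlap = np.abs(E.conj().T @ F) ** 2                                   # |<e_a|f_b>|^2
P_ab = p[:, None] * overlap
Q_ab = q[None, :] * overlap
rhs = 0.5 * np.minimum(P_ab, Q_ab).sum()
print("2 P(err) =", lhs, " >= classical bound =", rhs)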

9.2 Operator convex functions and operator monotone functions
[1] Let f be operator convex. Then, if K is a contraction matrix, we have

f(K*XK) ≤ K*f(X)K

for all Hermitian matrices X. By contraction we mean that K*K ≤ I. To see this, define the following positive (ie non-negative) matrices

L = √(1 − K*K), M = √(1 − KK*)

Then by Taylor expansion and the identity

K(K*K)^m = (KK*)^m K, m = 1, 2, ...,

we have

KL = MK, LK* = K*M
Define the matrix

U = [[K, −M], [L, K*]]

and observe that since L* = L, M* = M and

K*K + L^2 = I, −K*M + LK* = 0, −MK + KL = 0, M^2 + KK* = I

it follows that U is a unitary matrix. Likewise define the unitary matrix

V = [[K, M], [L, −K*]]

Also define, for two Hermitian matrices A, B,

T = [[A, 0], [0, B]]

Then, we obtain the identity

(1/2)(U*TU + V*TV) = [[K*AK + LBL, 0], [0, KBK* + MAM]]

Using operator convexity of f, we have

f((1/2)(U*TU + V*TV)) ≤ (1/2)(f(U*TU) + f(V*TV)) = (1/2)(U*f(T)U + V*f(T)V)

which, in view of the above identity, results in

f(K*AK + LBL) ≤ K*f(A)K + Lf(B)L

and

f(KBK* + MAM) ≤ Kf(B)K* + Mf(A)M

In particular, taking B = 0, the first gives

f(K*AK) ≤ K*f(A)K + f(0)L^2

and choosing A = 0, the second gives

f(KBK*) ≤ Kf(B)K* + f(0)M^2

In particular, if in addition to being operator convex, f satisfies f(0) ≤ 0, then we get

f(K*AK) ≤ K*f(A)K, f(KBK*) ≤ Kf(B)K*

for any contraction K. Now we show that if f is operator convex and f(0) ≤ 0, then for operators K_1, ..., K_p that satisfy

Σ_{j=1}^p K_j*K_j ≤ I

we have

f(Σ_{j=1}^p K_j*A_jK_j) ≤ Σ_{j=1}^p K_j*f(A_j)K_j

In fact, defining

K = [[K_1, 0, ..., 0], [K_2, 0, ..., 0], ..., [K_p, 0, ..., 0]]

we observe that

K*K = [[Σ_{j=1}^p K_j*K_j, 0], [0, 0]] ≤ I

ie K is a contraction and hence, by the above result,

f(K*TK) ≤ K*f(T)K

Taking

T = diag[A_1, A_2, ..., A_p]

we get

K*TK = [[Σ_{j=1}^p K_j*A_jK_j, 0], [0, 0]]

and hence,

[[f(Σ_{j=1}^p K_j*A_jK_j), 0], [0, f(0)]] ≤ [[Σ_{j=1}^p K_j*f(A_j)K_j, 0], [0, 0]]

which results in the desired inequality. Note that from the above discussion, we also have

f(Σ_j K_jB_jK_j*) ≤ Σ_j K_jf(B_j)K_j*
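A quick numerical illustration of the last inequality for the operator convex function f(x) = x^2 (which satisfies f(0) = 0): random K_j are rescaled so that Σ_j K_j^*K_j ≤ I, and the difference Σ_j K_j^*f(A_j)K_j − f(Σ_j K_j^*A_jK_j) is checked to be positive semidefinite; the sizes and the choice of f are illustrative.

# Operator Jensen inequality check for f(x) = x^2 and a contractive family {K_j}.
import numpy as np

rng = np.random.default_rng(6)
d, p = 5, 3
Ks = [rng.standard_normal((d, d)) for _ in range(p)]
S = sum(K.T @ K for K in Ks)
scale = 1.0 / np.sqrt(np.linalg.eigvalsh(S)[-1] + 1e-12)    # enforce sum K_j^T K_j <= I
Ks = [scale * K for K in Ks]
As = [(lambda B: (B + B.T) / 2)(rng.standard_normal((d, d))) for _ in range(p)]

f = lambda X: X @ X
lhs = f(sum(K.T @ A @ K for K, A in zip(Ks, As)))
rhs = sum(K.T @ f(A) @ K for K, A in zip(Ks, As))
print("min eigenvalue of rhs - lhs (should be >= 0 up to roundoff):",
      np.linalg.eigvalsh(rhs - lhs).min())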

9.3 A selection of matrix inequalities


[1] If A, B are square matrices the eigenvalues of AB are the same as those of
BA. If A is non­singular, then this is obvious since

A−1 (AB)A = BA

Likewise if B is non­singular then also it is obvious since

B −1 (BA)B = AB

If A is singular, we can choose a sequence cn → 0 so that A + cn I is non­singular


for every n and then

(A + cn I)−1 (A + cn I)B(A + cn I) = B(A + cn I)∀n

Thus, (A + cn I)B has the same eigenvalues as those of B(A + cn I) for each n.
Taking the limit n → ∞, the result follows since the eigenvalues of (A + cI)B
are continuous functions of c.
[2] Let A be any square diagonable matrix. Let λ1 , ..., λn be its eigenvalues.
We have for any unit vector u,

| < u, Au > |2 ≤< u, A∗ Au >

Let e1 , ..., en be a basis of unit vectors of A corresponding to the eigenvalues


λ1 , ..., λn respectively. Assume that

|λ_1| ≥ |λ_2| ≥ ... ≥ |λ_n|

Let M be any k dimensional subspace and consider the n − k + 1 dimensional


subspace W = span{ek , ek+1 ..., en }. Then dim(M ∩ W ) ≥ 1 and hence we can
choose a unit vector u ∈ M ∩ W . Then writing u = ck ek + ... + cn en , it follows
that

| < u, Au > | = |ck λk ek + ... + cn λn en | ≤ |λk |.(|ck | + ... + |cn |)

Suppose we Gram­Schmidt orthonormalize ek , ek+1 , ..., en in that order. The


resulting vectors are

wk = ek , wk+1 ∈ span(ek , ek+1 ), ..., wn ∈ span(ek , ..., en )

Specifically,

wk+1 = r(k + 1, k)ek + r(k + 1, k + 1)ek+1 , ..., wn = r(n, k)ek + ... + r(n, n)en

Then we can write

u = dk wk +...+dm wn , |dk |2 +...+|dn |2 = 1, < wr , ws >= δ(r, s), r, s = k, k+1, ..., n



and we find that

Awk = λk wk , Awk+1 = r(k + 1, k)λk ek + r(k + 1, k + 1)λk+1 ek+1 ,

Awn = r(n, k)λk ek + ... + r(n, n)λn en


Diagonalize A so that

A = Σ_{k=1}^n λ_k e_k f_k*, f_k* e_j = δ(k, j)

Then,

⟨f_k, Ae_k⟩ = λ_k

Thus,

|λ_k| ≤ ‖f_k‖·‖Ae_k‖

Let A^{⊗_a k} denote the k-fold tensor product of A with itself acting on the k-fold antisymmetric tensor product of C^n. Then, the eigenvalues of A^{⊗_a k} are λ_{i_1}...λ_{i_k} with n ≥ i_1 > i_2 > ... > i_k ≥ 1 and likewise its singular values are s_{i_1}...s_{i_k}, n ≥ i_1 > i_2 > ... > i_k ≥ 1, where s_1 ≥ s_2 ≥ ... ≥ s_n ≥ 0 are the singular values of A. Now if T is any operator whose eigenvalue of maximum magnitude is µ_1 and whose maximum singular value is s_1, then we have

|µ_1| = spr(T) = lim_n ‖T^n‖^{1/n} ≤ ‖T‖ = s_1

Applying this result to T = A^{⊗_a k} gives us

Π_{i=1}^k |λ_i| ≤ Π_{i=1}^k s_i, k = 1, 2, ..., n

[3] If A, B are such that AB is normal, then

‖AB‖ ≤ ‖BA‖

In fact, since AB is normal, its spectral radius coincides with its spectral norm; further, since AB and BA have the same eigenvalues, the spectral radii of AB and BA are the same; and finally, since the spectral radius of any matrix is always smaller than its spectral norm (‖T^n‖^{1/n} ≤ ‖T‖), the result follows immediately:

‖AB‖ = spr(AB) = spr(BA) ≤ ‖BA‖

[1] Show that if S, T ≥ 0 and S ≤ 1, then

1 − (S + T )−1/2 S(S + T )−1/2 ≤ 2 − 2S + 4T
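A numerical spot-check of this operator inequality for randomly generated S (rescaled so that 0 ≤ S ≤ 1) and T ≥ 0; the dimension and random seed are arbitrary.

# Hayashi-Nagaoka type inequality: 2 - 2S + 4T - (1 - (S+T)^{-1/2} S (S+T)^{-1/2}) >= 0.
import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(7)
d = 6
A = rng.standard_normal((d, d)); S = A @ A.T
S = S / (np.linalg.eigvalsh(S)[-1] + 1e-9)          # 0 <= S <= 1
B = rng.standard_normal((d, d)); T = B @ B.T        # T >= 0

R = mpow(S + T, -0.5)
lhs = np.eye(d) - R @ S @ R
rhs = 2 * np.eye(d) - 2 * S + 4 * T
print("min eig of rhs - lhs (should be >= 0):", np.linalg.eigvalsh(rhs - lhs).min())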



[2] Show that if S_k, T_k, R_k, k = 1, 2 are positive operators in a Hilbert space such that (a) [S_1, S_2] = [T_1, T_2] = [R_1, R_2] = 0 and (b) S_k + T_k ≤ R_k, k = 1, 2, then for any 0 ≤ s ≤ 1, we have Lieb's inequality:

R_1^{1−s}R_2^s ≥ S_1^{1−s}S_2^s + T_1^{1−s}T_2^s

First we show that this inequality holds for s = 1/2; then we show that if it holds for s = µ, ν, then it also holds for s = (µ + ν)/2, and that will complete the proof.

|⟨y|S_1^{1/2}S_2^{1/2} + T_1^{1/2}T_2^{1/2}|x⟩|^2

≤ (|⟨S_1^{1/2}y, S_2^{1/2}x⟩| + |⟨T_1^{1/2}y, T_2^{1/2}x⟩|)^2

≤ [‖S_1^{1/2}y‖·‖S_2^{1/2}x‖ + ‖T_1^{1/2}y‖·‖T_2^{1/2}x‖]^2

≤ [‖S_1^{1/2}y‖^2 + ‖T_1^{1/2}y‖^2]×[‖S_2^{1/2}x‖^2 + ‖T_2^{1/2}x‖^2]

= ⟨y, (S_1 + T_1)y⟩⟨x, (S_2 + T_2)x⟩ ≤ ⟨y, R_1y⟩⟨x, R_2x⟩
Then, replacing y by R_1^{−1/4}R_2^{1/4}x and x by R_2^{−1/4}R_1^{1/4}x in this inequality gives us (recall that R_1 and R_2 are assumed to commute)

⟨x|R_1^{−1/4}R_2^{1/4}(S_1^{1/2}S_2^{1/2} + T_1^{1/2}T_2^{1/2})R_2^{−1/4}R_1^{1/4}|x⟩ ≤ ⟨x, R_1^{1/2}R_2^{1/2}x⟩

Now define the matrices

B = R_1^{−1/4}R_2^{1/4}(S_1^{1/2}S_2^{1/2} + T_1^{1/2}T_2^{1/2}), A = R_2^{−1/4}R_1^{1/4}

Then it is clear that

AB = S_1^{1/2}S_2^{1/2} + T_1^{1/2}T_2^{1/2}

is a Hermitian positive matrix because [S_1, S_2] = [T_1, T_2] = 0, and we also have the result that the eigenvalues of AB are the same as those of BA. Further, from the above inequality, the kth largest eigenvalue of BA is ≤ the kth largest eigenvalue of R_1^{1/2}R_2^{1/2}. It follows therefore that the kth largest eigenvalue of the Hermitian matrix AB is ≤ the kth largest eigenvalue of the Hermitian matrix R_1^{1/2}R_2^{1/2}. However, from this result, we cannot deduce the desired inequality. To do so, we note that, writing

C = S_1^{1/2}S_2^{1/2} + T_1^{1/2}T_2^{1/2}

we have proved that

|⟨y, Cx⟩|^2 ≤ ⟨y, R_1y⟩⟨x, R_2x⟩

This implies

|⟨R_1^{−1/2}y, CR_2^{−1/2}x⟩| ≤ ‖y‖·‖x‖

for all x, y and hence

|⟨y, R_1^{−1/2}CR_2^{−1/2}x⟩|/(‖y‖·‖x‖) ≤ 1

for all nonzero x, y. Taking the supremum over x, y yields

‖R_1^{−1/2}CR_2^{−1/2}‖ ≤ 1

and by normality of

R_1^{−1/4}R_2^{−1/4}CR_1^{−1/4}R_2^{−1/4} = (R_1^{1/4}R_2^{−1/4})(R_1^{−1/2}CR_1^{−1/4}R_2^{−1/4}) = PQ

where

P = R_1^{1/4}R_2^{−1/4}, Q = R_1^{−1/2}CR_1^{−1/4}R_2^{−1/4}

we get

‖PQ‖ ≤ ‖QP‖ = ‖R_1^{−1/2}CR_2^{−1/2}‖ ≤ 1

which implies, since PQ is positive, that

PQ ≤ I

or equivalently

R_1^{−1/4}R_2^{−1/4}CR_1^{−1/4}R_2^{−1/4} ≤ I

from which we deduce that (note that R_1 and R_2 commute)

C ≤ R_1^{1/2}R_2^{1/2}

proving the claim.


Now suppose it is valid for some s = µ, ν. Then we have

R_1^{1−µ}R_2^µ ≥ S_1^{1−µ}S_2^µ + T_1^{1−µ}T_2^µ,

and

R_1^{1−ν}R_2^ν ≥ S_1^{1−ν}S_2^ν + T_1^{1−ν}T_2^ν

Then, replacing R_1 by R_1^{1−µ}R_2^µ, R_2 by R_1^{1−ν}R_2^ν, S_1 by S_1^{1−µ}S_2^µ, S_2 by S_1^{1−ν}S_2^ν, T_1 by T_1^{1−µ}T_2^µ and T_2 by T_1^{1−ν}T_2^ν, and noting that these replacements satisfy all the required conditions of the theorem, gives us

R_1^{1−(µ+ν)/2}R_2^{(µ+ν)/2} ≥ S_1^{1−(µ+ν)/2}S_2^{(µ+ν)/2} + T_1^{1−(µ+ν)/2}T_2^{(µ+ν)/2},

ie the result also holds for s = (µ + ν)/2. Since it holds for s = 0, 1/2, 1 we deduce by induction that it holds for all dyadic rationals in [0, 1], ie, for all s of the form k/2^n, k = 0, 1, ..., 2^n, n = 0, 1, .... Now every real number in [0, 1] is a limit of dyadic rationals and hence, by continuity of the inequality, the theorem has been proved.
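The inequality can be illustrated numerically by building commuting pairs explicitly (each pair diagonal in a common basis) and choosing R_k as multiples of the identity large enough that S_k + T_k ≤ R_k; all sizes and values below are illustrative.

# Numerical check of Lieb's inequality R1^{1-s} R2^s >= S1^{1-s} S2^s + T1^{1-s} T2^s.
import numpy as np

rng = np.random.default_rng(8)
d, s = 5, 0.3
def commuting_pair():
    U, _ = np.linalg.qr(rng.standard_normal((d, d)))   # common eigenbasis for the pair
    return U, rng.uniform(0.1, 1.0, d), rng.uniform(0.1, 1.0, d)

US, s1, s2 = commuting_pair()
UT, t1, t2 = commuting_pair()
S1, S2 = US @ np.diag(s1) @ US.T, US @ np.diag(s2) @ US.T
T1, T2 = UT @ np.diag(t1) @ UT.T, UT @ np.diag(t2) @ UT.T
r1 = np.linalg.eigvalsh(S1 + T1)[-1] * 1.1             # R_k = r_k I with S_k + T_k <= R_k
r2 = np.linalg.eigvalsh(S2 + T2)[-1] * 1.1

def frac_pow(U, lam, a):                                # (U diag(lam) U^T)^a for commuting families
    return U @ np.diag(lam ** a) @ U.T

lhs = (r1 ** (1 - s)) * (r2 ** s) * np.eye(d)
rhs = frac_pow(US, s1, 1 - s) @ frac_pow(US, s2, s) + frac_pow(UT, t1, 1 - s) @ frac_pow(UT, t2, s)
print("min eig of lhs - rhs (should be >= 0):", np.linalg.eigvalsh(lhs - rhs).min())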

An application example:

Remark: Let T be a Hermitian matrix of size n × n. Let M be an (n − k + 1)-dimensional subspace and let λ_1 ≥ ... ≥ λ_n be the eigenvalues of T arranged in decreasing order. Then,

sup_{x∈M} ⟨x, Tx⟩ ≥ λ_k

and thus

inf_{dim N = n−k+1} sup_{x∈N} ⟨x, Tx⟩ = λ_k

Likewise, if M is a k-dimensional subspace, then

inf_{x∈M} ⟨x, Tx⟩ ≤ λ_k

and hence

sup_{dim N = k} inf_{x∈N} ⟨x, Tx⟩ = λ_k

Thus,

λ_k(AB) = λ_k(BA) ≤ λ_k(C), k = 1, 2, ..., n

where

C = R_1^{1/2}R_2^{1/2}

Let A, B be two Hermitian matrices. Choose a subspace M of dimension n − k + 1 so that

λ_k(A)↓ = sup_{x∈M} ⟨x, Ax⟩

Then,

λ_k(A − B)↓ = inf_{dim N = n−k+1} sup_{x∈N} ⟨x, (A − B)x⟩

≤ sup_{x∈M} ⟨x, (A − B)x⟩ = sup_{x∈M}(⟨x, Ax⟩ − ⟨x, Bx⟩)

≤ sup_{x∈M} ⟨x, Ax⟩ − inf_{x∈M} ⟨x, Bx⟩ = λ_k(A)↓ − inf_{x∈M} ⟨x, Bx⟩

so that

λ_k(A − B)↓ ≤ λ_k(A)↓ − inf_{x∈M} ⟨x, Bx⟩ ≤ λ_k(A)↓ − λ_n(B)↓

Now let M be any (n − k + 1)-dimensional subspace such that

inf_{x∈M} ⟨x, Bx⟩ = λ_{n−k+1}(B)↓ = λ_k(B)↑

Then,

λ_k(A − B)↓ ≤ sup_{x∈M} ⟨x, (A − B)x⟩ ≤ sup_{x∈M} ⟨x, Ax⟩ − inf_{x∈M} ⟨x, Bx⟩ ≤ λ_k(A)↓ − λ_k(B)↑

This is a stronger result than the previous one.

9.4 Matrix Holder Inequality


Now let A, B be two matrices. Consider (AB)^{⊗_a k}. Its maximum singular value squared is

‖(AB)^{⊗_a k}‖^2

which is the maximum eigenvalue of ((AB)*(AB))^{⊗_a k}. This maximum eigenvalue equals the product of the first k largest eigenvalues of (AB)*(AB), ie, the product of the squares of the first k largest singular values of AB. Thus, we can write

‖(AB)^{⊗_a k}‖ = Π_{j=1}^k s_j(AB)↓

On the other hand,

‖(AB)^{⊗_a k}‖ = ‖A^{⊗_a k}·B^{⊗_a k}‖ ≤ ‖A^{⊗_a k}‖·‖B^{⊗_a k}‖

Now, in the same way, ‖A^{⊗_a k}‖ equals Π_{j=1}^k s_j(A)↓ and likewise for B. Thus we obtain the fundamental inequality

Π_{j=1}^k s_j(AB)↓ ≤ Π_{j=1}^k s_j(A)↓ s_j(B)↓, k = 1, 2, ..., n

and hence we easily deduce that

s(AB) ≤ s(A)·s(B)

which is actually a shorthand notation for

Σ_{j=1}^k s_j(AB)↓ ≤ Σ_{j=1}^k s_j(A)↓·s_j(B)↓, k = 1, 2, ..., n

In particular, taking k = n and applying the classical Holder inequality to the resulting sum gives us, for p, q ≥ 1 and 1/p + 1/q = 1,

Tr(|AB|) = Σ_{j=1}^n s_j(AB) ≤ (Σ_j s_j(A)^p)^{1/p}·(Σ_j s_j(B)^q)^{1/q}

Now,

Tr(|A|^p) = Σ_j s_j(A)^p

and likewise for B. Therefore, we have proved the Matrix Holder inequality

Tr(|AB|) ≤ (Tr(|A|^p))^{1/p}·(Tr(|B|^q))^{1/q}

for any two matrices A, B such that AB exists and is a square matrix. In fact, the method of our proof implies immediately that if ψ(.) is any unitarily invariant matrix norm, then

ψ(AB) ≤ ψ(|A|^p)^{1/p}·ψ(|B|^q)^{1/q}



This is seen as follows. Since ψ is a unitarily invariant matrix norm,

ψ(A) = Φ(s(A))

where Φ(·) is a convex mapping from R_+^n into R_+ that satisfies the triangle inequality and in fact all the properties of a symmetric vector norm. Also, since

a^p/p + b^q/q ≥ ab, a, b > 0

we get for t > 0,

Φ(x·y) = Φ(tx·y/t) ≤ Φ(t^p x^p)/p + Φ(y^q/t^q)/q ≤ t^pΦ(x^p)/p + Φ(y^q)/(qt^q)

Now ut^p + v/t^q is minimized when

put^{p−1} − qv/t^{q+1} = 0

or equivalently, when

t^{p+q} = qv/pu

Substituting this value of t immediately gives

Φ(x·y) ≤ Φ(x^p)^{1/p}·Φ(y^q)^{1/q}

and hence, since

s(A·B) ≤ s(A)·s(B)

(in the sense of majorization), it follows that for two matrices A, B,

ψ(A·B) = Φ(s(A·B)) ≤ Φ(s(A)·s(B)) ≤ Φ(s(A)^p)^{1/p}·Φ(s(B)^q)^{1/q} = ψ(|A|^p)^{1/p}·ψ(|B|^q)^{1/q}

Note that the eigenvalues (or equivalently the singular values) of |A|^p are the same as s(A)^p.
Remark: We are assuming that if x, y are positive vectors such that x < y in the sense of majorization, then

Φ(x) ≤ Φ(y)

This result follows since if a < b are two positive real numbers, then a = tb + (1 − t)·0 for t = a/b ∈ [0, 1], and by using the convexity of Φ and the fact that Φ(0) = 0. If x < y, then Σ_{i=1}^k x_i ≤ Σ_{i=1}^k y_i, k = 1, 2, ..., n, and hence, writing ξ_k = Σ_{i=1}^k x_i, η_k = Σ_{i=1}^k y_i, k = 1, 2, ..., n, we get, by regarding Φ(x) as a function of ξ_k, k = 1, 2, ..., n, noting that convexity of Φ in x implies convexity of Φ in ξ, and using the above result, that

Φ(ξ) = Φ(ξ_1, ..., ξ_n) ≤ Φ(η_1, ξ_2, ..., ξ_n) ≤ Φ(η_1, η_2, ξ_3, ..., ξ_n) ≤ ... ≤ Φ(η_1, ..., η_n) = Φ(η)
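The trace form of the inequality is easy to test numerically via singular values; the sketch below does so for random complex matrices and a few conjugate exponent pairs (the dimension and the exponents are illustrative).

# Numerical check of Tr|AB| <= (Tr|A|^p)^{1/p} (Tr|B|^q)^{1/q} with 1/p + 1/q = 1.
import numpy as np

rng = np.random.default_rng(9)
d = 6
A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
B = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))

def schatten_sum(X, p):
    # Tr|X|^p = sum of the p-th powers of the singular values of X
    return (np.linalg.svd(X, compute_uv=False) ** p).sum()

for p in (1.5, 2.0, 3.0):
    q = p / (p - 1.0)
    lhs = schatten_sum(A @ B, 1.0)
    rhs = schatten_sum(A, p) ** (1 / p) * schatten_sum(B, q) ** (1 / q)
    print(f"p = {p}: Tr|AB| = {lhs:.4f} <= {rhs:.4f} : {lhs <= rhs + 1e-9}")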
Chapter 10

Lecture Plan for Statistical Signal Processing
[1] Revision of basic probability theory: Probability spaces, random vari­
ables, expectation, variance, covariance, characteristic function, convergence of
random variables, conditional expectation. (four lectures)
[2] Revision of basic random process theory:Kolmogorov’s existence theo­
rem, stationary stochastic processes, autocorrelation and power spectral density,
higher order moments and higher order spectra, Brownian motion and Poisson
processes, an introduction to stochastic differential equations. (four lectures)
[3] Linear estimation theory: Estimation of parameters in linear models and
sequential linear models using least squares and weighted least squares methods,
estimation of parameters in linear models with coloured Gaussian measurement
noise using the maximum likelihood method, estimation of parameters in time
series models (AR,MA,ARMA) using least squares method with forgetting fac­
tor, Estimation of parameters in time series models using the minimum mean
square method when signal correlations are known. Linear estimation theory as
a computation of an orthogonal projection of a Hilbert space onto a subspace,
the orthogonality principle and normal equations. (four lectures)
[4] Estimation of non­random parameters using the Maximum likelihood
method, estimation of random parameters using the maximum aposteriori method,
basic statistical hypothesis testing, the Cramer­Rao lower bound. (four lectures)
[5] non­linear estimation theory: Optimal nonlinear minimum mean square
estimation as a conditional expectation, applications to non­Gaussian models.
(four lectures)
[6] Finite order linear prediction theory: The Yule­Walker equations for esti­
mating the AR parameters. Fast order recursive prediction using the Levinson­
Durbin algorithm, reflection coefficients and lattice filters, stability in terms of
reflection coefficients. (four lectures)
[7] The recursive least squares algorithm for parameter estimation linear


models, simultaneous time and order recursive prediction of a time series using
the RLS­Lattice filter. (four lectures)
[8] The non­causal Wiener filter, causal Wiener filtering using Wiener’s spec­
tral factorization method and using Kolmogorov’s innovations method, Kol­
mogorov’s formula for the mean square infinite order prediction error. (four
lectures)
[9] An introduction to stochastic nonlinear filtering: The Kushner­Kallianpur
filter, the Extended Kalman filter approximation in continuous and discrete
time, the Kalman­Bucy filter for linear state variable stochastic models. (four
lectures)
[10] Power spectrum estimation using the periodogram and the windowed
periodogram, mean and variance of the periodogram estimate, high resolution
power spectrum estimation using eigensubspace methods like Pisarenko har­
monic decomposition, MUSIC and ESPRIT algorithms. Estimation of the power
spectrum using time series models. (four lectures)
[11] Some new topics:
[a] Quantum detection and estimation theory, Quantum Shannon coding
theory. (four lectures)
[b] Quantum filtering theory (four lectures)
Total: Forty eight lectures.

[1] Large deviations for the empirical distribution of a stationary Markov


process.
[2] Proof of the Gartner­Ellis theorem.
[3] Large deviations in the iid hypothesis testing problem.
[4] Large deviations for the empirical distribution of Brownian motion pro­
cess.
Let X_n, n ≥ 0 be a Markov process in discrete time. Consider

M_N(f) = E exp(Σ_{n=0}^N f(X_n))

Then, given X_0 = x,
M_N(f) = (π_f)^N(1)
where π is the transition probability operator kernel:

π(g)(x) = ∫ π(x, dy)g(y)

and
π_f(g)(x) = exp(f(x)) ∫ π(x, dy)g(y)

ie
π_f(x, dy) = exp(f(x))π(x, dy)

If we assume that the state space E is discrete, then π(x, y) is a stochastic matrix and π_f(x, y) = exp(f(x))π(x, y). Let D_f denote the diagonal matrix diag[f(x) : x ∈ E]. Then
π_f = exp(D_f)π
and
M_N(f) = (exp(D_f)π)^N(1)
The Gartner-Ellis limiting logarithmic moment generating function is then

Λ(f) = lim_{N→∞} N^{−1} . log((exp(D_f)π)^N(1)) = log(λ_m(f))

where λ_m(f) is the maximum eigenvalue of the matrix exp(D_f)π. Now let

I(µ) = sup_{u>0} Σ_x µ(x) log(u(x)/(πu)(x))

where µ is a probability distribution on E. We wish to show that

I(µ) = sup_f (Σ_x f(x)µ(x) − log(λ_m(f)))

= sup_f (< f, µ > − Λ(f))



10.1 End-Semester Exam, duration: four hours, Pattern Recognition
[1] [a] Consider N iid random vectors X_n, n = 1, 2, ..., N where X_n has the probability density Σ_{k=1}^K π(k)N(x|µ_k, Σ_k), ie, a Gaussian mixture density. Derive the optimal ML equations for estimating the parameters θ = {(π(k), µ_k, Σ_k) : k = 1, 2, ..., K} from the measurements X = {X_n : n = 1, 2, ..., N} by maximizing

ln(p(X|θ)) = Σ_{n=1}^N ln(Σ_{k=1}^K π(k)N(X_n|µ_k, Σ_k))

[b] By introducing latent random vectors z(n) = {z(n, k) : k = 1, 2, ..., K}, n = 1, 2, ..., N which are iid with P(z(n, k) = 1) = π(k), such that for each n exactly one of the z(n, k)'s equals one and the others are zero, cast the ML parameter estimation equation in the recursive EM form, namely as

θ_{m+1} = argmax_θ Q(θ, θ_m)

where
Q(θ, θ_m) = Σ_Z ln(p(X, Z|θ)) . p(Z|X, θ_m)

with Z = {z(n) : n = 1, 2, ..., N}. Specifically, show that

Q(θ, θ_m) = Σ_{n,k} γ_{nk}(X, θ_m) ln(π(k)N(X_n|µ_k, Σ_k))

where
γ_{n,k}(X, θ_m) = E(z(n, k)|X, θ_m)
= Σ_Z z(n, k)p(Z|X, θ_m) = Σ_{z(n,k)} z(n, k)p(z(n, k)|X, θ_m)

Derive a formula for p(z(n, k)|X, θ_m). Derive the optimal equations for θ_{m+1} by maximizing Q(θ, θ_m).
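A minimal EM sketch for this Gaussian mixture problem is given below (illustrative only; the responsibilities γ_{nk} and the M-step follow the standard GMM updates, and the synthetic data at the end is an assumption made for the example).

import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, K, replace=False)].copy()     # initialize means at random samples
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(iters):
        # E-step: gamma[n, k] = pi_k N(X_n|mu_k, Sigma_k) / sum_m pi_m N(X_n|mu_m, Sigma_m)
        dens = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step
        Nk = gamma.sum(axis=0)
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            Xc = X - mu[k]
            Sigma[k] = (gamma[:, k, None] * Xc).T @ Xc / Nk[k] + 1e-6 * np.eye(d)
    return pi, mu, Sigma

# usage on synthetic two-component data
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 1.0, (200, 2)), rng.normal([4, 4], 0.5, (200, 2))])
print(em_gmm(X, K=2)[1])   # estimated component means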

[2] Let X_n, n = 1, 2, ..., N be vectors in R^N. Choose K << N and vectors {µ(k) : k = 1, 2, ..., K} and a partition

P = E_1 ∪ E_2 ∪ ... ∪ E_K

of {1, 2, ..., N} into K disjoint non-empty subsets such that

E = Σ_{k=1}^K Σ_{n∈E_k} ∥X_n − µ(k)∥²

is a minimum. Interpret this ”Clustering principle” from the viewpoint of data


compression and image segmentation. Explain how you would obtain an itera­
tive solution to this clustering problem by defining

r(n, k) = 1, n ∈ E_k, r(n, k) = 0, n ∉ E_k

and expressing
E = Σ_{n,k} r(n, k)∥X_n − µ(k)∥²

Relate this clustering problem to the problem of calculating the µ(k)'s and the π(k)'s in the Gaussian mixture model. Specifically, compare the equations for calculating r(n, k), µ(k) in this clustering problem to the equations for calculating the same in the GMM. Explain how the GMM model is a "smoothened version" of the clustering model by remarking that the event X_n ∈ E_k is to be interpreted in the GMM context by saying that X_n has the N(x|µ(k), Σ_k) distribution and that Σ_{n=1}^N r(n, k)/N = card(E_k)/N corresponds to the probability π(k) of the k-th component in the GMM occurring. Derive the explicit value that r(n, k) corresponds to in the GMM by showing that the optimal π(k) in the GMM satisfies

π(k) = N^{−1} Σ_{n=1}^N [π(k)N(X_n|µ_k, Σ_k) / Σ_{m=1}^K π(m)N(X_n|µ(m), Σ_m)]

Compare this with the quantity π(k) = Σ_{n=1}^N r(n, k)/N, Σ_{n,k} r(n, k) = N in the clustering problem.
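A short Lloyd-iteration sketch for this clustering principle follows (illustrative only; the hard assignments r(n, k) and the mean updates mirror the description above, and the initialization is an arbitrary choice).

import numpy as np

def kmeans(X, K, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)].copy()   # initial cluster centres
    for _ in range(iters):
        # assignment step: r(n, k) = 1 for the nearest centre, 0 otherwise
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: mu(k) = mean of the X_n assigned to cluster k
        for k in range(K):
            if np.any(labels == k):
                mu[k] = X[labels == k].mean(axis=0)
    E = ((X - mu[labels]) ** 2).sum()                      # the clustering cost E
    return mu, labels, E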
[3] Let {Z(n) : n = 1, 2, ..., N } be a Markov chain in discrete state space
and let π(Z(n), Z(n + 1)|θ) be its one step transition probability distribution.
Let π0 (Z(0)) denote its initial distribution. Let X = (X(n) : n = 0, 1, ..., N ) be
the HMM emission random variables with the conditional probability density
p(X(n)|Z(n), θ) defined.
[a] Show that the EM algorithm for estimating θ recursively amounts to maximizing Q(θ, θ_0) where

Q(θ, θ_0) = Σ_{n=0}^N E(ln(p(X(n)|Z(n), θ))|X, θ_0) + Σ_{n=0}^{N−1} E(ln(π(Z(n), Z(n + 1)|θ))|X, θ_0)

[b] Show that for evaluating the conditional expectation in [a], we require the conditional probabilities p(Z(n)|X, θ_0) and p(Z(n), Z(n+1)|X, θ_0). Explain how you would evaluate these conditional probabilities recursively by first proving the relations
p(X|Z(n), θ_0) = p(X(N), X(N−1), ..., X(n)|Z(n), θ_0) . p(X(n−1), X(n−2), ..., X(0)|Z(n), θ_0)
and
p(X|Z(n + 1), Z(n), θ_0) = p(X(N), ..., X(n + 1)|Z(n + 1), θ_0) . p(X(n), X(n − 1), ..., X(0)|Z(n), θ_0)
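A hedged sketch of the standard forward-backward (scaled) recursions that produce the posteriors p(Z(n)|X, θ_0) and p(Z(n), Z(n+1)|X, θ_0) needed in [b] is given below; π_0, the transition matrix P and the emission likelihoods b[n, z] = p(X(n)|Z(n) = z, θ_0) are assumed given.

import numpy as np

def hmm_posteriors(pi0, P, b):
    # pi0: (S,) initial distribution, P: (S, S) transition matrix, b: (N+1, S) emission likelihoods
    N1, S = b.shape
    alpha = np.zeros((N1, S)); beta = np.zeros((N1, S)); c = np.zeros(N1)
    alpha[0] = pi0 * b[0]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for n in range(1, N1):                       # forward pass (scaled)
        alpha[n] = (alpha[n - 1] @ P) * b[n]
        c[n] = alpha[n].sum(); alpha[n] /= c[n]
    beta[-1] = 1.0
    for n in range(N1 - 2, -1, -1):              # backward pass (scaled)
        beta[n] = (P @ (b[n + 1] * beta[n + 1])) / c[n + 1]
    gamma = alpha * beta                         # gamma[n, z] = p(Z(n) = z | X)
    xi = np.zeros((N1 - 1, S, S))                # xi[n, i, j] = p(Z(n) = i, Z(n+1) = j | X)
    for n in range(N1 - 1):
        xi[n] = (alpha[n][:, None] * P * (b[n + 1] * beta[n + 1])[None, :]) / c[n + 1]
    return gamma, xi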

4 [a] How would you approximate (based on matching only first and second
order statistics) an N ­dimensional random vector X by a p << N dimensional
random vector S using the linear model
X ≈ WS + µ + V = X̂ + V, X̂ = WS + µ

where S, V are zero mean mutually uncorrelated random vectors with covari­
ances ΣS , ΣV respectively and W is an N × p matrix to be determined. To
solve this problem, you must assume that E(X) is well approximated by µ and
Cov(X) is well approximated by WΣS WT + ΣV .

[b] Now consider the maximum likelihood version of the above PCA in which
the data is modeled by an iid sequence:

Xn = WSn + µ + Vn , n = 1, 2, ..., N

where the S_n's are iid N(0, Σ_S) random vectors, the V_n's are iid N(0, Σ_v) random vectors with the two sequences being mutually independent. Calculate the equations that determine the optimal choice of W, µ that would maximize p(X_n, n = 1, 2, ..., N|W, µ). Also by assuming the S_n's to be latent random
vectors, derive the EM algorithm recursive equations for estimating W, µ from
the Xn , n = 1, 2, ..., N .

Solution:
p({X_n}_{n=1}^N, {S_n}_{n=1}^N |W, µ)
= p({X_n}_{n=1}^N |{S_n}, W, µ) p({S_n}_{n=1}^N)
= Π_{n=1}^N [f_V(X_n − WS_n − µ) f_S(S_n)]

So in the EM algorithm, the function to be maximized is

Q(θ, θ0 )

where
θ = (W, µ), θ0 = (W0 , µ0 )
and
Q(θ, θ_0) = Σ_{n=1}^N E[log(f_V(X_n − WS_n − µ))|{X_m}_{m=1}^N, W_0, µ_0]
+ Σ_{n=1}^N E[log(f_S(S_n))|{X_m}_{m=1}^N, W_0, µ_0]
= Σ_{n=1}^N E[log(f_V(X_n − WS_n − µ))|X_n, W_0, µ_0]
+ Σ_{n=1}^N E[log(f_S(S_n))|X_n, W_0, µ_0]

Note that only the first sum in this expression must be maximized since the second one does not depend on W, µ. However, if Σ_S, Σ_v also have to be estimated, then the parameter vector would be

θ = (W, µ, Σ_S, Σ_v)

and then the second term would also count. Note further that

p({S_n}_{n=1}^N |{X_n}_{n=1}^N, W, µ) = [Π_{n=1}^N f_V(X_n − WS_n − µ) f_S(S_n)] / Π_{n=1}^N f_X(X_n|W, µ)

with f_X being the N(µ, WΣ_S W^T + Σ_v) density.

[c] If Σ_V = σ²I, then derive the EM recursions for estimating W, µ, Σ_S, σ² and compare with the exact maximum likelihood equations. Explain why this process is called principal component analysis and explain how much data compression we have achieved by this process.

[d] Consider now the following data matrix version of PCA. X is a given
N × M real matrix. Show that it has a singular value decomposition (SVD)

X = UΣV

where U and V are respectively N × N and M × M real orthogonal matrices


and Σ is a diagonal N × M matrix with positive diagonal entries, either N or M in number, depending on whether N or M is smaller. Let r < min(N, M) and then using the SVD, construct an N × M matrix Y of rank r so that ∥X − Y∥_F² is a minimum, where ∥Z∥_F = (Tr(Z^T Z))^{1/2} is the Frobenius norm of the real matrix Z.
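A small numerical sketch of the best rank-r approximation in [d] follows: keeping the r largest singular values from the SVD minimizes the Frobenius error (Eckart-Young); numpy and the random test matrix are illustrative assumptions.

import numpy as np

def best_rank_r(X, r):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]          # Y = U_r diag(s_r) V_r^T

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 6))
Y = best_rank_r(X, 2)
s = np.linalg.svd(X, compute_uv=False)
# the minimum Frobenius error equals the energy in the discarded singular values
print(np.linalg.norm(X - Y, 'fro') ** 2, np.sum(s[2:] ** 2))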
[5] Consider the binary hypothesis testing problem in which under the hy­
pothesis H1 , each sample of the iid sequence X(n), n ≥ 1 has the probability
density p1 (X) while under H0 , each sample of this iid sequence has the prob­
ability density p0 (X). Show that the Neyman­Pearson criterion of decision
based on the observations X(n), n = 1, 2, ..., N leads to the test of deciding H_1 in the case when N^{−1} Σ_{n=1}^N L(X(n)) ≥ η and deciding H_0 in the case when N^{−1} Σ_{n=1}^N L(X(n)) < η, where

L(X) = ln(p_1(X)/p_0(X))

and the threshold η(N) is such that

0 < α = ∫_{Z_1} Π_{n=1}^N (p_0(X(n))dX(n)) = P(N^{−1} Σ_{n=1}^N L(X(n)) > η | H_0)

Here, Z1 is the region where H1 is decided. Estimate this probability by the


LDP and calculate using the LDP the value of η for given α. Hence use this threshold η to calculate the other error probability

β(N) = ∫_{Z_1^c} Π_{n=1}^N (p_1(X(n))dX(n)) = P(N^{−1} Σ_{n=1}^N L(X(n)) < η | H_1)

and show that for any α > 0, however small, if η = η(N) is obtained as above for the given α, then β(N) converges to zero at the rate D(p_0|p_1), ie,

N^{−1} log(β(N)) → −D(p_0|p_1), N → ∞



where
D(p_0|p_1) = ∫ p_0(X) . log(p_0(X)/p_1(X)) . dX
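The Chernoff-Stein rate in [5] can be illustrated in closed form for a simple Gaussian pair. In the sketch below (not from the text; p_0 = N(0,1) and p_1 = N(1,1) are illustrative choices), L(X) = X − 1/2, D(p_0|p_1) = 1/2, and β(N) is a Gaussian tail probability, so −N^{−1} log β(N) can be compared directly with D(p_0|p_1).

import numpy as np
from scipy.stats import norm

D01 = 0.5                                  # D(p0|p1) for these two densities
alpha = 0.05
for N in [10, 100, 1000, 10000]:
    # under H0, the sample mean of L is N(-1/2, 1/N); pick eta so that P(mean L > eta | H0) = alpha
    eta = -0.5 + norm.ppf(1 - alpha) / np.sqrt(N)
    # under H1, the sample mean of L is N(+1/2, 1/N)
    log_beta = norm.logcdf((eta - 0.5) * np.sqrt(N))
    print(N, -log_beta / N, "-> compare with D(p0|p1) =", D01)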

[6] This problem deals with non­parametric density estimation. Let p1 (X), p0 (X)
be two different probability densities of a random vector X ∈ Rn . To estimate
p1 at a given point X, we take samples from this distribution and draw a sphere
of minimum radius R1 with centre at X, for which the first sample point falls
within this sphere. Let V (R1 ) denote the volume of such a sphere. Then, esti­
mate p1 (X) as 1/V (R1 ). Likewise do so for estimating p0 (X). Let the minimum
radius for this be R0 . Justify using the law of large numbers that such a non­
parametric density estimate is reliable. Show that given a sample point X, in
order to test whether X came from p1 or from p0 , we use the ML criterion,
namely decide p1 if p̂1 (X) ≥ p̂0 (X) and decide p0 otherwise, where p̂ denotes
the estimate of p obtained using the above scheme. Show that this MLE method
is the same as the minimum distance method, in the sense that we decide p_1 if R_1(X) < R_0(X) and decide p_0 otherwise, where R_1(X) is the distance from X of the first sample point drawn from p_1 and likewise R_0(X) is the distance from X of the first sample point drawn from p_0.

[7] Let a group G act on a set M and let f0 , f1 : M → R be two image fields
defined on the set M such that

f1 (x) = f0 (g0−1 x) + w(x), x ∈ M

where g0 ∈ G and w is a zero mean Gaussian noise field. Choose a point x0 ∈ M


and replace the above model by

f˜1 (g) = f˜0 (g0−1 g) + w(g), g ∈ G

where
f˜1 (g) = f1 (gx0 ), f˜0 (g) = f0 (gx0 )
Let π be a unitary representation of G in the vector space C^p. Define the Fourier transform of a function F on G evaluated at π by

F(F)(π) = ∫_G F(g)π(g)dg

where dg is the left invariant Haar measure on G. Show that the above image
model on G gives on taking Fourier transforms

F(f˜1 )(π) = π(g0 )F(f˜0 )(π) + F(w)(π)



Let H be the subgroup of G consisting of all h ∈ G for which hx_0 = x_0. Show that f̃_k(gh) = f̃_k(g), k = 0, 1, g ∈ G, h ∈ H. Hence, deduce that if

P_H = ∫_H π(h)dh

where dh denotes the Haar measure on H (normalized to be a probability measure on H), then P_H is the orthogonal projection onto the set of all vectors v ∈ C^p for which π(h)v = v ∀ h ∈ H, provided that H is unimodular, ie, its left invariant Haar measure coincides with its right invariant Haar measure. Deduce further that

F(f̃_k)(π) = F(f̃_k)(π)P_H, k = 0, 1

Hence show that the above model for the image Fourier transforms is equivalent
to
F(f˜1 )(π)PH = π(g0 )F(f˜0 )(π)PH + F(w)(π)PH
From this model, explain how by varying the representation π, we can estimate
the transformation g0 by minimizing

E(g_0) = Σ_k W(k) ∥F(f̃_1)(π_k)P_H − π_k(g_0)F(f̃_0)(π_k)P_H∥²

where � . � is the Frobenius norm and W (k), k ≥ 1 are positive weights. By


starting with an initial guess estimate g00 for g0 and writing g0 = (1 + δX).g00
where δX is small and belongs to the Lie algebra of G, derive a set of linear
equations for δX and solve them. For this, you can use the approximate identity

π((1 + δX)g00 ) = π(exp(δX)g00 ) = exp(dπ(δX))π(g00 ) = (1 + dπ(δX))π(g00 )

where dπ is the differential of π, ie, it is the representation of the Lie algebra


of G corresponding to the representation π of G. Note that dπ is a linear map
and hence if {L_1, ..., L_N} denotes a basis for the Lie algebra of G, then

δX = Σ_{k=1}^N x(k)L_k, dπ(δX) = Σ_{k=1}^N x(k)dπ(L_k)

The problem of estimating δX is then equivalent to estimating x(k), k = 1, 2, ..., N .


Also explain how you would approximately evaluate the mean and covariance
of the random variables x(k), k = 1, 2, ..., N in terms of the noise correlations.

Estimating the EEG parameters from the speech parameters using the EKF.
The EEG model is

XE (t + 1) = A(θ)XE (t) + WE (t + 1) ∈ RN

where
A(θ) = A_0 + Σ_{k=1}^p θ(k)A_k,
θ = (θ(k) : k = 1, 2, ..., p)^T ∈ R^p


is the EEG parameter vector to be estimated. The speech model on the other
hand is
X_S(t) = Σ_{k=1}^m r_s(k)X_S(t − k) + r_s(m + 1) + W_S(t) ∈ R

The speech parameter vector is

r_s = (r_s(k) : k = 1, 2, ..., m + 1)^T ∈ R^{m+1}

We assume that the speech and EEG parameters are correlated, ie, there exists
an affine relationship connecting the two of the form

rs = Cθ + d

The regression matrices C, d are assumed to be fixed, ie, they do not vary
from patient to patient. These matrices can be estimated using the following procedure: Choose a bunch of K patients numbered 1, 2, ..., K having various kinds of brain diseases. For each patient, collect his EEG and speech data and
apply the EKF separately to these data using the above models and noisy
measurement models of the form

ZE (t) = XE (t) + VE (t), ZS (t) = XS (t) + VS (t)

Denote the parameter estimates for the k-th patient by θ̂_k, r̂_{sk} for k = 1, 2, ..., K.
Then the regression matrices C, d are estimated by minimizing

E(C, d) = Σ_{k=1}^K ∥r̂_{sk} − Cθ̂_k − d∥²

For convenience of notation, we write

θ_k = θ̂_k, r_{sk} = r̂_{sk}

and then the optimal equations for the regression matrices obtained by setting the gradient of E(C, d) to zero are given by

Σ_{k=1}^K (r_{sk} − Cθ_k − d)θ_k^T = 0,

Σ_{k=1}^K (r_{sk} − Cθ_k − d) = 0

which can be expressed as

CR_θ + dµ_θ^T = R_{sθ},

Cµ_θ + d = µ_s

where

R_θ = K^{−1} Σ_{k=1}^K θ_k θ_k^T ∈ R^{p×p}

µ_θ = K^{−1} Σ_{k=1}^K θ_k ∈ R^{p×1}

R_{sθ} = K^{−1} Σ_{k=1}^K r_{sk} θ_k^T ∈ R^{(m+1)×p},

µ_s = K^{−1} Σ_{k=1}^K r_{sk} ∈ R^{(m+1)×1}

Eliminating d from these equations gives

CR_θ + (µ_s − Cµ_θ)µ_θ^T = R_{sθ}

giving

C = (R_{sθ} − µ_s µ_θ^T)(R_θ − µ_θ µ_θ^T)^{−1}

and

d = µ_s − Cµ_θ
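A brief numerical sketch of this regression step is given below: given per-patient estimates θ_k (p-vectors) and r_{sk} ((m+1)-vectors), the sample moments and the closed-form expressions above recover C and d. The synthetic data and the array layout (rows are θ_k^T and r_{sk}^T) are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)
K, p, m = 50, 3, 4
C_true = rng.standard_normal((m + 1, p))
d_true = rng.standard_normal(m + 1)
Theta = rng.standard_normal((K, p))                       # rows are theta_k^T
Rs = Theta @ C_true.T + d_true + 0.01 * rng.standard_normal((K, m + 1))

# sample moments as in the text
R_theta = Theta.T @ Theta / K
mu_theta = Theta.mean(axis=0)
R_stheta = Rs.T @ Theta / K                               # (m+1) x p
mu_s = Rs.mean(axis=0)

C_hat = (R_stheta - np.outer(mu_s, mu_theta)) @ np.linalg.inv(R_theta - np.outer(mu_theta, mu_theta))
d_hat = mu_s - C_hat @ mu_theta
print(np.max(np.abs(C_hat - C_true)), np.max(np.abs(d_hat - d_true)))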
This solves the problem of determining the regression coefficient matrices. Now
given a fresh patient, we wish to estimate his EEG parameters from only his
speech data. To this end, we substitute into the speech model the expression
for the speech parameters in terms of the EEG parameters using our regres­
sion scheme: First transform the speech model into a state variable model by
introducing state variables:

ξ1 (t) = XS (t − 1), ..., ξm (t) = XS (t − m)

and
ξ(t) = [ξ1 (t), ..., ξm (t)]T
The speech model then becomes in state variable form

ξ(t + 1) = B(r_s)ξ(t) + u . r_{s,m+1} + bW_s(t)

where B(r_s) is the m × m companion matrix

B(r_s) =
[ r_{s1}  r_{s2}  r_{s3}  ...  r_{sm} ]
[   1       0       0    ...    0     ]
[   .       .       .    ...    .     ]
[   0       0      ...     1    0     ]

We can write

B(r_s) = Σ_{k=1}^m r_{sk} B_k

and hence our speech model becomes

ξ(t + 1) = (Σ_{k=1}^m r_{sk} B_k)ξ(t) + u . r_{s,m+1} + bW_s(t)

Now
r_{sk} = (Cθ)_k + d_k, k = 1, 2, ..., m + 1
Writing

F_1(ξ, θ) = Σ_{k=1}^m r_{sk} B_k ξ + u . r_{s,m+1}
= Σ_{k=1}^m ((Cθ)_k + d_k)B_k ξ + u((Cθ)_{m+1} + d_{m+1})

we find that

∂F_1/∂ξ = Σ_{k=1}^m ((Cθ)_k + d_k)B_k

∂F_1/∂θ = [Σ_{k=1}^m C_{k1}B_k ξ + C_{m+1,1}u, Σ_{k=1}^m C_{k2}B_k ξ + C_{m+1,2}u, ..., Σ_{k=1}^m C_{kp}B_k ξ + C_{m+1,p}u]

10.2 Quantum Stein’s theorem


We wish to discriminate between two states ρ, σ using a sequence of iid mea­
surements, ie, measurements taken when the states become ρ⊗n and σ ⊗n , n =
1, 2, .... Let 0 ≤ Tn ≤ 1 be the detection operators. If the state is ρ⊗n , then the
probability of a wrong decision is P1 (n) = T r(ρ⊗n (1 − Tn )) while if the state is
σ ⊗n , the probability of a wrong decision is P2 (n) = T r(σ ⊗n Tn ). The theorem
that we prove is the following: If the measurement operators {Tn } are such that
P2 (n) → 0, then P1 (n) cannot go to zero at a rate faster than exp(−nD(σ|ρ)),
ie,
liminf_{n→∞} n^{−1} . log(P_1(n)) ≥ −D(σ|ρ)
and conversely, for any δ > 0, there exists a sequence {T_n} of detection operators such that P_2(n) → 0 and simultaneously

limsup_n n^{−1} . log(P_1(n)) ≤ −D(σ|ρ) + δ

To prove this, we first use the monotonicity of the Renyi entropy to deduce that for any detection operator T_n,

(T r(ρ^{⊗n}(1 − T_n)))^s (T r(σ^{⊗n}(1 − T_n)))^{1−s} ≤ T r((ρ^s)^{⊗n}(σ^{1−s})^{⊗n}) = (T r(ρ^s σ^{1−s}))^n
for any s ≤ 0. Therefore, if

T r(σ ⊗n Tn ) → 0

then
T r(σ ⊗n (1 − Tn )) → 1
and we get from the above on taking logarithms,

limsup_n n^{−1} . log(T r(ρ^{⊗n}(1 − T_n))) ≥ log(T r(ρ^s σ^{1−s}))/s

or equivalently, writing

α = limsup_n n^{−1} . log(P_1(n))

we get
α ≥ s^{−1} log(T r(ρ^s σ^{1−s})) ∀ s ≤ 0
Taking the limit s ↑ 0 in this expression gives us
α ≥ −D(σ|ρ) = −T r(σ(log(σ) − log(ρ)))
This proves the first part. For the second part, we choose Tn in accordance with
the optimal Neyman­Pearson test. Specifically, choose

Tn = {exp(nR)ρ⊗n ≥ σ ⊗n }

where the real number R will be selected appropriately.
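The quantum relative entropy that governs this error exponent is easy to evaluate numerically. The following small Python sketch (not from the text; the random density matrices are illustrative) computes D(σ|ρ) = T r(σ(log σ − log ρ)).

import numpy as np
from scipy.linalg import logm

def random_density(d, rng):
    A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def quantum_relative_entropy(sigma, rho):
    # assumes the support of sigma is contained in the support of rho; logm is the matrix logarithm
    return np.trace(sigma @ (logm(sigma) - logm(rho))).real

rng = np.random.default_rng(0)
rho, sigma = random_density(4, rng), random_density(4, rng)
print(quantum_relative_entropy(sigma, rho))   # nonnegative, zero iff sigma equals rho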



10.3 Question paper, short test, Statistical signal processing, M.Tech

[1] Consider the AR(p) process

x[n] = − Σ_{k=1}^p a[k]x[n − k] + w[n]

where w[n] is white noise with autocorrelation

E(w[n]w[n − k]) = σ_w² δ[k]

Assume that

A(z) = 1 + Σ_{k=1}^p a[k]z^{−k} = Π_{m=1}^r (1 − z_m z^{−1}) Π_{m=r+1}^{r+l} (1 − z_m z^{−1})(1 − z̄_m z^{−1})

where p = r + 2l and z1 , ..., zr+l are all within the unit circle with z1 , ..., zr being
real and zr+1 , ..., zr+l being complex with non­zero imaginary parts. Assume
that the measured process is

y[n] = x[n] + v[n]

where v[.] is white, independent of w[.] and with autocorrelation σ_v² δ[n]. Assume that

σ_w² + σ_v² A(z)A(z^{−1}) = B(z)B(z^{−1})

with
B(z) = K Π_{m=1}^p (1 − p_m z^{−1})
where p_1, ..., p_p are complex numbers within the unit circle. Find the value of K in terms of σ_w², σ_v², a[1], ..., a[p]. Determine the transfer function of the optimal
non­causal and causal Wiener filters for estimating the process x[.] based on y[.]
and cast the causal Wiener filter in time recursive form. Finally, determine the
mean square estimation errors for both the filters.

[2] Consider the state model

X[n + 1] = AX[n] + bw[n + 1]

and an associated measurement model

z[n] = Cx[n] + v[n]

Given an AR(2)process

x[n] = −a[1]x[n − 1] − a[2]x[n − 2] + w[n]



and a measurement model

z[n] = x[n] + v[n]

with w[.] and v[.] being white and mutually uncorrelated processes with variances of σ_w² and σ_v² respectively. Cast this time series model in the above state variable form with A being a 2×2 matrix, X[n] a 2×1 state vector, b a 2×1 vector and C a 1×2 matrix. Write down explicitly the Kalman filter equations for this model and prove that in the steady state, ie, as n → ∞, the Kalman filter converges to the optimal causal Wiener filter. For proving this, you may assume an appropriate factorization of σ_v² A(z)A(z^{−1}) + σ_w² into a minimum phase and a non-minimum phase part, where A(z) = 1 + a[1]z^{−1} + a[2]z^{−2} is assumed to be minimum phase. Note that the steady state Kalman filter is obtained by setting the state estimation error conditional covariance matrices for prediction and filtering, namely P[n + 1|n], P[n|n], as well as the Kalman gain K[n], equal to constant matrices and thus express the Kalman filter in steady state form as

X̂[n + 1|n + 1] = DX̂[n|n] + f z[n + 1]

where D, f are functions of the steady state values of P[n|n] and P[n + 1|n]
which satisfy the steady state Riccati equation.
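A hedged Python sketch of the Kalman filter for this AR(2)-plus-noise model follows; the numerical values of a[1], a[2], σ_w², σ_v² are illustrative assumptions, and the state is X[n] = (x[n], x[n−1])^T.

import numpy as np

a1, a2, sw2, sv2 = -0.5, 0.2, 1.0, 0.5
A = np.array([[-a1, -a2],
              [1.0, 0.0]])
b = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
Q, R = sw2 * (b @ b.T), sv2

rng = np.random.default_rng(0)
N = 500
X = np.zeros((2, 1)); xs, zs = [], []
for n in range(N):                                   # simulate the state and measurement models
    X = A @ X + b * rng.normal(0, np.sqrt(sw2))
    xs.append(X[0, 0]); zs.append(X[0, 0] + rng.normal(0, np.sqrt(sv2)))

Xf = np.zeros((2, 1)); P = np.eye(2); est = []
for z in zs:                                         # Kalman recursion
    Xp = A @ Xf; Pp = A @ P @ A.T + Q                # prediction step
    K = Pp @ C.T / (C @ Pp @ C.T + R)                # Kalman gain
    Xf = Xp + K * (z - C @ Xp)                       # filtering update
    P = (np.eye(2) - K @ C) @ Pp
    est.append(Xf[0, 0])
print("steady-state gain:", K.ravel(), " mse:", np.mean((np.array(xs) - np.array(est)) ** 2))

As n grows, the gain K and the covariances P[n+1|n], P[n|n] settle to constants, which is the steady-state form referred to above.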

[3] Let Xk , k = 1, 2, ..., N and θ be random vectors defined on the same


probability space with the conditional distribution of X1 , ..., XN given θ being
that of iid N (θ, R1 ) random vectors. Assume further that the distribution of
θ is N (µ, R2 ). Calculate (a) p(X1 , ..., XN |θ),(b) p(θ|X1 , ..., XN ), (3) the ML
estimate of θ given X1 , ..., XN and (4) the MAP estimate of θ given X1 , ..., XN .
Also evaluate
Cov(θ̂_ML − θ), Cov(θ̂_MAP − θ)

Which one has a larger trace ? Explain why this must be so.

[4] Let X1 , ..., XN be iid N (µ, R). Evaluate the Fisher information matrix
elements
−E[∂²/∂µ_a∂µ_b ln(p(X_1, ..., X_N|µ, R))]

−E[∂²/∂µ_a∂R_bc ln(p(X_1, ..., X_N|µ, R))]

−E[∂²/∂R_ab∂R_cd ln(p(X_1, ..., X_N|µ, R))]

Taking the special case when the X_k's are 2-dimensional random vectors, evaluate the Cramer-Rao lower bound for Var(µ̂_a), a = 1, 2 and Var(R̂_ab), a, b = 1, 2, where µ̂ and R̂ are respectively unbiased estimates of µ and R.

[5] Let ρ(θ) denote a mixed quantum state in CN dependent upon a classical
parameter vector θ = (θ_1, ..., θ_p)^T. Let X, Y be two observables with eigendecompositions

X = Σ_{a=1}^N |e_a> x(a) <e_a|, Y = Σ_{a=1}^N |f_a> y(a) <f_a|, <e_a|e_b> = <f_a|f_b> = δ_ab

First measure X and note its outcome x(a). Then allowing for state collapse, allow the state of the system to evolve from this collapsed state ρ_c to the new state

T(ρ_c) = Σ_{k=1}^m E_k ρ_c E_k^*, Σ_{k=1}^m E_k^* E_k = I

Then measure Y in this evolved state and note the outcome y(b). Calculate the
joint probability distribution P (x(a), y(b)|θ). Explain by linearizing about some
θ0 , how you will calculate the maximum likelihood estimate of θ as a function
of the measurements x(a), y(b). Also explain how you will calculate the CRLB
for θ. Finally, explain how you will select the observables X, Y to be measured
in such a way that the CRLB is minimum when θ is a scalar parameter.

[6] For calculating time recursive updates of parameters in a linear AR model,


we require to apply the matrix inversion lemma to a matrix of the form

(λ.R + bbT )−1

while for calculating order recursive updates, we require to use the projection
operator update formula in the form

P_{span(W,ξ)} = P_W + P_{P_W^⊥ ξ}

and also invert block structured matrices of the form

[ A    b ]
[ b^T  c ]

and

[ c    b^T ]
[ b    A   ]

in terms of A^{−1}. Explain these statements in the light of the recursive least
squares lattice algorithm for simultaneous time and order recursive updates of
the prediction filter parameters.
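A hedged sketch of the time-recursive part of [6] is given below: the matrix inversion lemma turns (λR + bb^T)^{−1} into a rank-one update of R^{−1}, which is the heart of the RLS update for an AR(p) predictor. The sign convention (x[n] ≈ a^T[x[n−1], ..., x[n−p]]), the forgetting factor and the simulated data are illustrative assumptions.

import numpy as np

def rls_ar(x, p, lam=0.99, delta=1e3):
    P = delta * np.eye(p)              # P plays the role of the inverse data correlation matrix
    a = np.zeros(p)                    # predictor coefficients
    for n in range(p, len(x)):
        phi = x[n - p:n][::-1]                         # regressor [x[n-1], ..., x[n-p]]
        k = P @ phi / (lam + phi @ P @ phi)            # gain from the matrix inversion lemma
        e = x[n] - a @ phi                             # a priori prediction error
        a = a + k * e
        P = (P - np.outer(k, phi @ P)) / lam           # rank-one update of the inverse
    return a

# usage: recover AR(2) coefficients from simulated data
rng = np.random.default_rng(0)
N, a_true = 5000, np.array([0.5, -0.2])
x = np.zeros(N)
for n in range(2, N):
    x[n] = a_true @ x[n - 2:n][::-1] + rng.normal()
print(rls_ar(x, 2))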

10.4 Question paper on statistical signal processing, long test
10.4.1 Atom excited by a quantum field
[1] Let the state at time t = 0 of an atom be ρA (0) and that of the field be the
coherent state |φ(u) >< φ(u)| so that the state of both the atom and the field
at time t = 0 is
ρ(0) = ρA (0) ⊗ |φ(u) >< φ(u)|
Let the Hamiltonian of the atom and field in interaction be

H(t) = H_A + Σ_k ω(k)a(k)*a(k) + δ.V(t)

where H_A acts in the atom Hilbert space, H_F = Σ_k ω(k)a(k)*a(k) acts in the field Boson Fock space and the interaction Hamiltonian V(t) acts in the tensor product of both the spaces. Assume that V(t) has the form

V(t) = Σ_k (X_k(t) ⊗ a_k + X_k(t)* ⊗ a(k)*)

where Xk (t) acts in the atom Hilbert space. Note that writing

exp(itHA )Xk (t)exp(−itHA ) = Yk (t),

exp(itHF )ak .exp(−itHF ) = ak .exp(−iω(k)t) = ak (t)


we have, on defining the unperturbed (ie, in the absence of interaction) system plus field Hamiltonian

H_0 = H_A + H_F = H_A + Σ_k ω(k)a(k)*a(k)

that

W(t) = exp(itH_0)V(t).exp(−itH_0) = Σ_k (Y_k(t) ⊗ a_k(t) + Y_k(t)* ⊗ a_k(t)*)

show that upto O(δ 2 ) terms, the total system ⊗ field evolution operator U (t)
can be expressed as
U (t) = U0 (t).S(t)
where
U0 (t) = exp(−itH0 ) = exp(−itHA ).exp(−itHF )
is the unperturbed evolution operator and
S(t) = 1 − iδ ∫_0^t W(s)ds − δ² ∫_{0<s_2<s_1<t} W(s_1)W(s_2)ds_1 ds_2

= 1 − iδS1 (t) − δ 2 S2 (t)


Show that with this approximation, the state of the system and bath at time T
is given by

ρ(T ) = U0 (T )(1 − iδS1 (T ) − δ 2 S2 (T ))ρ(0).(1 + iδS1 (T ) − δ 2 S2 (T )∗ )U0 (T )∗

Evaluate using this expression, the atomic state at time T , as

ρA (T ) = T r2 ρ(T )

Use the fact that ρ(0) = ρA (0) ⊗ |φ(u) >< φ(u)| and that

T r(a(k)|φ(u) >< φ(u)|) =< φ(u)|a(k)|φ(u) >= u(k),

T r(a(k)∗ |φ(u) >< φ(u)|) =< φ(u)|a(k)∗ |φ(u) >= ū(k),


T r(a(k)a(m)|φ(u) >< φ(u)|) =< φ(u)|a(k)a(m)|φ(u) >= u(k)u(m),
T r(a(k)a(m)∗ |φ(u) >< φ(u)|) =< φ(u)|a(k)a(m)∗ |φ(u) >
=< φ(u)|[a(k), a(m)∗ ]|φ(u) > + < φ(u)|a(m)∗ a(k)|φ(u) >
= δ(k, m) + u(k)ū(m)
and
T r(a(k)∗ a(m)|φ(u) >< φ(u)|) =< φ(u)|a(k)∗ a(m)|φ(u) >=
= ū(k)u(m)

Remark: More generally, we can expand any general initial state of the system (atom) and bath (field) as

ρ(0) = ∫ F_A(u, ū) ⊗ |φ(u) >< φ(u)|du dū

= ∫ F_A(u, ū)exp(−|u|²) ⊗ |e(u) >< e(u)|du dū

In this case also, we can calculate assuming that ρ(0) is the initial state, final
state at time T using the formula

ρ(T ) = U0 (T )(1 − iδS1 (T ) − δ 2 S2 (T ))ρ(0).(1 + iδS1 (T ) − δ 2 S2 (T )∗ )U0 (T )∗

by evaluating this using the fact that if Y_1, Y_2 are system operators, then

(Y_1 ⊗ a_k)(∫ F(u, ū) ⊗ |φ(u) >< φ(u)|du dū)(Y_2 ⊗ a_m)

= ∫ Y_1 F(u, ū)Y_2 exp(−|u|²) ⊗ a_k|e(u) >< e(u)|a_m du dū --- (1)

Now
ak |e(u) >= u(k)|e(u) >
a∗k |e(u) >= ∂|e(u) > /∂u(k)
so (1) evaluates to, after using integration by parts,

∫ Y_1 F(u, ū)Y_2 u(k)exp(−|u|²) ⊗ (∂/∂ū(m))(|e(u) >< e(u)|)du dū

= ∫ (−∂/∂ū(m))(Y_1 F(u, ū)Y_2 u(k)exp(−|u|²)) ⊗ |e(u) >< e(u)|du dū

= ∫ (−∂/∂ū(m) + u(m))(Y_1 F(u, ū)Y_2 u(k)) ⊗ |φ(u) >< φ(u)|du dū

Likewise,

(Y_1 ⊗ a_k*)(∫ F(u, ū) ⊗ |φ(u) >< φ(u)|du dū)(Y_2 ⊗ a_m)

= ∫ Y_1 F(u, ū)Y_2 ⊗ a_k*|φ(u) >< φ(u)|a_m du dū

= ∫ Y_1 F(u, ū)Y_2 exp(−|u|²) ⊗ (∂/∂u(k)).(∂/∂ū(m))(|e(u) >< e(u)|)du dū

= ∫ (∂²/∂u(k)∂ū(m))(Y_1 F(u, ū)Y_2 . exp(−|u|²)) ⊗ |e(u) >< e(u)|du dū

= ∫ (exp(|u|²)(∂²/∂u(k)∂ū(m))(Y_1 F(u, ū)Y_2 . exp(−|u|²))) ⊗ |φ(u) >< φ(u)|du dū

= ∫ (∂/∂u(k) + ū(k))(∂/∂ū(m) − u(m))(Y_1 F(u, ū)Y_2) ⊗ |φ(u) >< φ(u)|du dū

and likewise terms like

(Y_1 ⊗ a_k*)(∫ F(u, ū) ⊗ |φ(u) >< φ(u)|du dū)(Y_2 ⊗ a_m*)

and
(Y_1 ⊗ a_k)(∫ F(u, ū) ⊗ |φ(u) >< φ(u)|du dū)(Y_2 ⊗ a_m*)

We leave it as an exercise to the reader to evaluate these terms and hence derive
upto the second order Dyson series approximation for the evolution operator
the approximate final state of the atom and field and then by partial tracing
over the field variables, to derive the approximate final atomic state under this
interaction of atom and field (ie, system and bath).

[2] If the state of the system is given by ρ = ρ0 + δ.ρ1 , then evaluate the
Von­Neumann entropy of ρ as a power series in δ.
Hint:

10.4.2 Estimating the power spectral density of a stationary Gaussian process using an AR model: Performance analysis of the algorithm
The process x[n] has mean zero and autocorrelation R(τ). Its p-th order prediction error process is given by

e[n] = x[n] + Σ_{k=1}^p a[k]x[n − k]

where the a[k]'s are chosen to minimize E e[n]². The optimal normal equations are
E(e[n]x[n − m]) = 0, m = 1, 2, ..., p
and these yield the Yule-Walker equations

Σ_{k=1}^p R[m − k]a[k] = −R[m], m = 1, 2, ..., p

or equivalently, in matrix notation,

a = −R_p^{−1} r_p

where
R_p = ((R[m − k]))_{1≤m,k≤p}, r_p = ((R[m]))_{m=1}^p
The minimum mean square prediction error is then

σ² = E(e[n]²) = E(e[n]x[n]) = R[0] + Σ_{k=1}^p a[k]R[k] = R[0] + a^T r_p = R[0] − r_p^T R_p^{−1} r_p

If the order of the AR model p is large, then e[n] is nearly a white process, in
fact in the limit p → ∞, e[n] becomes white, ie, E(e[n]x[n − m]) becomes zero
for all m ≥ 1 and hence so does E(e[n]e[n − m]). The ideal known statistics AR
spectral estimate is given by
S_{AR,p}(ω) = S(ω) = σ²/|A(ω)|²,

A(ω) = 1 + a^T e(ω),
e(ω) = ((exp(−jωm)))_{m=1}^p
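A brief numerical sketch of this AR spectral estimate follows: the Yule-Walker equations a = −R_p^{−1}r_p are solved from sample autocorrelations and the spectrum σ²/|A(ω)|² is evaluated. The biased autocorrelation estimator and the simulated AR(2) data are illustrative assumptions.

import numpy as np

def ar_spectrum(x, p, n_freq=512):
    N = len(x)
    R = np.array([np.dot(x[:N - m], x[m:]) / N for m in range(p + 1)])     # R[0..p]
    Rp = np.array([[R[abs(m - k)] for k in range(p)] for m in range(p)])   # Toeplitz R_p
    rp = R[1:p + 1]
    a = -np.linalg.solve(Rp, rp)                                           # Yule-Walker solution
    sigma2 = R[0] + a @ rp                                                 # prediction error power
    w = np.linspace(0, np.pi, n_freq)
    A = 1 + np.exp(-1j * np.outer(w, np.arange(1, p + 1))) @ a             # A(w) = 1 + a^T e(w)
    return w, sigma2 / np.abs(A) ** 2

# usage: AR(2) process x[n] = 0.6 x[n-1] - 0.3 x[n-2] + w[n]
rng = np.random.default_rng(0)
x = np.zeros(4096)
for n in range(2, len(x)):
    x[n] = 0.6 * x[n - 1] - 0.3 * x[n - 2] + rng.normal()
w, S = ar_spectrum(x, p=2)
print(S[:5])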
Since in practice, we do not have the exact correlations available with us, we estimate the AR parameters using the least squares method, ie, by minimizing

Σ_{n=1}^N (x[n] + a^T x_n)²

where
x_n = ((x(n − m)))_{m=1}^p
The finite data estimate of R_p is

R̂_p = N^{−1} Σ_{n=1}^N x_n x_n^T

and that of r_p is

r̂_p = N^{−1} Σ_{n=1}^N x[n]x_n

We write R for Rp and R̂ = R+δR for R̂p . Likewise, we write r for rp and r +δr
for r̂p for notational convenience. Note that R̂ and r̂ are unbiased estimates of R
and r respectively and hence δR and δr have zero mean. Now the least squares
estimate â = a + δa of a is given by

â = a + δa = −R̂−1 r̂ = −(R + δR)−1 (r + δr)

= −(R−1 − R−1 δRR−1 + R−1 δR.R−1 δR.R−1 )(r + δr)


= a − R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a
upto quadratic orders in the data correlation estimate errors. Thus, we can
write

δa = −R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a

Likewise, the estimate of the prediction error energy σ 2 is given by

σ̂ 2 = σ 2 + δσ 2 = R̂[0] + âT r̂

and hence
δσ² = δR[0] + a^T δr + δa^T r + δa^T δr
= δR[0] + a^T δr + (−R^{−1}δr − R^{−1}δR a + R^{−1}δR.R^{−1}δr + R^{−1}δR.R^{−1}δR.a)^T r
+ (−R^{−1}δr − R^{−1}δR a)^T δr
again upto quadratic orders in the data correlation estimate errors. In terms
of the fourth moments of the process x[n], we can now evaluate the bias and
mean square fluctuations of â and σ̂ 2 w.r.t their known statistics values a, σ 2
respectively. First we observe that the bias of â is

E(δa) = E(−R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a)

= E(R−1 δR.R−1 δr) + E(R−1 δR.R−1 δR.a)


Specifically, for the ith component of this, we find that with P = R−1 , and using
Einstein’s summation convention over repeated indices,

E(δai ) =

E(Pij δRjk Pkm δrm ) + E(Pij δRjk Pkm δRml al )


= Pij Pkm E(δRjk δrm ) + Pij Pkm al E(δRjk δRml )
Now,

E(δR_jk . δr_m) = N^{−2} Σ_{n,l=1}^N E[(x[n − k]x[n − j] − R[j − k])(x[l]x[l − m] − R[m])]

= N^{−2} Σ_{n,l=1}^N (E(x[n − k]x[n − j]x[l]x[l − m]) − R[j − k]R[m])

upto fourth degree terms in the data. Note that here we have not yet made any
assumption that x[n] is a Gaussian process, although we are assuming that the
process is stationary at least upto fourth order. Writing

E(x[n1 ]x[n2 ]x[n3 ]x[n4 ]) = C(n2 − n1 , n3 − n1 , n4 − n1 )

we get
E(δRjk .δrm ) =
N^{−2} Σ_{n,l=1}^N C(k − j, l − n + k, l − n + k − m)

10.4.3 Statistical performance analysis of the MUSIC and ESPRIT algorithms
Let
R̂ = R + δR
be an estimate of the signal correlation matrix R. Let v1 , ..., vp denote the
signal eigenvectors and vp+1 , ..., vN the noise eigenvectors. The signal subspace
eigen­matrix is
VS = [v1 , ..., vp ] ∈ CN ×p
and the noise subspace eigen­matrix is

VN = [vp+1 , ..., vN ]

so that
V = [VS |VN ]
is an N × N unitary matrix. The signal correlation matrix model is

R = EDE ∗ + σ 2 I

σ 2 is estimated as σ̂ 2 which is the average of the N − p smallest eigenvalues of


R. In this expression,
E = [e(ω1 ), ..., e(ωp )]
is the signal manifold matrix where

e(ω) = [1, exp(jω), ..., exp(j(N − 1)ω)]T

is the steering vector along the frequency ω. Note that since N > p, E has full
column rank p. The signal eigenvalues of R are

λi = µi + σ 2 , i = 1, 2, ..., p

while the noise eigenvalues are

λi = σ 2 , i = p + 1, ..., N

Thus,
Rvi = λi vi , i = 1, 2, ..., N
The known statistics exact MUSIC spectral estimator is

P(ω) = (1 − N^{−1}∥V_S^* e(ω)∥²)^{−1}

which is infinite when and only when ω ∈ {ω1 , ..., ωp }. Alternately, the zeros of
the function
Q(ω) = 1/P(ω) = 1 − N^{−1}∥V_S^* e(ω)∥²
are precisely {ω_1, ..., ω_p} because V_N^* e(ω) = 0 iff e(ω) belongs to the signal
subspace span{e(ωi ) : 1 ≤ i ≤ p} iff {e(ω), e(ωi ), i = 1, 2, ..., p} is a linearly
dependent set iff ω ∈ {ω1 , ..., ωp }, the last statement being a consequence of the
fact that N ≥ p + 1 and the Van­Der­Monde determinant theorem. Now, owing
to finite data effects, vi gets perturbed to vi + δvi for each i = 1, 2, ..., p so that
correspondingly, VS gets perturbed to VˆS = VS + δVS . Then Q gets perturbed
to (upto first order of smallness, ie, upto O(δR))

Q(ω) + δQ(ω) = 1 − N^{−1}∥(V_S + δV_S)^* e(ω)∥²

so that
δQ(ω) = (−2/N)Re(e(ω)^* δV_S V_S^* e(ω))
Let ω be one of the zeros of Q(ω), ie, one of the frequencies of the signal. Then,
owing to finite data effects, it gets perturbed to ω + δω where

Q′(ω)δω + δQ(ω) = 0

ie,
δω = −δQ(ω)/Q′(ω)
and

Q′(ω) = (−2/N)Re(e(ω)^* P_S e′(ω)) = (−2/N)Re < e(ω)|P_S|e′(ω) >
LONG TEST173

where
PS = VS VS∗

is the orthogonal projection onto the signal subspace. Note that by first order
perturbation theory

|δv_i> = Σ_{j≠i} |v_j><v_j|δR|v_i>/(λ_i − λ_j), i = 1, 2, ..., N

and
δλ_i = <v_i|δR|v_i>
So
δV_S = [Σ_{j≠i} |v_j><v_j|δR|v_i>/(λ_i − λ_j) : i = 1, 2, ..., p]

= [Q_1 δR|v_1>, ..., Q_p δR|v_p>]

where

Q_i = Σ_{j≠i} |v_j><v_j|/(λ_i − λ_j)

We can express this as

δVS = [Q1 , ..., Qp ](Ip ⊗ δR.VS )

and hence
δQ(ω) = (−2/N)Re(< e(ω)|δV_S V_S^*|e(ω) >)

= (−2/N)Re(< e(ω)|S(I_p ⊗ δR.V_S) V_S^*|e(ω) >)

= (−2/N)Re(< e(ω)|S(I_p ⊗ δR)P_S|e(ω) >)

where as before PS = VS VS∗ is the orthogonal projection onto the signal subspace
and

S = [Q1 , Q2 , ..., Qp ]

The evaluation of E((δω)2 ) can therefore be easily done once we evaluate

E(δR(i, j)δR(k, m)) and this is what we now propose to do.
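A compact MUSIC illustration is given below (a sketch, not the perturbation analysis of the text): the correlation matrix is estimated from snapshots, the noise eigenvectors V_N are extracted, and the pseudo-spectrum P(ω) = 1/∥V_N^* e(ω)∥² is scanned for peaks. The snapshot model and signal parameters are illustrative assumptions.

import numpy as np

def music(X, p, n_freq=2048):
    # X: (N, T) array of T snapshots of an N-dimensional observation
    N, T = X.shape
    R_hat = X @ X.conj().T / T
    eigval, eigvec = np.linalg.eigh(R_hat)          # eigenvalues in ascending order
    Vn = eigvec[:, :N - p]                          # noise subspace eigenvectors
    w = np.linspace(0, 2 * np.pi, n_freq, endpoint=False)
    E = np.exp(1j * np.outer(np.arange(N), w))      # steering vectors e(w) as columns
    P = 1.0 / np.sum(np.abs(Vn.conj().T @ E) ** 2, axis=0)
    return w, P

# usage: two complex sinusoids in white noise
rng = np.random.default_rng(0)
N, T, w_true = 8, 400, np.array([0.7, 1.9])
t = np.arange(N)[:, None]
X = sum(np.exp(1j * (wk * t + 2 * np.pi * rng.random(T))) for wk in w_true)
X = X + 0.3 * (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T)))
w, P = music(X, p=2)
peaks = np.where((P[1:-1] > P[:-2]) & (P[1:-1] > P[2:]))[0] + 1   # local maxima of the pseudo-spectrum
top = peaks[np.argsort(P[peaks])[-2:]]
print(np.sort(w[top]))                                            # estimated frequencies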



Statistical performance analysis of the ESPRIT algorithm. Here, we have


two correlation matrices R, R1 having the structures

R = EDE ∗ + σ 2 I, R1 = EDΦ∗ E ∗ + σ 2 Z

where E has full column rank, D is a square, non­singular positive definite


matrix and Φ is a diagonal unitary matrix. The structure of Z is known. We
have to estimate the diagonal entries of Φ which give us the signal frequencies.
Let as before VS denote the matrix formed from the signal eigenvectors of R
and VN that formed from the noise eigenvectors of R. V = [VS |VN ] is unitary.
When we form finite data estimates, R is replaced by R̂ = R + δR and R1 by
R̂_1 = R_1 + δR_1. The estimate of σ² is σ̂² = σ² + δσ², obtained by averaging the N − p smallest eigenvalues of R̂. The estimate of R_S = EDE^* is then

R̂_S = R_S + δR_S = R̂ − σ̂²I = R − σ²I + δR − δσ²I

Thus,
δRS = δR − δσ 2 I
Likewise, the estimate of RS1 = EDΦ∗ E ∗ is

R̂S1 = RS1 + δRS1 = R̂1 − σ̂ 2 Z = R1 − σ 2 Z + δR1 − δσ 2 Z

so that
δRS1 = δR1 − δσ 2 Z
The rank reducing numbers of the matrix pencil

RS − γRS1 = ED(I − γΦ∗ )E ∗

are clearly the diagonal entries of Φ. Since R(E) = R(VS ), it follows that these
rank reducing numbers are the zeroes of the polynomial equation

det(VS∗ (RS − γRS1 )VS ) = 0

or equivalently of
det(M − γVS∗ RS1 VS ) = 0
where
M = VS∗ RS VS = diag[µ1 , ..., µp ]
with
µ_k = λ_k − σ², k = 1, 2, ..., p
being the nonzero eigenvalues of RS . When finite data effects are taken into
consideration, these become the rank reducing numbers of

det(M + δM − γ̂(VS + δVS )∗ (RS1 + δRS1 )(VS + δVS )) = 0

or equivalently, upto first order perturbation theory, the rank reducing number
γ gets perturbed to γ̂ = γ + δγ, where

det(M − γX + δM − γ.δX − δγX) = 0



where

X = VS∗ RS1 VS , δX = δVS∗ RS1 VS + VS∗ RS1 δVS + VS∗ δRS1 VS

Let ξ be a right eigenvector of M − γX and η ∗ a left eigenvector of the same


corresponding to the zero eigenvalue. Note that M, X, δM, δX are all Hermitian
but M − γX is not because γ = exp(jωi ) is generally not real. Then, let ξ + δξ
be a right eigenvector of

M − γX + δM − γδX − δγX

corresponding to the zero eigenvalue. In terms of first order perturbation theory,

0 = (M − γX + δM − γδX − δγX)(ξ + δξ)

= (M − γX)δξ + (δM − γδX)ξ − δγXξ


Premultiplying by η ∗ gives us

δγη ∗ Xξ = η ∗ (δM − γδX)ξ

or equivalently,

δγ = < η|δM − γδX|ξ > / < η|X|ξ >
which is the desired formula.

10.5 Quantum neural networks for estimating EEG pdf from speech pdf based on Schrodinger's equation
Quantum neural networks for estimating the marginal pdf of one signal from
the marginal pdf of another signal. The idea is that the essential characteristics
of either speech or EEG signals are contained in their empirical pdf’s, ie, the
proportion of time that the signal amplitude spends within any Borel subset
of the real line. This hypothesis leads one to believe that by estimating the
pdf of the EEG data from that of speech data alone, we can firstly do away
with the elaborate apparatus required for measuring EEG signals and secondly
we can extract from the EEG pdf, the essential characteristics of the EEG sig­
nal parameters which determine the condition of a diseased brain. It is a fact
that since the Hamiltonian operator H(t) in the time dependent Schrodinger
equation is Hermitian, the resulting evolution operator U (t, s), t > s is unitary
causing thereby the wave function modulus square to integrate to unity at any
time t if at time t = 0 it does integrate to unity. Thus, the Schrodinger equation
Stochastic Processes in Classical and Quantum Physics and Engineering 167
176CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING

naturally generates a family of pdf’s. Further, based on the assumption that the
speech pdf and the EEG pdf for any patient are correlated in the same way (a
particular kind of brain disease causes a certain distortion in the EEG signal and
a correlated distortion in the speech process with the correlation being patient
independent), we include a potential term in Schrodinger’s equation involving
speech pdf and hence while tracking a given EEG pdf owing to the adaptation
algorithm of the QNN, the QNN also ensures that the output pdf coming from
Schrodinger’s equation will also maintain a certain degree of correlation with
the speech pdf. The basic idea behind the quantum neural network is to mod­
ulate the potential by a ”weight process” with the weight increasing whenever
the desired pdf exceeds the pdf generated by Schrodinger’s equation using the
modulus of the wave function squared. A heuristic proof of why the pdf generated by the Schrodinger equation should increase when the weight process, and hence the modulated potential, increases is provided at the end of the paper.

First consider the following model for estimating the pdf p(t, x) of one signal
via Schrodinger’s wave equation which naturally generates a continuous family
of pdf’s via the modulus square of the wave function:

ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + W (t, x)V (x)ψ(t, x)

The weight process W (t, x) is generated in such a way that |ψ(t, x)|2 tracks the
given pdf p(t, x). This adaptation scheme is

∂t W (t, x) = −βW (t, x) + α(p(t, x) − |ψ(t, x)|2 )

where α, β are positive constants. This latter equation guarantees that if W (t, x)
is very large and positive (negative), then the term −βW (t, x) is very large and
negative (positive) causing W to decrease (increase). In other words, −β.W
is a threshold term that prevents the weight process from getting too large in
magnitude which could cause the break­down of the quantum neural network.
We are assuming that V (x) is a non­negative potential. Thus, increase in W
will cause an increase in W V which as we shall see by an argument given below,
will cause a corresponding increase in |ψ|2 and likewise a decrease in W will
cause a corresponding decrease in |ψ|2 . Now we add another potential term
p0 (t, x)V0 (x) to the above Schrodinger dynamics where p0 is the pdf of the
input signal. This term is added because we have the understanding that the
output pdf p is in some way correlated with the input pdf. Then the dynamics
becomes

ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + W (t, x)V (x)ψ(t, x) + p0 (t, x)V0 (x)ψ(t, x)
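A hedged numerical sketch of this adaptive Schrodinger dynamics follows (split-step Fourier integration with h = 1). The weight W(t, x) is driven by the tracking error p(t, x) − |ψ|² exactly as in the adaptation rule, while the specific choices of V, V_0, the target pdf p and the speech pdf p_0 below are illustrative assumptions.

import numpy as np

L, M, dt, steps = 20.0, 256, 1e-3, 5000
alpha, beta = 50.0, 1.0
x = np.linspace(-L / 2, L / 2, M, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(M, d=L / M)

def gauss(x, m, s):                       # normalized Gaussian pdf
    return np.exp(-(x - m) ** 2 / (2 * s ** 2)) / (np.sqrt(2 * np.pi) * s)

p_target = gauss(x, 1.0, 1.0)             # pdf p(t, x) to be tracked (static here)
p0 = gauss(x, -1.0, 1.5)                  # "speech" pdf entering the extra potential term
V, V0 = np.ones(M), np.ones(M)            # non-negative potentials
W = np.zeros(M)

psi = np.sqrt(gauss(x, 0.0, 2.0)).astype(complex)    # initial wave function, |psi|^2 integrates to 1
kin = np.exp(-1j * 0.5 * k ** 2 * dt)                # kinetic factor exp(-i k^2 dt / 2)

for _ in range(steps):
    Vtot = W * V + p0 * V0
    psi *= np.exp(-1j * Vtot * dt / 2)               # half potential step
    psi = np.fft.ifft(kin * np.fft.fft(psi))         # full kinetic step (unitary)
    psi *= np.exp(-1j * Vtot * dt / 2)               # half potential step
    W += dt * (-beta * W + alpha * (p_target - np.abs(psi) ** 2))   # weight adaptation rule

err = np.sum(np.abs(np.abs(psi) ** 2 - p_target)) * (L / M)
print("L1 tracking error:", err)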

Statistical performance analysis of the QNN: Suppose p0 (t, x) fluctuates to


p0 (t, x) + δp0 (t, x) and p(t, x) to p(t, x) + δp(t, x). Then the corresponding QNN
output wave function ψ(t, x) will fluctuate to ψ(t, x)+δψ(t, x) where δψ satisfies
the linearized Schrodinger equation and likewise the weight process W (t, x) will

fluctuate to W (t, x) + δW (t, x). The linearized dynamics for these fluctuations
are

ih∂t δψ(t, x) = (−h2 /2)∂x2 δψ(t, x) + V (x)(W δψ + ψδW ) + V0 (x)(p0 δψ + ψδp0 )


¯
∂t δW = −βδW + α(δp − ψ.δψ − ψ.δ ψ̄)
Writing
ψ(t, x) = A(t, x).exp(iS(t, x)/h)
we get
δψ = (δA + iAδS/h)exp(iS/h)
so that with prime denoting spatial derivative,

δψ " = (δA" + iA" δS/h + iAδS " /h + iS " δA/h − AS " δS/h2 )exp(iS/h)

δψ "" = (δA"" +iA"" δS/h+2iA" δS " /h+iAδS "" /h+2iS " δA" /h+iS "" δA/h−AS " δS " /h2
d d
−A" S " δS/h2 − AS "" δS/h2 − AS " δS " /h2 − S 2 δA/h2 − iAS 2 δS/h3 ).exp(iS/h)
Thus we obtain the following differential equation for the wave function fluctu­
ations caused by random fluctuations in the EEG and speech pdf’s:

ih∂t δA − ∂t (AδS) − δA.∂t S − AδS∂t S/h

= (−h2 /2)δA"" − ihA"" δS/2 − ihA" δS " − ihA.δS "" /2 − ihS " δA"
d d
−ihS "" δA/2+AS " δS " /2+A" S " δS/2+AS "" δS/2+AS " δS " /2+S 2 δA/2+iAS 2 δS/2h
+V0 (x)p0 (t, x)(δA + iAδS/h) + V0 (x)Aδp0 (t, x)
+V (x)W (t, x)(δA + iAδS/h) + V (x)AδW )
On equating real and imaginary parts in this equation, we get two pde’s for the
two functions A and S. However, it is easier to derive these equations by first
deriving the unperturbed pde’s for A, S and then forming the perturbations.
The unperturbed equations are (See Appendix A.1)

h∂t A = −h∂x A.∂x S − (1/2)A∂x2 S

−A∂t S = −((h2 /2)∂x2 A − (1/2)A(∂x S)2 ) + W V A + p0 V0 A


∂t W (t, x) = −β(W (t, x) − W0 (x)) + α(p(t, x) − A2 )
Forming the variation of this equation around a fixed operating point p0 (x), p(x) =
|ψ(t, x)|2 and W (x) with ψ(t, x) = A(x).exp(−iEt + iS0 (x)) gives us

h∂t δA = −hδA" .S0" − hA" δS " − (1/2)S0"" δA − (1/2)AδS "" ,


d
−A∂t δS + EδA = (−h2 /2)δA"" − (1/2)S02 δA − AS0" δS ""
+V W δA + V AδW + V0 Aδp0 + V0 p0 δA,
∂t δW = −βδW + α.δp − 2αAδA

Appendix
A.1
Proof that increase in potential causes increase in probability density deter­
mined as modulus square of the wave function.
Consider Schrodinger’s equation in the form

ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + V (x)ψ(t, x)

We write
ψ(t, x) = A(t, x).exp(iS(t, x)/h)
where A, S are real functions with A non­negative. Thus, |ψ|2 = A2 . Substitut­
ing this expression into Schrodinger’s equation and equating real and imaginary
parts gives
h∂t A = −h∂x A.∂x S − (1/2)A∂x2 S)
−A∂t S = −((h2 /2)∂x2 A − (1/2)A(∂x S)2 ) + V A
We write
S(t, x) = −Et + S0 (x)
so that E is identified with the total energy of the quantum particle. Then, the
second equation gives on neglecting O(h2 ) terms,

(∂x S0 ) = ± 2(E − V (x))

and then substituting this into the first equation with ∂t A = 0 gives

S0�� /S0� = −2hA� /A

or
ln(S0� ) = −2h.ln(A) + C
so
A = K1 (S0� )−1/2 = K2 (E − V (x))−1/4
which shows that as long as V(x) < E, A(x) and hence A(x)² = |ψ(x)|² will increase with increasing V(x), confirming the fact that by causing the driving potential to increase when p_E − |ψ|² is large and positive, we can ensure that |ψ|² will increase and vice versa, thereby confirming the validity of our adaptive algorithm for tracking the EEG pdf.
A.2 Calculating the empirical distribution of a stationary stochastic process.
Let x(t) be a stationary stochastic process. The empirical density estimate
of N samples of this process spaced τ_k, k = 1, 2, ..., N − 1 seconds apart from the first sample, ie, an estimate of the joint density of x(t), x(t + τ_k), k = 1, 2, ..., N − 1, is given by

p̂_T(x_1, ..., x_N; τ_1, ..., τ_{N−1}) = T^{−1} ∫_0^T Π_{k=1}^N δ(x(t + τ_k) − x_k)dt

where τ0 = 0. The Gartner­Ellis theorem in the theory of large deviations pro­


vides a reliability estimate of how this empirical density performs. Specifically,
if f is a function of N variables, then the moment generating functional of
p̂T (., τ1 , ..., τN −1 ) is given by

E[exp(∫ p̂_T(x_1, ..., x_N; τ_1, ..., τ_{N−1})f(x_1, ..., x_N)dx_1...dx_N)]

= E[exp(T^{−1} ∫_0^T f(x(t), x(t + τ_1), ..., x(t + τ_{N−1}))dt)]

= M_T(f|τ_1, ..., τ_{N−1})

say. Then, the rate functional for the family of random fields p̂_T(., τ_1, ..., τ_{N−1}), T → ∞, is given by

I(p) = sup_f (∫ f(x_1, ..., x_N)p(x_1, ..., x_N)dx_1...dx_N − Λ(f))

where
Λ(f ) = limT →∞ T −1 log(MT (T.f ))
assuming that this limit exists. It follows that as T → ∞, the asymptotic
probability of p̂T taking values in a set E is given by

T −1 log(P r(p̂T (., τ1 , ..., τN −1 ) ∈ E) ≈ −inf (I(p) : p ∈ E)

or equivalently,

P r(p̂T (., τ1 , ..., τN −1 ) = p) ≈ exp(−T I(p)), T → ∞

Appendix A.3
We can also use the electroweak theory or more specially, the Dirac­Yang­
Mills non­Abelian gauge theory for matter and gauge fields to track a given pdf.
Such an algorithm enables us to make use of elementary particles in a particle
accelerator to simulate probability densities for learning about the statistics of
signals.
Let Aaµ (x)τa be a non­Abelian gauge field assuming values in a Lie algebra
g which is a Lie sub­algebra of u(N ). Here, τa , a = 1, 2, ..., N are Hermitian
generators of g. Denote their structure constants by {C(abc)}, ie,

[τa , τb ] = iC(abc)τc

Then the gauge covariant derivative is

∇_µ = ∂_µ + ieA_µ^a τ_a

and the corresponding Yang­Mills antisymmetric gauge field tensor, also called
the curvature of the gauge covariant derivative is given by

ieF_µν = [∇_µ, ∇_ν] = ie(A_{ν,µ}^a − A_{µ,ν}^a − eC(abc)A_µ^b A_ν^c)τ_a

so that
F_µν = F_µν^a τ_a
where
F_µν^a = A_{ν,µ}^a − A_{µ,ν}^a − eC(abc)A_µ^b A_ν^c
The action functional for the matter field ψ(x) having 4N components and the
gauge potentials Aaµ (x) is then given by (S.Weinberg, The quantum theory of
fields, vol.II)

S[A, ψ] = ∫ L d⁴x,

L = ψ̄(γ^µ(i∂_µ + eA_µ^a τ_a) − m)ψ − (1/4)F_µν^a F^{aµν}

where
ψ̄ = ψ*γ⁰
The equations of motion obtained by setting the variation of S w.r.t the matter
wave function ψ and the gauge potentials A are given by

[γ µ (i∂µ + eAaµ τa ) − m]ψ = 0,

Dν F µν = J µ
where Dµ is the gauge covariant derivative in the adjoint representation, ie,

[∂ν + ieAν , F µν ] = J µ

or equivalently in terms of components,

∂ν F aµν − eC(abc)Abν F cµν = J aµ

with J^{aµ} being the Dirac-Yang-Mills current:

J^{aµ} = −eψ̄(γ^µ ⊗ τ_a)ψ

it is clear that ψ satisfies a Dirac­Yang­mills matter wave equation and it can


be expressed in the form of Dirac’s equation with a self­adjoint Hamiltonian
owing to which J aµ is a conserved current and in particular, the |ψ|2 integrates
over space to give a constant at any time which can be set to unity by the
initial condition. In other words, |ψ(x)|2 is a probability density at any time
t where x = (t, r). Now it is easy to make the potential in this wave equation
adaptive, ie, depend on the error p(x) − |ψ(x)|2 . Simply add an interaction
term W(x)(p(x) − |ψ(x)|²)V(x) to the Hamiltonian. This is more easily
accomplished using the electroweak theory where there is in addition to the

Yang­Mills gauge potential terms a photon four potential Bµ with B0 (x) =


W (x)(p(x) − |ψ(x)|2 )V (x). This results in the following wave equation

[γ µ (i∂µ + eAaµ τa + eBµ ) − m]ψ = 0

The problem with this formalism is that a photon Lagrangian term (−1/4)(Bν,µ −
Bµ,ν )(B ν,µ − B µ,ν ) also has to be added to the total Lagrangian which results
in the Maxwell field equation

F^{ph,µν}_{,ν} = −eψ̄γ^µψ

where
F^{ph}_{µν} = B_{ν,µ} − B_{µ,ν}
is the electromagnetic field. In order not to violate these Maxwell equations, we
must add another external charge source term to the rhs so that the Maxwell
equation become
¯ µ ψ + ρ(x)δ µ
F,νphµν = −eψγ 0

ρ(x) is an external charge source term that is given by

−eψ ∗ ψ + ρ(x) = −�2 (W (x)(p(x) − |ψ(x)|2 )V (x)) = −�2 B0

so that
B0 = W (p − |ψ|2 )V
The change in the Lagrangian produced by this source term is −ρ(x)B_0(x).
This control charge term can be adjusted for appropriate adaptation. Thus in
effect our Lagrangian acquires two new terms, the first is

eψ̄γ^0 B_0 ψ = eψ*ψ.B_0

and the second is

−ρ.B_0

where
ρ = −∇^2(W(p − ψ*ψ)) + eψ*ψ

The dynamics of the wave function ψ then acquires two new terms. The first comes
from the variation w.r.t ψ̄ of the term eψ*ψ.B_0 = eψ̄γ^0ψ.B_0; this term is
eB_0γ^0ψ. The second comes from the variation w.r.t ψ̄ of the term
−ρ.B_0 = B_0∇^2(W(p − ψ̄γ^0ψ)). This variation gives

(∇^2 B_0)(Wγ^0ψ)

The resulting equation of motion for the wave function gets modified to

[γ^μ(i∂_μ + eA^a_μ τ_a + eB_μ) − m + ∇^2 B_0.Wγ^0]ψ = 0

This equation is to be supplemented with the Maxwell and Yang-Mills gauge
field equations and of course the weight update equation for W(x).

10.6 Statistical Signal Processing, M.Tech, Long Test, Max Marks: 50

[1] [a] Prove that the quantum relative entropy is convex, ie, if (ρi , σi ), i = 1, 2
are pairs of mixed states in Cd , then

D(p1 ρ1 + p2 ρ2 |p1 σ1 + p2 σ2 ) ≤ p1 D(ρ1 |σ1 ) + p2 D(ρ2 |σ2 )

for p1 , p2 ≥ 0, p1 + p2 = 1.
hint: First prove using the following steps that if K is a quantum operation,
ie, TPCP map, then for two states ρ, σ, we have

D(K(ρ)|K(σ)) ≤ D(ρ|σ) − − − (a)

Then define

ρ = diag(p_1ρ_1, p_2ρ_2), σ = diag(p_1σ_1, p_2σ_2),

and K to be the operation given by

K(ρ) = E_1ρE_1* + E_2ρE_2*

where
E_1 = [I|0], E_2 = [0|I] ∈ C^{d×2d}
Note that
E_1*E_1 + E_2*E_2 = I_{2d}
To prove (a), make use of the result on the asymptotic error rate in hypothesis
testing on tensor product states: Given that the first kind of error, ie, probability
of detecting ρ⊗n given that the true state is σ ⊗n goes to zero, the minimum
(Neyman­Pearson test) probability of error of the second kind goes to zero at the
rate of −D(ρ|σ) (this result is known as quantum Stein’s lemma). Now consider
any sequence of tests Tn , ie 0 ≤ Tn ≤ I on (Cd )⊗n . Then consider using this
test for testing between the pair of states K ⊗n (ρ⊗n ) and K ⊗n (σ ⊗n ). Observe
that the optimum performance for this sequential hypothesis testing problem is
the same as that of the dual problem of testing ρ⊗n versus σ ⊗n using the dual
test K ∗⊗n (Tn ). This is because

T r(K ⊗n (ρ⊗n )Tn ) = T r(ρ⊗n K ∗⊗n (Tn ))

etc. It follows that the optimum Neyman-Pearson error probability of the
second kind for testing ρ^{⊗n} versus σ^{⊗n} is no larger than that of testing
K^{⊗n}(ρ^{⊗n}) versus K^{⊗n}(σ^{⊗n}), since K^{*⊗n}(T_n) varies only over a subset of
tests as T_n varies over all tests in (C^d)^{⊗n}. Hence going to the limit gives the
result
exp(−nD(ρ|σ)) ≤ exp(−nD(K(ρ)|K(σ)))

asymptotically as n → ∞ which actually means that

D(ρ|σ) ≥ D(K(ρ)|K(σ))
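
As a small numerical companion (not part of the argument above), the following Python sketch checks the joint convexity inequality of [a] on random qubit states; the helper names random_state and qrel_entropy, the dimension d = 2 and the sample count are illustrative assumptions.

import numpy as np
from scipy.linalg import logm

def random_state(d, rng):
    # random density matrix via the Hilbert-Schmidt construction
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def qrel_entropy(rho, sigma):
    # D(rho||sigma) = Tr(rho (log rho - log sigma)), natural logarithm
    return np.trace(rho @ (logm(rho) - logm(sigma))).real

rng = np.random.default_rng(0)
d, p1 = 2, 0.3
p2 = 1 - p1
for _ in range(100):
    r1, r2, s1, s2 = (random_state(d, rng) for _ in range(4))
    lhs = qrel_entropy(p1 * r1 + p2 * r2, p1 * s1 + p2 * s2)
    rhs = p1 * qrel_entropy(r1, s1) + p2 * qrel_entropy(r2, s2)
    assert lhs <= rhs + 1e-9, (lhs, rhs)
print("joint convexity verified on 100 random qubit examples")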

[b] Prove using [a] that the von Neumann entropy H(ρ) = −Tr(ρ.log(ρ)) is
a concave function, ie, if ρ_1, ρ_2 are two states, then

H(p_1ρ_1 + p_2ρ_2) ≥ p_1H(ρ_1) + p_2H(ρ_2), p_1, p_2 ≥ 0, p_1 + p_2 = 1

hint: Let K be any TPCP map. Let


ρ = diag(p_1ρ_1, p_2ρ_2) ∈ C^{2d×2d},

and
σ = I_{2d}/2d
where ρ_k ∈ C^{d×d}, k = 1, 2. Then choose

E_1 = [I|0], E_2 = [0|I] ∈ C^{d×2d}

and
K(X) = E_1XE_1* + E_2XE_2* ∈ C^{d×d}, X ∈ C^{2d×2d}
Note that
E_1*E_1 + E_2*E_2 = I_{2d}
Show that
K(σ) = K(I_{2d}/2d) = I_d/d
Then we get using [a],

D(K(ρ)|K(σ)) ≤ D(ρ|σ)

Show that this yields

H(p_1ρ_1 + p_2ρ_2) ≥ p_1H(ρ_1) + p_2H(ρ_2) + H(p) − log(2)

where
H(p) = −p_1 log(p_1) − p_2 log(p_2)
Note that H(p) ≤ log(2) with equality iff p_1 = p_2 = 1/2. Taking p_1 = p_2 = 1/2 therefore
gives the midpoint concavity inequality H((ρ_1 + ρ_2)/2) ≥ (H(ρ_1) + H(ρ_2))/2, and since H
is continuous, midpoint concavity implies concavity, which deduces the result.

[2] [a] Let X[n], n ≥ 0 be a M ­vector valued AR process, ie, it satisfies the
first order vector difference equation

X[n] = AX[n − 1] + W[n], n ≥ 1



Assume that W[n], n ≥ 1 is an iid N (0, Q) process independent of X[0] and


that X[0] is a Gaussian random vector. Calculate the value of

R[0] = E(X[0]X[0]T )

in terms of A and Q so that X[n], n ≥ 0 is wide­sense stationary in the sense


that
E(X[n]X[n − m]T ) = R[m]
does not depend on n. In such a case, evaluate the ACF R[m] of {X[n]} in
terms of A and Q. Assume that A is diagonable with all eigenvalues being
strictly inside the unit circle. Express your answer in terms of the eigenvalues
and eigen­projections of A.

[b] Assuming X[0] known, evaluate the maximum likelihood estimator of A


given the data {X[n] : 1 ≤ n ≤ N }. Also evaluate the approximate mean and
covariance of the estimates of the matrix elements of A upto quadratic orders in
the correlations {R[m]} based on the data X[n], 1 ≤ n ≤ N and upto O(1/N ).
In other words, calculate using perturbation theory, E(δAij ) and E(δAij δAkm )
upto O(1/N ) where
Â − A = δA
where Â is the maximum likelihood estimator of A.
hint: Write
N^{-1} Σ_{n=1}^{N} X[n − i]X[n − j]^T − R[j − i] = δR(j, i)

and then evaluate δA upto quadratic orders in δR(j, i). Finally, use the fact
that since X[n], n ≥ 0 is a Gaussian process,

E(Xa [n1 ]Xb [n2 ]Xc [n3 ]Xd [n4 ])


= Rab [n1 −n2 ]Rcd [n3 −n4 ]+Rac [n1 −n3 ]Rbd [n2 −n4 ]+Rad [n1 −n4 ]Rbc [n2 −n3 ]
where
((Rab (τ )))1≤a,b≤M = R(τ ) = E(X[n]X[n − τ ]T )

[c] Calculate the Fisher information matrix for the parameters {Aij } and
hence the corresponding Cramer­Rao lower bound.
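
A minimal numerical sketch related to problem [2] (it is not the analytic answer asked for): the stationary covariance R[0] is obtained from the discrete Lyapunov equation R[0] = AR[0]A^T + Q, and the conditional ML estimate of A for Gaussian W[n] coincides with the least-squares estimate. The matrices A, Q, the sizes and the seed are illustrative assumptions.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(1)
M, N = 2, 5000
A = np.array([[0.5, 0.2], [-0.1, 0.3]])
Q = np.array([[1.0, 0.2], [0.2, 0.5]])

R0 = solve_discrete_lyapunov(A, Q)      # stationary covariance E(X[n]X[n]^T)

# simulate the AR(1) process started in its stationary law
L = np.linalg.cholesky(R0)
X = np.zeros((N + 1, M))
X[0] = L @ rng.normal(size=M)
W = rng.multivariate_normal(np.zeros(M), Q, size=N)
for n in range(1, N + 1):
    X[n] = A @ X[n - 1] + W[n - 1]

# ML / least-squares estimate: A_hat = (sum X[n]X[n-1]^T)(sum X[n-1]X[n-1]^T)^{-1}
S10 = sum(np.outer(X[n], X[n - 1]) for n in range(1, N + 1))
S00 = sum(np.outer(X[n - 1], X[n - 1]) for n in range(1, N + 1))
A_hat = S10 @ np.linalg.inv(S00)
print("A_hat =\n", A_hat)
print("R[0] (Lyapunov) =\n", R0)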
[3] Let X[n] be a stationary M ­vector valued zero mean Gaussian process
with ACF R(τ ) = E(X[n]X[n−τ ]T ) ∈ RM ×M . Define its power spectral density
matrix by
S(ω) = Σ_{τ=−∞}^{∞} R(τ)exp(−jωτ) ∈ C^{M×M}

Prove that S(ω) is a positive definite matrix. Define its windowed periodogram
estimate by
Ŝ_N(ω) = X̂_N(ω)(X̂_N(ω))*

where

X̂_N(ω) = N^{-1/2} Σ_{n=0}^{N−1} W(n)X(n)exp(−jωn)

where W(n) is an M × M matrix valued window function. Calculate

E(Ŝ_N(ω)), Cov((Ŝ_N(ω_1))_{ab}, (Ŝ_N(ω_2))_{cd})

and discuss the limits of these as N → ∞. Express these means and covariances
in terms of


Ŵ(ω) = Σ_{τ=−∞}^{∞} W(τ)exp(−jωτ)
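
A minimal sketch of the windowed periodogram of problem [3] for a scalar (M = 1) process; the matrix case replaces products by outer products. The Hann window, the AR(1) test signal, and the lack of window-power normalization are illustrative assumptions, and a single periodogram has high variance, so only rough agreement with the true PSD is expected.

import numpy as np

def windowed_periodogram(x, omega):
    N = len(x)
    w = np.hanning(N)                      # W(n), here scalar-valued
    Xhat = (N ** -0.5) * np.sum(w * x * np.exp(-1j * omega * np.arange(N)))
    return (Xhat * np.conj(Xhat)).real     # S_hat_N(omega) = |X_hat_N(omega)|^2

rng = np.random.default_rng(2)
N = 1024
a, x = 0.8, np.zeros(N)                    # AR(1) signal with known spectrum
for n in range(1, N):
    x[n] = a * x[n - 1] + rng.normal()
for omega in (0.1, 0.5, 1.0, 2.0):
    est = windowed_periodogram(x, omega)
    true = 1.0 / abs(1 - a * np.exp(-1j * omega)) ** 2
    print(f"omega={omega:.1f}  periodogram={est:8.3f}  true PSD={true:8.3f}")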

[4] This problem deals with causal Wiener filtering of vector valued sta­
tionary processes. Let X(n), S(n), n ∈ Z be two jointly WSS processes with
correlations

RXX (τ ) = E(X(n)X(n − τ )T ), RSX (τ ) = E(S(n)X(n − τ )T )

Let
S_XX(z) = Σ_τ R_XX(τ)z^{−τ}

Prove that
R_XX(−τ) = R_XX(τ)^T
and hence
S_XX(z^{−1}) = S_XX(z)^T
Assume that this power spectral density matrix can be factorized as

S_XX(z) = L(z)L(z^{−1})^T

where L(z) and G(z) = L(z)^{−1} admit expansions of the form

L(z) = Σ_{n≥0} l[n]z^{−n}, |z| ≥ 1

G(z) = Σ_{n≥0} g[n]z^{−n}, |z| ≥ 1

with
Σ_{n≥0} ‖ l[n] ‖ < ∞,

Σ_{n≥0} ‖ g[n] ‖ < ∞

Then suppose
H(z) = Σ_{n≥0} h[n]z^{−n}

is the optimum causal matrix Wiener filter that takes X[.] as input and outputs
an estimate Ŝ[.] of S[.] in the sense that

Ŝ[n] = Σ_{k≥0} h[k]X[n − k]

with {h[k]} chosen so that

E[‖ S[n] − Ŝ[n] ‖^2]

is a minimum, then prove that h[.] satisfies the matrix Wiener-Hopf equation

Σ_{k≥0} h[k]R_XX[m − k] = R_SX[m], m ≥ 0

Show that this equation is equivalent to

[H(z)S_XX(z) − S_SX(z)]_+ = 0

Prove that the solution to this equation is given by

H(z) = [S_SX(z)L(z^{−1})^{−T}]_+ L(z)^{−1}

= [S_SX(z)G(z^{−1})^T]_+ G(z)

where

[S_SX(z)G(z^{−1})^T]_+ = [Σ_{n,m} R_SX[n]g[m]^T z^{−n+m}]_+ = Σ_{n≥0} z^{−n} Σ_{m≥0} R_SX[n + m]g[m]^T

Prove that for this to be a valid solution for a stable causal matrix Wiener filter, it is
sufficient that
Σ_{n≥0} ‖ R_SX[n] ‖ < ∞
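
A numerical sketch related to problem [4]: instead of carrying out the spectral factorization analytically, one may truncate the Wiener-Hopf equation Σ_{k≥0} h[k]R_XX[m−k] = R_SX[m] to k, m = 0,...,K−1 and solve the resulting Toeplitz system, which gives the finite-length (FIR) approximation of the causal Wiener filter. The scalar setup (AR(1) signal observed in white noise) and the chosen filter length are illustrative assumptions.

import numpy as np

a, sw2, sv2, K = 0.9, 1.0, 0.5, 30          # AR coefficient, noise variances, filter length
rss = lambda t: sw2 * a ** abs(t) / (1 - a ** 2)   # R_SS[t] for the AR(1) signal; X = S + V
Rxx = np.array([[rss(m - k) + (sv2 if m == k else 0.0) for k in range(K)]
                for m in range(K)])
Rsx = np.array([rss(m) for m in range(K)])  # R_SX[m] = R_SS[m] since V is independent of S
h = np.linalg.solve(Rxx, Rsx)               # truncated causal Wiener filter taps

print("first taps of the truncated causal Wiener filter:", np.round(h[:5], 4))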

[5] Write short notes on any two of the following:


[a] Recursive least squares lattice filter for prediction with time and order
updates of the prediction filter coefficients.
[b] Kolmogorov’s formula for the entropy rate of a stationary Gaussian pro­
cess in discrete time in terms of the power spectral density and its relationship
to the infinite order one step prediction error variance.
[c] The Cq Shannon coding theorem and its converse and a derivation of the
classical Shannon noisy coding theorem and its converse from it.
[d] Cramer­Rao lower bound for vector parameters and vector observations.
[e] The optimum detector in a binary quantum hypothesis testing problem
between two mixed states.

10.7 Monotonicity of quantum relative Renyi entropy
Let A, B be positive definite matrices and let 0 ≤ t ≤ 1, then prove that

(tA + (1 − t)B)−1 ≤ t.A−1 + (1 − t).B −1 − − − (1)

ie, the map A → A−1 is operator convex on the space of positive matrices.
Solution: Proving (1) is equivalent to proving that

A1/2 (tA + (1 − t)B)−1 .A1/2 ≤ tI + (1 − t)A1/2 B −1 A1/2

which is the same as proving that

(tI + (1 − t)X)−1 ≤ tI + (1 − t)X −1 − − − (2)

where
X = A−1/2 BA−1/2
It is enough to prove (2) for any positive matrix X. By the spectral theorem
for positive matrices, to prove (2), it is enough to prove

(t + (1 − t)x)−1 ≤ t + (1 − t)/x − − − (3)

for all positive real numbers x. Proving (3) is equivalent to proving that

(t + (1 − t)x)(tx + 1 − t) ≥ x

for all x > 0, or equivalently,

(t2 + (1 − t)2 )x + t(1 − t)x2 + t(1 − t) ≥ x

or equivalently, adding and subtracting 2t(1 − t)x on both sides to complete the
square,
−2t(1 − t)x + t(1 − t)x2 + t(1 − t) ≥ 0
or equivalently,
1 + x2 − 2x ≥ 0
which is true.
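
A quick numerical illustration of (1), assuming numpy is available: for random positive definite A, B and t ∈ [0,1], the matrix tA^{-1} + (1−t)B^{-1} − (tA + (1−t)B)^{-1} should be positive semidefinite up to round-off. The dimension, the regularization 0.1I and the sample count are illustrative choices.

import numpy as np

rng = np.random.default_rng(3)
def random_pd(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)   # bounded away from singular

for _ in range(200):
    d, t = 4, rng.uniform()
    A, B = random_pd(d), random_pd(d)
    gap = t * np.linalg.inv(A) + (1 - t) * np.linalg.inv(B) \
          - np.linalg.inv(t * A + (1 - t) * B)
    assert np.linalg.eigvalsh(gap).min() > -1e-10
print("operator convexity of A -> A^{-1} verified on 200 random examples")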

Problem: For p ∈ (0, 1), show that x → −xp , x → xp−1 and x → xp+1 are
operator convex on the space of positive matrices.
Solution: Consider the integral
I(a, x) = ∫_0^∞ u^{a−1} du/(u + x)

It is clear that this integral is convergent when x ≥ 0 and 0 < a < 1. Now
I(a, x) = x^{−1} ∫_0^∞ u^{a−1}(1 + u/x)^{−1} du
= x^{a−1} ∫_0^∞ y^{a−1}(1 + y)^{−1} dy
on making the change of variables u = xy. Again make the change of variables

1 − 1/(1 + y) = y/(1 + y) = z, y = 1/(1 − z) − 1 = z/(1 − z)

Then
I(a, x) = x^{a−1} ∫_0^1 z^{a−1}(1−z)^{−a+1}(1−z)(1−z)^{−2} dz = x^{a−1} ∫_0^1 z^{a−1}(1−z)^{−a} dz

= x^{a−1}β(a, 1 − a) = x^{a−1}Γ(a)Γ(1 − a) = x^{a−1}π/sin(πa)

Thus,

x^{a−1} = (sin(πa)/π)I(a, x) = (sin(πa)/π) ∫_0^∞ u^{a−1} du/(u + x)

and since for each u > 0, as we just saw, x → (u + x)^{−1} is operator convex, this proves
that f(x) = x^{a−1} is operator convex when a ∈ (0, 1). Now,

u^{a−1}x/(u + x) = u^{a−1}(1 − u/(u + x)) = u^{a−1} − u^a/(u + x)

Integrating from u = 0 to u = ∞ and making use of the previous result gives

x^a = (sin(πa)/π) ∫_0^∞ (u^{a−1} − u^a/(u + x)) du

from which it immediately becomes clear that x → −x^a is operator convex on
the space of positive matrices. Now consider x^{a+1}. We have

u^{a−1}x^2/(u + x) = u^{a−1}((u + x)^2 − u^2 − 2ux)/(u + x)

= u^{a−1}(u+x) − u^{a+1}/(u+x) − 2u^a x/(u+x)

= u^{a−1}(u+x) − u^{a+1}/(u+x) − 2u^a(1 − u/(u+x))

and integration thus gives

x^{a+1} = (sin(πa)/π) ∫_0^∞ (u^{a−1}x + u^{a+1}/(u + x) − u^a) du

from which it becomes immediately clear that x^{a+1} is operator convex.
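
As a quadrature check of the integral representation behind these proofs (the combined integrand u^{a−1} − u^a/(u+x) equals u^{a−1}x/(u+x)), the following short sketch verifies x^a = (sin(πa)/π)∫_0^∞ u^{a−1}x/(u+x) du for a few values; the grid of test points and the tolerance are illustrative assumptions.

import numpy as np
from scipy.integrate import quad

for a in (0.3, 0.5, 0.8):
    for x in (0.5, 2.0, 7.0):
        val, _ = quad(lambda u: u ** (a - 1) * x / (u + x), 0, np.inf, limit=200)
        lhs = (np.sin(np.pi * a) / np.pi) * val
        assert abs(lhs - x ** a) < 1e-5, (a, x, lhs, x ** a)
print("integral representation of x^a confirmed numerically")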


Monotonicity of Renyi entropy: Let s ∈ R. For two positive matrices A, B,
consider the function

Ds (A|B) = log(T r(A1−s B s ))

Our aim is to show that for s ∈ (0, 1) and a TPCP map K, we have for two
states A, B,
Ds (K(A)|K(B)) ≥ Ds (A|B)

while for s ∈ (−1, 0), we have

Ds (K(A)|K(B)) ≤ Ds (A|B)

For a ∈ (0, 1) and x > 0 we've seen that

x^{a+1} = (sin(πa)/π) ∫_0^∞ (u^{a−1}x + u^{a+1}/(u + x) − u^a) du

and hence for s ∈ (−1, 0), replacing a by −s gives

x^{1−s} = −(sin(πs)/π) ∫_0^∞ (u^{−s−1}x + u^{1−s}/(u + x) − u^{−s}) du

Let s ∈ (−1, 0). Then the above formula shows that x1−s is operator convex.
Hence, if A, B are any two density matrices and K a TPCP map, then

K(A)1−s ≤ K(A1−s )

whence
B^{s/2}K(A)^{1−s}B^{s/2} ≤ B^{s/2}K(A^{1−s})B^{s/2}
for any positive B, and taking trace gives

T r(K(A)1−s B s ) ≤ T r(K(A1−s )B s )

Replacing B by K(B) here gives

T r(K(A)1−s K(B)s ) ≤ T r(K(A1−s )K(B)s )

Further since s ∈ (−1, 0) it follows that xs is operator convex. Hence

K(B)s ≤ K(B s )

and so
A1/2 K(B)s A1/2 ≤ A1/2 K(B s )A1/2
for any positive A and replacing A by K(A1−s ) and then taking trace gives

T r(K(A1−s )K(B)s ) ≤ T r(K(A1−s )K(B s ))

More generally, we have the following:


If X is any matrix, and s ∈ (0, 1) and A, B positive, and K any TPCP map

T r(X ∗ K(A)1−s XK(B)s ) ≥ T r(X ∗ K(A)1−s XK(B s ))

(since K(B)s ≥ K(B s ) because B → B s is operator concave and since K(A)1−s X


is positive)

= T r(K(A)1−s XK(B s )X ∗ ) ≥ T r(K(A1−s )XK(B s )X ∗ )



(since K(A)1−s ≥ K(A1−s ) because A → A1−s is operator concave and since


XK(B s )X ∗ is positive)

= T r(X ∗ K(A1−s )XK(B s ))

Since X is an arbitrary complex square matrix, this inequality can immediately
be written as
K(A)^{1−s} ⊗ K(B)^s ≥ K(A^{1−s}) ⊗ K(B^s)
for all positive A, B and s ∈ (0, 1). We can express this equation as

K(A)1−s ⊗ K(B)s ≥ (K ⊗ K)(A1−s ⊗ B s )

and hence if I denotes the identity matrix of the same size as A, B, then

T r(K(A)1−s K(B)s ) ≥ T r((A1−s ⊗ B s )(K ⊗ K)(V ec(I).V ec(I)∗ ))

Note that if {e_k : k = 1, 2, ..., n} is an onb for C^n, the Hilbert space on which
A, B act, then

I = Σ_k |e_k><e_k|, Vec(I) = Σ_k |e_k ⊗ e_k> = |ξ>

Vec(I).Vec(I)* = |ξ><ξ| = Σ_{ij} E_{ij} ⊗ E_{ij}

where
Eij = |ei >< ej |
Thus, for s ∈ (0, 1),

Tr(K(A)^{1−s}K(B)^s) ≥ Σ_{i,j} Tr(K_{ij}A^{1−s}K_{ji}B^s)

where
Kij = K(Eij )
Likewise, if s ∈ (0, 1) so that s − 1 ∈ (−1, 0) and 2 − s ∈ (1, 2), then both the maps
A → A^{s−1} and A → A^{2−s} are operator convex and hence for any TPCP map
K, K(A)^{s−1} ≤ K(A^{s−1}), K(A)^{2−s} ≤ K(A^{2−s}), from which we immediately
deduce that

T r(X ∗ K(A)2−s XK(B)s−1 ) ≤ T r(X ∗ K(A2−s )XK(B s−1 ))

or equivalently,

K(A)2−s ⊗ K(B)s−1 ≤ K(A2−s ) ⊗ K(B s−1 )

which results in the inequality

Tr(K(A)^{2−s}K(B)^{s−1}) ≤ Σ_{i,j} Tr(K_{ij}A^{2−s}K_{ji}B^{s−1})

Remark: Lieb's inequality states that for t ∈ (0, 1), the map

(A, B) → A^{1−t} ⊗ B^t = f_t(A, B)

is joint operator concave, ie, if A1 , A2 , B1 , B2 are positive and p, q > 0, p + q = 1


then with
A = pA1 + qA2 , B = pB1 + qB2
we have
ft (A, B) ≥ pft (A1 , B1 ) + qft (A2 , B2 )
From this, we get the result that

(A, B) → T r(X ∗ A1−t XB t ) = Ht (A, B|X)

is jointly concave. Now let K denote the pinching

K(A) = (A + U AU ∗ )/2

where U is a unitary matrix. Then, we get

Ht (K(A), K(B)|X) ≥ (1/2)Ht (A, B|X) + (1/2)Ht (U AU ∗ , U BU ∗ |X)

and replacing X by I in this inequality gives

T r(K(A)1−t K(B)t ) ≥ T r(A1−t B t )

which proves the monotonicity of the Renyi entropy when t ∈ (0, 1) under
pinching operations K of the above kind. Now, this inequality will also hold for
any pinching K since any pinching can be expressed as a composition of such
elementary pinchings of the above kind. We now also wish to show the reverse
inequality when t ∈ (−1, 0), ie show that if t ∈ (0, 1), then

T r(K(A)2−t K(B)t−1 ) ≤ T r(A2−t B t−1 )
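
A numerical check of the pinching monotonicity just derived, assuming numpy and scipy are available: for the pinching K(A) = (A + UAU*)/2 with U unitary and t ∈ (0,1), Tr(K(A)^{1−t}K(B)^t) ≥ Tr(A^{1−t}B^t) should hold for density matrices A, B. The dimension, the seed and the helper names are illustrative assumptions.

import numpy as np
from scipy.linalg import fractional_matrix_power as mpow

rng = np.random.default_rng(4)
def random_state(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def random_unitary(d):
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

d = 3
for _ in range(100):
    A, B, U, t = random_state(d), random_state(d), random_unitary(d), rng.uniform(0.05, 0.95)
    K = lambda X: (X + U @ X @ U.conj().T) / 2
    lhs = np.trace(mpow(K(A), 1 - t) @ mpow(K(B), t)).real
    rhs = np.trace(mpow(A, 1 - t) @ mpow(B, t)).real
    assert lhs >= rhs - 1e-9
print("pinching monotonicity of Tr(A^{1-t} B^t) verified on 100 random examples")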


Chapter 11

My Commentary on the Ashtavakra Gita

A couple of years ago, my colleague and friend presented me with a verse to


verse translation of the celebrated Ashtavakra Gita in English along with a
commentary on each verse. I was struck by the grandeur and the simplicity of
the Sanskrit used by Ashtavakra for explaining to King Janaka, the true meaning
of self­realization, how to attain it immediately without any difficulty and also
King Janaka’s reply to Ashtavakra about how he has attained immeasurable
joy and peace by understanding this profound truth. Ashtavakra was a sage
who lived during the time in which the incidents of the Ramayana took place,
and hence his Gita dates from much before even the Bhagavad Gita, spoken
by Lord Krishna to Arjuna, was written down by Sage Vyasa through Ganapati
the scribe. One can see in fact many of the slokas in the Bhagavad Gita have
been adapted from the Ashtavakra Gita and hence I feel that it is imperative
for any seeker of the absolute truth and everlasting peace, joy and tranquility
to read the Ashtavakra Gita, so relevant in our modern times where there are
so many disturbances. Unlike the Bhagavad Gita, Ashtavakra does not bring
in the concept of a personal God to explain true knowledge and the process
of self­realization. He simply says that you attain self­realization immediately
at this moment when you understand that the physical universe that you see
around you is illusory and that your soul is the entire universe. Every
living being’s soul is the same and it fills the entire universe and that all the
physical events that take place within our universe as perceived by our senses
are false, ie, they do not occur. In the language of quantum mechanics, these
events occur only because of the presence of the observer, namely the living
individuals with their senses. The absolute truth is that each individual is a
soul and not a body with his/her senses and this soul neither acts nor causes to
act. Knowledge is obtained immediately at the moment when we understand the
truth that the impressions we have from our physical experiences are illusory,
ie false and that the truth is that our soul fills the entire universe and that this


soul is indestructible and imperishable as even the Bhagavad Gita states. The
Bhagavad Gita however distinguishes between two kinds of souls, the jivatma
and the paramatma, the former belonging to the living entities in our universe
and the latter to the supreme personality of Godhead. The Bhagavad Gita
tells us that within each individual entity, both the jivatma and the paramatma
reside and when the individual becomes arrogant, he believes that his jivatma is
superior to the paramatma and this then results in self conflict owing to which
the individual suffers. By serving the paramatma with devotion as outlined in
the chapter on Bhakti yoga, the jivatma is able to befriend the paramatma and
then the individual lives in peace and tranquility. To explain self­knowledge
more clearly without bringing in the notion of a God, Ashtavakra tells Raja
Janaka that there are two persons who are doing exactly the same kind of work
every day both in their offices as well as in their family home. However, one
of them is very happy and tranquil while the other is depressed and is always
suffering. What is the reason for this ? The reason, Ashtavakra says is that the
second person believes that his self is doing these works and consequently the
reactions generated by his works affect him and make him either very happy
or very sad. He is confused and does not know how to escape from this knot
of physical experiences. The first person, on the other hand, knows very well
that he is compelled to act in the way he is acting owing to effects of his past
life’s karma’s and that in fact his soul is pure and has nothing at all to do with
the results and effects of his actions. He therefore neither rejoices nor does he
get depressed. The physical experiences that he goes through do not leave any
imprints on his self because he knows that the physical experiences are illusory
and that his soul, which is pure and without any blemish, fills the entire universe
and is the truth. Ashtavakra further clarifies his point by saying that the person
in ignorance will practice meditation and try to cultivate detachment from the
objects of this world thinking that this will give him peace. Momentarily such
a person will get joy, but when he comes out of the meditation, immediately all
the desires of this illusory physical world will come back to him causing even
more misery. He will try to run to a secluded place like a forest and meditate
to forget about his experiences, but again when his meditation is over, misery
will come back to him. However, once he realizes that he is actually not his
body but his soul and that the soul is free of any physical quality and that
it is the immeasurable truth, then he will neither try forcefully to cultivate
detachment nor will he try forcefully to acquire possessions. Both the extremes
of attachment and detachment will not affect him. In short, he attains peace
even living in this world immediately at this moment when true knowledge
comes to him. According to Ashtavakra, the seeker of attachment as well as
the seeker of detachment are both unhappy because they have not realized that
the mind must not be forced to perform an activity. Knowledge is already
present within the individual, he just has to remain in his worldly position as
it is, performing his duties with the full knowledge that this compulsion is due
to his past karmas and then simultaneously be conscious of the fact that his
soul is the perfect Brahman and it pervades the entire universe. At one point,
Ashtavakra tells Janaka that let even Hari (Vishnu) or Hara (Shiva) be your

guru, even then you cannot be happy unless you have realized at this moment
the falsity of the physical world around you and the reality of your soul as
that which pervades the entire universe. Such a person who has understood
this will, according to Ashtavakra, be unmoved if either death in some form
approaches him or temptations of various kinds appear before him. He will
remain neutral to both of these extremes. Such a realized soul has eliminated all
the thought processes that occur within his mind. When true knowledge dawns
upon a person, his mind will become empty and like a dry leaf in a strong wind,
he will appear to be blown from one place to other doing different kinds of
karma but always with the full understanding that these karmas have nothing
at all to do with his soul which is ever at bliss, imperturbable, imperishable
and omniscient. In the quantum theory of matter and fields, we learn about
how the observer perturbs a system in order to measure it and gain knowledge
about the system. When the system is thus perturbed, one measures not the
true system but only the perturbed system. This is the celebrated Heisenberg
uncertainty principle. Likewise, in Ashtavakra’s philosophy, if we attempt to
gain true knowledge about the universe and its living entities from our sensory
experience, we are in effect perturbing the universe and hence we would gain
only perturbed knowledge, not true knowledge. True knowledge comes only
with the instant realization that we are not these bodies, we are all permanent
souls. Even from a scientific standpoint, as we grow older, all our body cells
eventually get replaced by new ones; what is it, then, in our new body after
ten years that enables us to retain our identity? What scientific theory of
the DNA or RNA can explain this fact ? What is the difference between the
chemical composition of our body just before it dies and just after it dies ?
Absolutely no difference. Then what is it that distinguishes a dead body from
a living body ? That something which distinguishes the two must be the soul
which is not composed of any chemical. It must be an energy, the life force
which our modern science is incapable of explaining.
Chapter 12

LDP-UKF Based Controller

State equations:
x(n + 1) = f (x(n), u(n)) + w(n + 1)
Measurement equations:

z(n) = h(x(n), u(n)) + v(n)

w, v are independent white Gaussian processes with mean zero and covariances
Q, R respectively. xd (n) is the desired process to be tracked and it satisfies

xd (n + 1) = f (xd (n), u(n))

UKF equations: Let {ξ(k) : 1 ≤ k ≤ K} and {η(k) : 1 ≤ k ≤ K} be independent
iid N(0, I) random vectors. The UKF equations are

x̂(n + 1|n) = K^{-1} Σ_{k=1}^{K} f(x̂(n|n) + √P(n|n) ξ(k), u(n))

x̂(n + 1|n + 1) = x̂(n + 1|n) + K(n)(z(n + 1) − ẑ(n + 1|n))


where
K(n + 1) = Σ_XY Σ_YY^{-1}

where

Σ_XY = K^{-1} Σ_{k=1}^{K} [√P(n+1|n) η(k) (h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)) − ẑ(n+1|n))^T],

ẑ(n + 1|n) = K^{-1} Σ_{k=1}^{K} h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)),

Σ_YY = K^{-1} Σ_{k=1}^{K} (h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)) − ẑ(n+1|n))(h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)) − ẑ(n+1|n))^T


P(n + 1|n) = K^{-1} Σ_{k=1}^{K} [(f(x̂(n|n) + √P(n|n) ξ(k), u(n)) − x̂(n + 1|n))(f(x̂(n|n) + √P(n|n) ξ(k), u(n)) − x̂(n + 1|n))^T] + Q

P(n + 1|n + 1) = Σ_XX − Σ_XY Σ_YY^{-1} Σ_XY^T

where
Σ_XX = P(n + 1|n)
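
A compact numerical sketch of the filter equations above for a scalar model x(n+1) = f(x(n),u(n)) + w, z(n) = h(x(n),u(n)) + v. Following the text, the sigma points are K iid N(0,I) draws (a Monte Carlo variant of the UKF). The model functions f and h, the noise levels, the addition of R to Σ_YY, the use of the same draws η(k) in both factors of Σ_XY, and the positivity guard on P are illustrative assumptions made to obtain a runnable example.

import numpy as np

rng = np.random.default_rng(5)
f = lambda x, u: 0.9 * x + 0.2 * np.sin(x) + u
h = lambda x, u: x + 0.1 * x ** 2
Q, R, Ksig, T = 0.05, 0.1, 50, 100

x, xh, P, u = 0.0, 0.0, 1.0, 0.0
for n in range(T):
    # simulate truth and measurement
    x = f(x, u) + np.sqrt(Q) * rng.normal()
    z = h(x, u) + np.sqrt(R) * rng.normal()

    # predict
    xi = rng.normal(size=Ksig)
    fx = f(xh + np.sqrt(P) * xi, u)
    xpred = fx.mean()
    Ppred = np.mean((fx - xpred) ** 2) + Q

    # update
    eta = rng.normal(size=Ksig)
    hx = h(xpred + np.sqrt(Ppred) * eta, u)
    zpred = hx.mean()
    Sxy = np.mean(np.sqrt(Ppred) * eta * (hx - zpred))
    Syy = np.mean((hx - zpred) ** 2) + R        # measurement noise added here
    K = Sxy / Syy
    xh = xpred + K * (z - zpred)
    P = max(Ppred - K * Syy * K, 1e-6)          # guard against sampling-induced negativity

print(f"final truth x = {x:.3f}, estimate x_hat = {xh:.3f}, P = {P:.3f}")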
Tracking error feedback controller: Let
f (n) = xd (n) − x̂(n|n), f (n + 1|n) = xd (n + 1) − x̂(n + 1|n)

Controlled dynamics:
x(n + 1) = f (x(n), u(n)) + w(n + 1) + Kc f (n)

Aim: To choose the controller coefficient Kc so that the probability


P(max_{1≤n≤N}(|e(n)|^2 + |f(n)|^2) > ε)

is minimized where
e(n) = x(n) − x̂(n), e(n + 1|n) = x(n + 1) − x̂(n + 1|n), x̂(n) = x̂(n|n)

Linearized difference equations for e(n), f (n): (Approximate)


e(n+1|n) = x(n+1) − x̂(n+1|n)

= f(x(n), u(n)) + w(n+1) + K_c f(n) − K^{-1} Σ_k f(x̂(n) + √P(n|n) ξ(k), u(n))

= f(x(n), u(n)) − f(x̂(n), u(n)) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k) + w(n+1) + K_c f(n)

= f'(x̂(n), u(n)) e(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k) + w(n+1) + K_c f(n) − − − (1)

f(n+1|n) = x_d(n+1) − x̂(n+1|n) = f(x_d(n), u(n)) − K^{-1} Σ_k f(x̂(n) + √P(n|n) ξ(k), u(n))

= f'(x̂(n), u(n)) f(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k) − − − (2)

e(n + 1) = x(n + 1) − x̂(n + 1|n + 1)

= e(n+1|n) + K_c f(n+1|n) − K(n)(z(n+1) − ẑ(n+1|n))

= e(n+1|n) + K_c f(n+1|n) − K(n)(h(x(n+1), u(n+1)) + v(n+1) − K^{-1} Σ_k h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)))

= e(n+1|n) + K_c f(n+1|n) − K(n)(h'(x̂(n+1|n), u(n+1)) e(n+1|n) + v(n+1) − h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k))

= (I − K(n)h'(x̂(n+1|n), u(n+1))) e(n+1|n) + K_c f(n) − K(n)v(n+1) + K(n)h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k) − − − (3)

Finally,
f(n + 1) = x_d(n+1) − x̂(n+1|n+1) = f(n+1|n) − K(n)(z(n+1) − ẑ(n+1|n))

= f(n+1|n) − K(n)(h(x(n+1), u(n+1)) + v(n+1) − K^{-1} Σ_k h(x̂(n+1|n) + √P(n+1|n) η(k), u(n+1)))

= f(n+1|n) − K(n)(h'(x̂(n+1|n), u(n+1)) e(n+1|n) − h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k)) − K(n)v(n+1)

= f(n+1|n) − K(n)h'(x̂(n+1|n), u(n+1)) e(n+1|n) + K(n)h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k) − K(n)v(n+1) − − − (4)
Substituting for e(n + 1|n) and f (n + 1|n) from (1) and (2) respectively into (3)
and (4) results in the following recursion for e(n), f (n):
e(n + 1) =

f'(x̂(n), u(n)) f(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k)

− K(n)h'(x̂(n+1|n), u(n+1))(f'(x̂(n), u(n)) e(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k) + w(n+1) + K_c f(n))

+ K(n)h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k) − K(n)v(n+1) − − − (5)
and
f(n + 1) =

f'(x̂(n), u(n)) f(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k)

− K(n)h'(x̂(n+1|n), u(n+1)).(f'(x̂(n), u(n)) e(n) − f'(x̂(n), u(n)) √P(n|n) K^{-1} Σ_k ξ(k) + w(n+1) + K_c f(n))

+ K(n)h'(x̂(n+1|n), u(n+1)) √P(n+1|n) K^{-1} Σ_k η(k) − K(n)v(n+1) − − − (6)
If we regard the error processes e(n), f(n) and the noise processes w(n), v(n),
ξ_1(n) = K^{-1} Σ_{k=1}^{K} ξ(k) and η_1(n) = K^{-1} Σ_k η(k) to be all of the first order of
smallness in perturbation theory, then to the same order we can replace in (5)
and (6) the x̂(n) appearing on the rhs by x_d(n), and the error thus incurred

will be only of the second order of smallness. Doing so, we can express (5) and
(6), after equilibrium has been attained, in the form

e(n+1) = A_{11} e(n) + (A_{12} + A_{18}K_c) f(n) + A_{13} w(n+1) + A_{14}A_{15} v(n+1) + A_{16} ξ_1(n+1) + A_{17} η_1(n+1)

f(n+1) = A_{21} e(n) + (A_{22} + A_{28}K_c) f(n) + A_{23} w(n+1) + A_{24}A_{25} v(n+1) + A_{26} ξ_1(n+1) + A_{27} η_1(n+1)
Define the aggregate noise process

W [n] = [w(n)T , v(n)T , ξ1 (n)T , η1 (n)T ]T

It is a zero mean vector valued white Gaussian process and its covariance matrix
is block diagonal:
E(W(n)W(m)^T) = R_W(n)δ(n − m)

where
R_W(n) = E(W(n)W(n)^T) = diag[Q, R, P(n|n)/K, P(n+1|n)/K]
Let P denote the limiting equilibrium value of P(n|n) as n → ∞. The limiting
equilibrium value of P(n + 1|n) is then given approximately by
P_+ = f'(x̂, u)P f'(x̂, u)^T + Q
where x̂ may be replaced by the limiting value of x_d, assuming it to be asymptotically
constant (d.c.), and f'(x, u) is the Jacobian matrix of f(x, u) w.r.t x. We write
f'(x_d, u) = F and then
P_+ = F P F^T + Q
If we desire to use a more accurate value of P_+, then we must replace it by

P_+ = Cov(f(x_d + P.ξ_1)) + Q

where ξ1 is an N (0, K −1 P ) random vector. Now, let RW denote the limiting


equilibrium value of RW (n), ie,

RW = diag[Q, R, P/K, P+ /K]

Then, the above equations for

χ[n] = [e[n]T , f [n]T ]T

can be expressed as

χ[n + 1] = (G0 + G1 Kc G2 )χ[n] + G3 W [n + 1]

and its solution is given by


χ[n] = Σ_{k=0}^{n−1} (G_0 + G_1K_cG_2)^{n−1−k} G_3 W[k + 1]

This process is zero mean Gaussian with an autocorrelation given by

E(χ[n]χ[m]^T) = Σ_{k=0}^{min(n,m)−1} (G_0 + G_1K_cG_2)^{n−1−k} R_W ((G_0 + G_1K_cG_2)^{m−1−k})^T

= R_χ[n, m|K_c]
say. The LDP rate function of the process χ[.] over the time interval [0, N] is
then given by

I[χ[n] : 0 ≤ n ≤ N] = (1/2) Σ_{n,m=0}^{N} χ[n]^T Q_χ[n, m|K_c] χ[m]

where
Q_χ = ((R_χ[n, m|K_c]))^{-1}_{0≤n,m≤N}
According to the LDP for Gaussian processes, the probability that Σ_{n=0}^{N} ‖χ[n]‖^2
will exceed a threshold δ after we scale χ by a small number √ε is given
approximately by

exp(−ε^{-1} min(I(χ) : Σ_{n=0}^{N} ‖χ[n]‖^2 > δ))

and in order to minimize this deviation probability, we must make this minimum of
the rate function as large as possible; since I is the quadratic form built from
Q_χ = R_χ^{-1}, this amounts to choosing K_c so that the maximum eigenvalue λ_max(K_c)
of ((R_χ[n, m|K_c]))_{0≤n,m≤N} is a minimum.
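
A toy numerical sketch of this gain-selection rule for scalar e(n), f(n): build the covariance R_χ[n,m|K_c] from the linear recursion χ[n+1] = (G_0 + G_1K_cG_2)χ[n] + G_3W[n+1] and pick, on a grid, the K_c that minimizes the largest eigenvalue of the stacked covariance matrix. The matrices G_0,...,G_3, R_W, the horizon N and the grid are illustrative numbers, not derived from any particular model.

import numpy as np

G0 = np.array([[0.9, 0.3], [0.0, 0.8]])
G1 = np.array([[1.0], [1.0]])
G2 = np.array([[0.0, 1.0]])          # Kc multiplies the f-component
G3 = np.eye(2)
RW = np.diag([0.1, 0.05])
N = 20

def lam_max(Kc):
    G = G0 + Kc * (G1 @ G2)
    big = np.zeros((2 * (N + 1), 2 * (N + 1)))   # stacked R_chi[n,m] blocks
    for n in range(N + 1):
        for m in range(N + 1):
            R = sum(np.linalg.matrix_power(G, n - 1 - k) @ G3 @ RW @ G3.T
                    @ np.linalg.matrix_power(G, m - 1 - k).T
                    for k in range(min(n, m)))
            big[2 * n:2 * n + 2, 2 * m:2 * m + 2] = R
    return np.linalg.eigvalsh(big).max()

grid = np.linspace(-0.9, 0.9, 37)
best = min(grid, key=lam_max)
print("Kc minimizing lambda_max over the grid:", round(best, 3),
      " lambda_max =", round(lam_max(best), 4))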
Chapter 13

Abstracts for a Book on Quantum Signal Processing Using Quantum Field Theory

Chapter 1
An introduction to quantum electrodynamics
[1] The Maxwell and Dirac equations.
[2] Quantization of the electromagnetic field using Bosonic creation and an­
nihilation operators.
[3] Second Quantization of the Dirac field using Fermionic creation and an­
nihilation operators.
[4] The interaction term between the Dirac field and the Maxwell field.
[5] Propagators for the photons and electron­positron fields.
[7] The Dyson series and its application to calculation of amplitudes for
absorption, emission and scattering processes between electrons and positrons.
[8] Representing the terms in the Dyson series for interactions between pho­
tons, electrons and positrons using Feynman diagrams.
[9] The role of propagators in Feynman diagrammatic calculations.
[10] Quantum electrodynamics using the Feynman path integral method.
[11] Calculating the propagator of fields using Feynman path integrals.
[12] Electron self energy, vacuum polarization, Compton scattering and
anomalous magnetic moment of the electron.

Chapter 2
The design of quantum gates using quantum field theory
[1] Design of quantum gates using first quantized quantum mechanics by
controlled modulation of potentials.
[2] Design of quantum gates using harmonic oscillators with charges
perturbed by an electric field.


[3] Design of quantum gates using 3­D harmonic oscillators perturbed by


control electric and magnetic field.
[4] Representing the Klein­Gordon Hamiltonian with perturbation using
truncated spatial Fourier modes within a box.
[5] Design of the perturbation potential in Klein­Gordon quantization for
realizing very large sized quantum gates.
[6] Representing the Hamiltonian of the Dirac field plus electromagnetic field
in terms of photon and electron­positron creation and annihilation operators.
[7] Design of control classical electromagnetic fields and control classical
current sources in quantum electrodynamics for approximating very large sized
quantum unitary gates.
[8] The statistical performance of quantum gate design methods in the pres­
ence of classical noise.

Chapter 3
The quantum Boltzmann equation with applications to the design of quan­
tum noisy gates, ie, TPCP channels.
[1] The classical Boltzmann equation for a plasma of charged particles inter­
acting with an external electromagnetic field.
[2] Calculating the conductivity and permeability of a plasma interacting
with an electromagnetic field using first order perturbation theory.
[3] The quantum Boltzmann equation for the one particle density opera­
tor for a system of N identically charged particles interacting with a classical
electromagnetic field. The effect of quantum charge and current densities on
the dynamics of the Maxwell field and the effect of the Maxwell field on the
dynamics of the one particle density operator.
[4] The quantum nonlinear channel generated by the evolving the one particle
density operator under an external field in the quantum Boltzmann equation.

Chapter 4
The Hudson­Parthasarathy quantum stochastic calculus with application to
quantum gate and quantum TPCP channel design.
[1] Boson and Fermion Fock spaces.
[2] Creation, annihilation and conservation operator fields in Boson and
Fermion Fock spaces.
[3] Creation, annihilation and conservation processes in Boson Fock space.
[4] The Hudson­Parthasarathy quantum Ito formulae.
[5] Hudson­Parthasarathy­Schrodinger equation for describing the evolution
of a quantum system in the presence of bath noise.
[6] The GKSL (Lindblad) master equation for open quantum systems.
[7] Design of the Lindblad operators for an open quantum system for gener-
ation of a given TPCP map, ie, design of noisy quantum channels.
[8] Schrodinger's equation with Lindblad noise operators in the presence of
an external electromagnetic field. Design of the electromagnetic field so that
the evolution matches a given TPCP map in the least squares sense.

[9] Quantum Boltzmann equation in the presence of an external electromag­


netic field with Lindblad noise terms. Design of the Lindblad operators and
the electromagnetic field so that the evolution matches a given nonlinear TPCP
map.

Chapter 5
Non­Abelian matter and gauge fields applied to the design of quantum gates.
[1] Lie algebra of a Lie group.
[2] Gauge covariant derivatives.
[3] The non­Abelian matter and gauge field Lagrangian that is invariant
under local gauge transformations.
[4] Application of non-Abelian matter and gauge field theory to the design
of large sized quantum gates in the first quantized formalism; the gauge fields
are assumed to be classical, as are the control fields and currents.

Chapter 6
The performance of quantum gates in the presence of quantum noise.

Quantum plasma physics: For an N particle system with identical particles,

iρ_{,t} = [Σ_k H_k, ρ] + [Σ_{i<j} V_{ij}, ρ] + Lindblad noise terms,

where
H_k = −(∇_k − ieA(t, r_k))^2/2m + eΦ(t, r_k)
= H_{0k} + V_k
where

H_{0k} = −∇_k^2/2m, V_k = (ie/2m)(div A_k + 2(A_k, ∇_k)) + e^2A_k^2/2m + eΦ_k

Approximate one particle Boltzmann equation obtained by partial tracing over
the remaining N − 1 particles:

iρ_{1,t} = [H_1, ρ_1] + (N − 1)Tr_2[V_{12}, ρ_{12}] + Lindblad noise terms

iρ_{12,t} = [H_1 + H_2 + V_{12}, ρ_{12}] + (N − 2)Tr_3[V_{13} + V_{23}, ρ_{123}] + Lindblad noise terms


Writing
ρ12 = ρ1 ⊗ ρ1 + g12
we get from the above two equations,

i(ρ1,t ⊗ ρ1 + ρ1 ⊗ ρ1,t ) + ig12,t =

[H1 , ρ1 ] ⊗ ρ1 + ρ1 ⊗ [H1 , ρ1 ] + ig12,t =


[H1 , ρ1 ] ⊗ ρ1 + ρ1 ⊗ [H1 , ρ1 ] + [H1 + H2 + V12 , g12 ]

+ Lindblad noise terms.
Here, we have neglected the cubic terms, ie, the terms involving ρ123 . After
making the appropriate cancellations, we get

ig_{12,t} = [H_1 + H_2 + V_{12}, g_{12}] + Lindblad noise terms

Thus we get for the one particle density,

iρ_{1,t} = [H_1, ρ_1] + (N − 1)Tr_2[V_{12}, ρ_1 ⊗ ρ_1] + (N − 1)Tr_2[V_{12}, g_{12}]

+ Lindblad noise terms
We can express this approximate equation as

iρ1,t = [H1 , ρ1 ] + (N − 1)T r2 [V12 , ρ1 ⊗ ρ1 ]

+(N − 1)T r2 [V12 , exp(−itad(H1 + H2 + V12 ))(g12 (0))]


Since the nonlinear term is small, we can approximate the ρ1 (t) occurring in it
by exp(−itad(H1 ))(ρ1 (0)). Thus,

iρ1,t (t) = [H1 , ρ1 (t)] + (N − 1)T r2 [V12 , exp(−it.ad(H1 + H2 ))(ρ1 (0) ⊗ ρ1 (0))]

+(N − 1)T r2 [V12 , exp(−itad(H1 + H2 + V12 ))(g12 (0))] + θ(ρ1 (t))


where θ(ρ_1) is the Lindblad noise term. The solution is

ρ_1(t) = exp(−it.ad(H_1))(ρ_1(0))

+ (N − 1)∫_0^t exp(−i(t−s).ad(H_1))(Tr_2[V_{12}, exp(−is.ad(H_1 + H_2))(ρ_1(0) ⊗ ρ_1(0))])ds

+ (N − 1)∫_0^t exp(−i(t−s).ad(H_1))(Tr_2[V_{12}, exp(−is.ad(H_1 + H_2 + V_{12}))(g_{12}(0))])ds

+ Lindblad noise terms.
The first term is the initial condition term, the second term is the collision term
and the third term is the scattering term. Note that taking the trace gives us

i∂_t Tr(ρ_1(t)) = 0

which means that the above quantum Boltzmann equation as it evolves the one
particle state, determines a nonlinear trace preserving mapping. This suggests
that the quantum Boltzmann equation can be used to design nonlinear TPCP
maps by controlling the electromagnetic potentials A, Φ as well as the Lindblad
terms.
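
As a finite-dimensional illustration of this trace-preservation remark, the following sketch evolves a small density matrix under dρ/dt = −i[H, ρ] + D(ρ) with a Lindblad dissipator D(ρ) = LρL* − (1/2){L*L, ρ} and checks that Tr ρ(t) stays equal to 1; the nonlinear collision term of the quantum Boltzmann equation is omitted, and the 4-level H and L are arbitrary illustrative choices.

import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(6)
d = 4
H = rng.normal(size=(d, d)); H = (H + H.T) / 2          # Hermitian system Hamiltonian
L = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho0 = np.eye(d) / d

def rhs(t, y):
    rho = y.reshape(d, d)
    comm = -1j * (H @ rho - rho @ H)
    diss = L @ rho @ L.conj().T - 0.5 * (L.conj().T @ L @ rho + rho @ L.conj().T @ L)
    return (comm + diss).ravel()

sol = solve_ivp(rhs, (0, 2.0), rho0.astype(complex).ravel(), rtol=1e-8, atol=1e-10)
rho_t = sol.y[:, -1].reshape(d, d)
print("Tr rho(t) =", np.trace(rho_t).real)     # stays 1 up to integration error
print("min eigenvalue:", np.linalg.eigvalsh((rho_t + rho_t.conj().T) / 2).min())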
Now let ρ_1(t, r, r') denote the position space representation of the one particle
density operator. Then the charge density corresponding to this operator is
given by eρ1 (t, r, r) and the current density is

J(t, r) = (e/m)∇_{r'} Im(ρ_1(t, r, r'))|_{r'=r} + (e/m)A(t, r)ρ_1(t, r, r)



This is based on the fact that if ψ(t, r) denotes a pure state in the posi­
tion representation, ie, a wave function, then the mixed state is an average
of ψ(t, r)ψ(t, r' )∗ over ensembles of such pure states, ie,

ρ1 (t, r, r' ) =< ψ(t, r)ψ(t, r' )∗ >

In particular, the charge density is

e < |ψ(t, r)|2 >= eρ1 (t, r, r)

Likewise the current density for a pure state is

J(t, r) = (ie/2m)(ψ(t, r)*(∇ − ieA(t, r))ψ(t, r) − ψ(t, r)(∇ + ieA(t, r))ψ(t, r)*)

= (e/m)A(t, r)|ψ(t, r)|^2 − (e/m)Im(ψ(t, r)*∇ψ(t, r))

which on taking the average gives

J(t, r) = (e/m)A(t, r)ρ_1(t, r, r) − (e/m)∇_{r'} Im(ρ_1(t, r', r))|_{r'=r}

= (e/m)A(t, r)ρ_1(t, r, r) + (e/m)∇_{r'} Im(ρ_1(t, r, r'))|_{r'=r}


The classical em field then satisfies the Maxwell equations

□A(t, r) = µ_0 J(t, r) = µ_0((e/m)A(t, r)ρ_1(t, r, r) − (e/m)∇_{r'} Im(ρ_1(t, r', r))|_{r'=r})

□Φ(t, r) = ρ(t, r)/ε_0 = ε_0^{-1}(eρ_1(t, r, r))

where
□ = c^{-2}∂_t^2 − ∇^2
is the wave operator. The complete system of the quantum Boltzmann or rather
quantum Vlasov (quantum for the density operator and classical for the em field)
is therefore summarized below:

H_1 = −(∇_1 + ieA(t, r_1))^2/2m + eΦ(t, r_1)

V_{12} = V(|r_1 − r_2|)

ρ_1(t) = exp(−it.ad(H_1))(ρ_1(0))

+ (N − 1)∫_0^t exp(−i(t−s).ad(H_1))(Tr_2[V_{12}, exp(−is.ad(H_1 + H_2))(ρ_1(0) ⊗ ρ_1(0))])ds

+ (N − 1)∫_0^t exp(−i(t−s).ad(H_1))(Tr_2[V_{12}, exp(−is.ad(H_1 + H_2 + V_{12}))(g_{12}(0))])ds

+ Lindblad noise terms,

□A(t, r) = µ_0((e/m)A(t, r)ρ_1(t, r, r) − (e/m)∇_{r'} Im(ρ_1(t, r', r))|_{r'=r}),

□Φ(t, r) = ε_0^{-1}(eρ_1(t, r, r))

Second quantization of the quantum plasma equations.


Consider the Dirac field of electrons and positrons. The wave function ψ(t, r)
is a Fermionic field operator satisfying the Dirac equation

[γ µ (i∂µ + eAµ ) − m]ψ(t, r) = 0

The unperturbed field ψ0 (t, r), ie, when Aµ = 0 can be expanded as a superposi­
tion of annihilation and creation operator fields in momentum space of electrons
and positrons. This is the case when we second quantize the motion of a sin­
gle electron­positron pair. When however we second quantize the motion of N
identical electron­positron pairs and then form the marginal density operator of
a single electron­positron pair, we obtain the quantum Boltzmann equation for
the one particle density operator field ρ_1(t, r, r') which, in the absence of mu-
tual interactions V(r_1 − r_2), is simply ψ_0(t, r)ψ_0(t, r')*, which is expressible as
a quadratic combination of Fermionic creation and annihilation operator fields
in momentum space. The single particle Dirac Hamiltonian is

H_D = (α, −i∇ + eA) + βm − eΦ

which in the absence of electromagnetic interactions reads

H_{D0} = (α, −i∇) + βm

The second quantized density operator field in position space representation


that corresponds to this unperturbed Dirac Hamiltonian is given by

ρ_0(t, r, r') = ψ_0(t, r)ψ_0(t, r')*


where

ψ_0(t, r) = Σ_σ ∫ [a(P, σ)u(P, σ)exp(−i(E(P)t − P.r)) + b(P, σ)*v(P, σ)exp(i(E(P)t − P.r))]d^3P

with u(P, σ) and v(−P, σ) being eigenvectors of H_{D0}(P) = (α, P) + βm with
eigenvalues ±E(P) respectively, where E(P) = √(P^2 + m^2). Thus, we can write,
after discretization in the momentum domain,

ρ_{100}(t, r, r') = Σ_{k,m} [a(k)a(m)f_1(t, r, r'|k, m) + a(k)a(m)*f_2(t, r, r'|k, m)] + H.c.

where a(k), a(k)* are respectively Fermionic annihilation and creation operators
satisfying the canonical anticommutation relations
[a(k), a(m)∗ ]+ = δ(k, m), [a(k), a(m)]+ = 0 = [a(k)∗ , a(m)∗ ]+
After interaction with the quantum em field, the density operator ρ_{10}(t, r, r')
satisfies the equation

i∂t ρ10 (t) = [HD0 + Hint (t), ρ10 (t)]



where
Hint (t) = e(α, A(t, r)) − eΦ(t, r)
where A(t, r), Φ(t, r) are now both Bosonic quantum operator fields expressible
upto zeroth order approximations as linear combinations of Bosonic annihila­
tion and creation operators c(k), c(k)∗ , k = 1, 2, ... that satisfy the canonical
commutation relations:

[c(k), c(m)∗ ] = δ(k, m), [c(k), c(m)] = [c(k)∗ , c(m)∗ ] = 0

We have upto first order perturbation theory,

ρ10 = ρ100 + ρ101

where
i∂t ρ101 (t) = [HD0 , ρ101 (t)] + [Hint (t), ρ100 (t)]
This has the solution
ρ_{101}(t) = ∫_0^t exp(−i(t − s).ad(H_{D0}))([H_{int}(s), ρ_{100}(s)])ds

It is clear that this expression for ρ101 (t) is a trilinear combination of the pho­
tonic creation and annihilation operators and the electron­positron creation and
annihilation operators such that it is linear in the former and quadratic in the
latter. Thus, we can write approximately

ρ_{101}(t, r, r') = Σ_{klm} [c(k)a(l)a(m)g_1(t, r, r'|klm) + c(k)*a(l)a(m)g_2(t, r, r'|klm)
+ c(k)a(l)*a(m)g_3(t, r, r'|klm) + c(k)*a(l)*a(m)g_4(t, r, r'|klm)] + H.c.


After taking into account self interactions of the plasma of electrons and positrons,
the one particle density operator field ρ_1(t, r, r') satisfies the quantum Boltzmann
equation

iρ_{1,t}(t, r_1, r_1') = ∫(H_D(t, r_1, r_1'')ρ_1(t, r_1'', r_1') − ρ_1(t, r_1, r_1'')H_D(t, r_1'', r_1'))d^3r_1''

+ (N − 1)ρ_{10}(t, r_1, r_1') ∫(V(r_1, r_2) − V(r_1', r_2))ρ_{10}(t, r_2, r_2)d^3r_2

+ (N − 1)∫(V(r_1, r_2) − V(r_1', r_2))g_{12}(t, r_1, r_2|r_1', r_2)d^3r_2 − − − (a)

where g_{12}(t, r_1, r_2|r_1', r_2') is the position space kernel of the two particle correlation
g_{12} = ρ_{12} − ρ_1 ⊗ ρ_1 introduced earlier. Note that in perturbation theory,

ρ_1 = ρ_{10} + ρ_{11}

where ρ_{10}, as calculated above, satisfies (a) with V = 0.


Chapter 14

Quantum Neural Networks and Learning Using Mixed State Dynamics for Open Quantum Systems

Let ρ(t) be the state of an evolving quantum system according to the dynamics

iρ'(t) = [H_0 + V(t), ρ(t)] + θ(ρ(t))

where V(t) is the control potential and θ(ρ) is the Lindblad noise term. Our aim
is to design this control potential V(t) so that the probability density ρ(t, r, r)
in the position representation tracks a given probability density function p(t, r)
and the quantum current (e/m)∇_{r'} Im(ρ(t, r, r'))|_{r'=r} tracks a given current
density J(t, r). By analogy with pure state dynamics and fluid dynamical sys-
tems, the corresponding problem is formulated as follows. Consider the wave
function ψ(t, r) that satisfies Schrodinger's equation for a charged particle in an
electromagnetic field:

i∂_t ψ(t, r) = (−(∇ + ieA(t, r))^2/2 − eΦ(t, r) + W(t, r)V(t, r))ψ(t, r)

The aim is to control the neural weight W(t, r) so that p_q(t, r) = |ψ(t, r)|^2
tracks the fluid density ρ(t, r), normalized so that ∫ρ(t, r)d^3r = 1, and
further that the probability current density

J_q(t, r) = Im(ψ(t, r)∇ψ(t, r)*) + A(t, r)|ψ(t, r)|^2

tracks a given current density J(t, r) of the fluid. Note that we are assuming
that p, J satisfy the equation of continuity:

∂_t ρ + div(J) = 0


This is an effective QNN algorithm for simulating the motion of a fluid because
the corresponding quantum mechanical probability and current density pairs
(pq , Jq ) automatically by virtue of Schrodinger’s equation satisfy the equation
of continuity and simultaneously guarantee that pq is a probability density, ie,
it is non­negative and integrates over space to unity. Thus rather than using a
QNN for tracking only the density of a fluid or a probability density, we can in
addition use it to track also the fluid current density by adjusting the control
weights W (t, r). Note that for open quantum systems, ie, in the presence of
Lindblad noise terms, the equation of continuity for the QNN does not hold.
This means that when bath noise interacts with our QNN, we would not get an
accurate simulation of the fluid mass and current density. However, if there are
sources of mass generation in the fluid so that the equation of continuity reads

∂t ρ(t, r) + divJ(t, r) = g(t, r)

then we can use the Lindblad operator coefficient terms as additional control
parameters to simulate ρ, J for such a system with mass generation g > 0 or
mass destruction g < 0.

|J(t, r) − J_q(t, r)|^2 = |Im(ψ(t, r)∇ψ(t, r)*) + A(t, r)|ψ(t, r)|^2 − J(t, r)|^2

is the space-time instantaneous norm square error between the true fluid current
density and that predicted by the QNN. Its gradient w.r.t A(t, r), holding ψ fixed, is

δ|J(t, r) − J_q(t, r)|^2/δA(t, r) = −2|ψ(t, r)|^2(J(t, r) − J_q(t, r))

Note that this is a vector. This means that for tracking the current density, the
update equation for the control vector potential must be

∂_t A(t, r) = −µδ|J(t, r) − J_q(t, r)|^2/δA(t, r) = 2µ|ψ(t, r)|^2(J(t, r) − J_q(t, r))


We can also derive a corresponding update equation for the scalar potential in
Schrodinger’s equation based on matching the true mass density of the fluid to
that predicted by the QNN as follows.

|p(t, r) − pq (t, r)|2 = |p(t, r) − |ψ(t, r)|2 |2

so
∂t Φ(t, r) = −µδ|p(t, r) − |ψ(t, r)|2 |2 /δΦ(t, r) =
2µ[ψ̄(t, r)δψ(t, r)/δΦ(t, r) + ψ(t, r)δψ̄(t, r)/δΦ(t, r)][p(t, r) − |ψ(t, r)|2 ]
Now using the Dyson series expansion,
ψ(t, r) = U_0(t)ψ(0, r) + Σ_{n≥1} (−i)^n ∫_{0<t_n<...<t_1<t} U_0(t−t_1)V(t_1)U_0(t_1−t_2)V(t_2)...U_0(t_{n−1}−t_n)V(t_n)U_0(t_n)ψ(0, r)dt_1...dt_n

giving

δψ(t, r)/δV(t, r)

= V(t, r) Σ_{n≥1} (−i)^n ∫_{0<t_n<...<t_2<t} U_0(t−t_2)V(t_2)U_0(t_2−t_3)...U_0(t_{n−1}−t_n)ψ(0, r)dt_2...dt_n

= −iV(t, r)(ψ(t, r) − ψ(0, r))


Taking the conjugate of this equation gives us
δψ̄(t, r)/δV(t, r) = iV(t, r)(ψ̄(t, r) − ψ̄(0, r))

Thus the update equation for Φ(t, r) = V (t, r)/e becomes

∂t Φ(t, r) = 2µ.Φ(t, r)Im(ψ(t, r) − ψ(0, r)).(p(t, r) − |ψ(t, r)|2 )

Note that here, we can more generally consider the error energy to be a weighted
linear combination of the error energies between the fluid mass density and its
QNN estimate and between the fluid current density and its QNN estimate:

E = w_1|p(t, r) − |ψ(t, r)|^2|^2 + w_2|J(t, r) − Im(ψ(t, r)∇ψ(t, r)*) − A(t, r)|ψ(t, r)|^2|^2

While calculating the updates for the vector and scalar potentials using this
energy function, we must take into account the dependence of ψ(t, r) and ψ̄(t, r)
on both A(t, r) and Φ(t, r). For this we write the Hamiltonian as

H(t) = H_0 + eV_1(t) + e^2V_2(t)

where

H_0 = −∇^2/2, V_1(t) = Φ(t, r) + (1/2)(div A(t, r) + 2(A(t, r), ∇)), V_2(t) = A(t, r)^2/2

Then in terms of variational derivatives, from the Dyson series expansion,

δψ(t, r)/δA(t, r) = e((i/2)∇ψ(t, r) − i∇ψ(t, r)) = −(ie/2)∇ψ(t, r)

and taking the conjugate,

δψ̄(t, r)/δA(t, r) = (ie/2)∇ψ̄(t, r)

so we get

δ|ψ(t, r)|^2/δA(t, r) = (−ie/2)(ψ̄(t, r)∇ψ(t, r) − ψ(t, r)∇ψ̄(t, r)) = e.Im(ψ̄(t, r)∇ψ(t, r))

Exercise: Using the above formulas, evaluate

δE/δA(t, r), δE/δΦ(t, r)
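
A 1-D toy sketch of the density-tracking idea discussed above: evolve ψ by split-step Fourier under H = −(1/2)d^2/dx^2 + Φ(x) and, after each step, nudge Φ by a steepest-descent surrogate proportional to (p_target − |ψ|^2). The update rule is the simplest gradient-type substitute, not the variational formula derived above; the grid sizes, the step µ, the Gaussian target and the initial packet are assumptions, and no claim of convergence is made.

import numpy as np

Nx, Lbox, dt, mu, steps = 256, 20.0, 0.005, 5.0, 4000
x = np.linspace(-Lbox / 2, Lbox / 2, Nx, endpoint=False)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(Nx, d=dx)

p_target = np.exp(-((x - 1.0) ** 2))           # unnormalized Gaussian target density
p_target /= p_target.sum() * dx

psi = np.exp(-x ** 2).astype(complex)          # initial wave packet
psi /= np.sqrt((np.abs(psi) ** 2).sum() * dx)
Phi = np.zeros(Nx)

kinetic = np.exp(-0.5j * dt * k ** 2)
for n in range(steps):
    psi = np.fft.ifft(kinetic * np.fft.fft(psi))   # kinetic part of the split step
    psi *= np.exp(-1j * dt * Phi)                  # potential part
    pq = np.abs(psi) ** 2
    Phi -= mu * dt * (p_target - pq)               # crude adaptive potential update

err = np.sum(np.abs(p_target - np.abs(psi) ** 2)) * dx
print("final L1 density mismatch:", round(err, 4))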



QNN in the presence of Hudson­Parthasarathy noise:


Let |ψ(t) > be a pure state in the Hilbert space h ⊗ Γ_s(L^2(R_+)) evolving
according to the Hudson-Parthasarathy noisy Schrodinger equation, where h =
L^2(R^3):

d|ψ(t) >= (−(iH + LL*/2)dt + LdA − L*dA*)|ψ(t) >

where H, L are system operators. Specifically, choose

H = −∇^2/2 + V(t, r), L = (a, ∇) + (b, r)

where a, b are constant complex 3-vectors. For a specific representation of the
creation and annihilation processes A(t)*, A(t), we choose an onb |e_n > for
L^2(R_+) and then write

A(t) = Σ_n a_n < χ_[0,t] |e_n >, A(t)* = Σ_n a_n* < e_n |χ_[0,t] >

where an are operators in a Hilbert space satisfying the CCR

[an , am ] = 0, [an , a∗m ] = δ[n − m], [a∗n , a∗m ] = 0

Then an easy calculation shows that

[A(t), A(s)*] = Σ_n < χ_[0,t] |e_n >< e_n |χ_[0,s] > = < χ_[0,t] |χ_[0,s] > = min(t, s)

from which we easily deduce the quantum Ito formula of Hudson and Parthasarathy.
We can also introduce a conservation process Λ(t),

Λ(t) = Σ_{n,m} a_n* a_m < e_n |χ_[0,t] |e_m >

and deduce that

[A(t), Λ(s)] = Σ_{n,m} a_m < χ_[0,t] |e_n >< e_n |χ_[0,s] |e_m >

= Σ_m a_m < χ_[0,min(t,s)] |e_m > = A(min(t, s))

and taking the adjoint of this equation then gives us

[Λ(s), A(t)∗ ] = A(min(t, s))∗

These equations yield the other quantum Ito formulas involving the products
of the creation and annihilation operator differentials with the conservation
operator differentials. Now suppose that we assume that our wave function on
the tensor product of the system and bath space has the form

|ψ(t) >= |ψs (t) > ⊗|φ(u) >



where |φ(u) > is the bath coherent state and |ψ_s(t) > is the system state.
Substituting this into the Hudson-Parthasarathy noisy Schrodinger equation
and using
dA(t)|φ(u) >= dt.u(t)|φ(u) >
dA(t)*|φ(u) >= (dB(t) − u(t)dt)|φ(u) >
where
B(t) = A(t) + A(t)*
is a classical Brownian motion process (since the bath is in the coherent
state, it should be regarded as a Brownian motion with drift), we find that for this
special class of pure states on the system and bath space, the HPS equation assumes
the form of a stochastic Schrodinger equation

d|ψ_s(t) >= (−(iH + LL*/2)dt + u(t)L dt − (dB(t) − u(t)dt)L*)|ψ_s(t) >

= ((−iH − LL*/2 + u(t)(L + L*))dt − L*dB(t))|ψ_s(t) >


This stochastic Schrodinger equation is parametrized by the coherent state pa­
rameter function u ∈ L2 (R+ ). We can now use this stochastic Schrodinger equa­
tion to track a given probability density p(t, r) using control on all H, L, u(t).
This would thus define a QNN in the Hudson­Parthasarathy­Stochastic­Schrodinger
framework. The idea is that the Brownian noise is undesired but one cannot prevent
it from entering the picture, and hence one must track the given pdf in spite of its
presence.
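
A qubit-sized Euler-Maruyama simulation of the stochastic Schrodinger equation obtained above, dψ = ((−iH − LL*/2 + u(t)(L + L*))dt − L*dB)ψ, with H and L taken as 2×2 matrices and a constant coherent-state amplitude u; all numerical values are illustrative assumptions, and since this form of the SSE is not norm-preserving, the final norm is simply reported.

import numpy as np

rng = np.random.default_rng(7)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = 0.5 * sz
L = 0.3 * (sx - 1j * sz)          # an arbitrary non-Hermitian coupling operator
u = 0.2
dt, T = 1e-4, 2.0
nsteps = int(T / dt)

psi = np.array([1.0, 0.0], dtype=complex)
drift = -1j * H - 0.5 * (L @ L.conj().T) + u * (L + L.conj().T)
for _ in range(nsteps):
    dB = np.sqrt(dt) * rng.normal()
    psi = psi + (drift @ psi) * dt - (L.conj().T @ psi) * dB

print("norm of psi(T) (not conserved by this unnormalized SSE):",
      round(np.linalg.norm(psi), 4))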

Using quantum electrodynamics (QED) to track joint probability densities


in a QNN.
The Lagrangian of (QED) is given by the sum of a Maxwell term, a Dirac
term and an interaction term:

L = L_M + L_D + L_int =

(−1/4)F_{μν}F^{μν} + ψ̄(γ^μ(i∂_μ + eA_μ) − m)ψ

= (−1/4)F_{μν}F^{μν} + ψ̄(iγ^μ∂_μ − m)ψ + eψ̄γ^μψ.A_μ

Thus we identify

L_M = (−1/4)F_{μν}F^{μν}, L_D = ψ̄(iγ^μ∂_μ − m)ψ, L_int = eψ̄γ^μψ.A_μ
To proceed further, we write down the Schrodinger equation corresponding to
this second quantized system as

iU � (t) = H(t)U (t)

where H(t) is the second quantized Hamiltonian of the system expressed in terms
of the creation and annihilation operator fields of the photons and Fermions. To

get at this Hamiltonian, we must first calculate to a given order of perturbation,


the Dirac wave field and the Maxwell wave field in terms of these Bosonic and
Fermionic creation and annihilation operators. Let us as an example, derive
these expressions upto second order of perturbation. The Maxwell and Dirac
equations derived from the Lagrangian are

∂_α∂^αA^μ = eψ*α^μψ,

(iγ^μ∂_μ − m)ψ = −eγ^μψ.A_μ


where we have applied the Lorentz gauge condition

∂ µ Aµ = 0

in deriving the former. In the absence of interactions, we can express the solu­
tions as �
ψ0 (x) = (a(k)fk (x) + a(k)∗ gk (x)),
k

µ
A (x) = c(k)φµk (x) + c(k)∗ φ¯µk (x)
k

where c(k), c(k)* satisfy the Bosonic CCR while a(k), a(k)* satisfy the Fermionic

CAR. Writing

ψ = ψ0 + ψ1 + ψ2 , Aµ = Aµ0 + Aµ1 + Aµ2 ,

we get
a.Zeroth order perturbation equations.

(iγ µ ∂µ − m)ψ0 = 0, ∂α ∂ α Aµ0 = 0,

b.First order perturbation equations.

(iγ µ ∂µ − m)ψ1 = −eγ µ ψ0 Aµ0 ,

∂α ∂ α Aµ1 = eψ0∗ αµ ψ0

c.Second order perturbation equations.

(iγ^μ∂_μ − m)ψ_2 = −eγ^μ(ψ_0A_{μ1} + ψ_1A_{μ0})

∂α ∂ α Aµ2 = e(ψ0∗ αµ ψ1 + ψ1∗ αµ ψ0 )


In general, writing the nth order perturbation series

A^μ = Σ_{k=0}^{n} A_k^μ, ψ = Σ_{k=0}^{n} ψ_k,

we obtain the nth order perturbation equations as

(iγ^μ∂_μ − m)ψ_n = −eγ^μ Σ_{k=0}^{n−1} ψ_k A_{μ,n−1−k},

∂_α∂^α A_n^μ = e Σ_{k=0}^{n−1} ψ_k*α^μψ_{n−1−k}
Chapter 15

Quantum Gate Design, Magnet Motion, Quantum Filtering, Wigner Distributions
15.1 Designing a quantum gate using QED with
a control c­number four potential
Consider the interaction Lagrangian between the em four potential and the
Dirac current:

L_int = eψ̄γ^μψ.A_μ

The corresponding contribution of this interaction Lagrangian to the interaction
Hamiltonian is its negative (in view of the Legendre transformation) integrated
over space:

H_int = −∫L_int d^3x = −e∫ψ̄γ^μψ.A_μ d^3x

The contribution of this term upto second order in the em potential to the
scattering amplitude between two states |p_1, σ_1, p_2, σ_2 > and |p_1', σ_1', p_2', σ_2' >, in
which there are two electrons with the specified four momenta and z-components
of the spins, or equivalently two positrons, or one electron and one positron, is
given by

A_2(p_1', σ_1', p_2', σ_2'|p_1, σ_1, p_2, σ_2)

= −e^2 ∫∫ < p_1', σ_1', p_2', σ_2'|T(H_int(x_1)H_int(x_2))|p_1, σ_1, p_2, σ_2 > d^4x_1 d^4x_2

Let u(p, σ) denote the free particle Dirac wave function in the momentum­spin
domain. Thus it satisfies
(γ µ pµ − m)u(p, σ) = 0, σ = 1, 2


Taking the adjoint of this equation after premultiplying by γ^0 and making use
of the self-adjointness of the matrices γ^0γ^μ gives us

ū(p, σ)(γ^μp_μ − m) = 0

where
ū(p, σ) = u(p, σ)*γ^0
Now observe that for one particle momentum-spin states of an electron,

a(p', σ')|p, σ >= δ(σ', σ)δ^3(p' − p)|0 >

and therefore,
ψ(x)|p, σ >_e = exp(−ip.x)u(p, σ)|0 >
and for a one particle positron state

ψ(x)*|p, σ >_p = exp(ip.x)v(p, σ)*|0 >

or equivalently,

ψ̄(x)|p, σ >_p = ψ(x)*γ^0|p, σ >_p = exp(ip.x)v̄(p, σ)|0 >

Thus one of the terms in A_2(p_1', σ_1', p_2', σ_2'|p_1, σ_1, p_2, σ_2) for two electron momentum
state transitions is given by

−e^2 ∫ < p_1', σ_1', p_2', σ_2'|(ψ̄(x_1) ⊗ ψ̄(x_2))(γ^μ ⊗ γ^ν)(ψ(x_1) ⊗ ψ(x_2))|p_1, σ_1, p_2, σ_2 > D_{μν}(x_1 − x_2)d^4x_1 d^4x_2

= ∫ D_{μν}(x_1 − x_2)(ū(p_1', σ_1') ⊗ ū(p_2', σ_2'))(γ^μ ⊗ γ^ν)(u(p_1, σ_1) ⊗ u(p_2, σ_2))exp(−i(p_1 − p_1').x_1 − i(p_2 − p_2').x_2)d^4x_1 d^4x_2

= D̂_{μν}(p_1 − p_1')δ^4(p_1 + p_2 − p_1' − p_2').(ū(p_1', σ_1') ⊗ ū(p_2', σ_2'))(γ^μ ⊗ γ^ν)(u(p_1, σ_1) ⊗ u(p_2, σ_2))

This evaluates to

[(ū(p_1', σ_1')γ^μu(p_1, σ_1)).(ū(p_2', σ_2')γ_μu(p_2, σ_2))/(p_1' − p_1)^2]δ^4(p_1 + p_2 − p_1' − p_2')

Likewise another term having a different structure also comes from the same
expression for A_2 and that is

[(ū(p_2', σ_2')γ^μu(p_1, σ_1)).(ū(p_1', σ_1')γ_μu(p_2, σ_2))/(p_2' − p_1)^2]δ^4(p_1 + p_2 − p_1' − p_2')

The total amplitude is a superposition of these two amplitudes. Here, Dµν (x)
is the photon propagator and Dˆ µν (p) is its four dimensional space­time Fourier
transform and is given by ηµν /p2 where p2 = pµ pµ . This propagator is defined
by
Dµν (x1 − x2 ) =< 0|T (Aµ (x1 )Aν (x2 ))|0 >

Now we consider a situation when apart from the quantum em field, there is
also an external classical control em field described by the four potential Acµ .
Then in the nth order term in the Dyson series for the scattering of M electrons,
there will be a term of the form

∫ < p′k , σ′k , k = 1, 2, ..., M |T (Hint (x1 )...Hint (xn ))|pk , σk , k = 1, 2, ..., M > d4 x1 ...d4 xn

where
Hint (x) = eψ̄(x)γ µ ψ(x)(Aµ (x) + Acµ (x))
Here, Aµ is the four potential of the quantum electromagnetic field while Acµ (x)
is the four potential of the classical c­number control electromagnetic four po­
tential. Expanding these, it is clear that this term will be a sum of terms of the
form

< p′k , σ′k , k = 1, 2, ..., M |T (Πnk=1 ψ̄(xk )γ µk ψ(xk ))|pk , σk , k = 1, 2, ..., M >

. < 0|T (Aµ1 (x1 )...Aµr (xr ))|0 > Πnm=r+1 Acµm (xm )d4 x1 ...d4 xn

15.2 On the motion of a magnet through a pipe


with a coil under gravity and the reaction
force exerted by the induced current through
the coil on the magnet

Problem formulation
Assume that the coil is wound around a tube with n0 turns per unit length
extending from a height of h to h + b. The tube is cylindrical and extends from
z = 0 to z = d. Taking relativity into account via the retarded potentials, if
m(t) denotes the magnetic moment of the magnet, then the vector potential
generated by it is approximately obtained from the retarded potential formula
as
A(t, r) = µm(t − r/c) × r/4πr3 − − − (1)
where we assume that the magnet is located near the origin. This approximates
to
A(t, r) = µm(t) × r/4πr3 − µm� (t) × r/4πcr2 − − − (2)
Now Assume that the magnet falls through the central axis with its axis always
being oriented along the −ẑ direction. Then, after time t, let its z coordinate
be ξ(t). We have the equation of motion

M ξ �� (t) = F (t) − M g − − − (3)


212 Stochastic Processes in Classical and Quantum Physics and Engineering
222CHAPTER 15. QUANTUM GATE DESIGN, MAGNET MOTION,QUANTUM FILTE

where F (t) is the z component of the force exerted by the magnetic field gener­
ated by the current induced in the coil upon the magnet. Now, the flux of the
magnetic field generated by the magnet through the coil is given by

Φ(t) = n0 ∫_{X²+Y²≤R_T², h≤Z≤h+b} Bz (t, X, Y, Z)dXdY dZ − − − (4)

where
Bz (t, X, Y, Z) = ẑ.B(t, X, Y, Z) − − − (5a)
B(t, X, Y, Z) = curlA(t, R − ξ(t)ẑ) − − − (5b)
where
R = (X, Y, Z)
and where RT is the radius of the tube. The current through the coil is then

I(t) = Φ� (t)/R0 − − − (6)

where R0 is the coil resistance. The force of the magnetic field generated by
this current upon the falling magnet can now be computed easily. To do this,
we first compute the magnetic field Bc (t, X, Y, Z) generated by the coil using
the usual retarded potential formula

Bc (t, X, Y, Z) = curlAc (t, X, Y, Z) − − − (7a)


where
Ac (t, X, Y, Z) = (µ/4π) ∫_0^{2n0 π} (I(t − |R − RT (x̂.cos(φ) + ŷ.sin(φ)) − ẑ(h + bφ/2n0 π)|/c).

.(−x̂.sin(φ) + ŷ.cos(φ))/|R − RT (x̂.cos(φ) + ŷ.sin(φ)) − ẑ(h + bφ/2n0 π)|)RT dφ − − − (7b)
Note that Ac (t, X, Y, Z) is a function of ξ(t) and also of m(t) and m� (t). It follows
that I(t) is a function of ξ(t), ξ � (t), m(t), m� (t), m�� (t). Then we calculate the
interaction energy between the magnetic fields generated by the magnet and by
the coil as

Uint (ξ(t), ξ′(t), t) = (2µ)−1 ∫ (Bc (t, X, Y, Z), B(t, X, Y, Z))dXdY dZ − − − (8)

and then we get for the force exerted by the coil’s magnetic field upon the falling
magnet as
F (t) = −∂Uint (ξ(t), ξ � (t), t)/∂ξ(t) − − − (9)
Remark: Ideally, since the interaction energy between the two magnetic fields
appears in the Lagrangian ∫ (E 2 /c2 − B 2 )d3 R/2µ, we should write down the
Euler­Lagrange equations using the Lagrangian

L(t, ξ(t), ξ′(t)) = M ξ′(t)2 /2 − Uint (ξ(t), ξ′(t), t) − M gξ(t) − −(10)



in the form
(d/dt)∂L/∂ξ′ − ∂L/∂ξ = 0 − − − (11)
which yields

M ξ′′(t) − (d/dt)∂Uint /∂ξ′ + ∂Uint /∂ξ + M g = 0 − −(12)

3. Numerical simulations
Define the force function as in (9):

F (ξ(t), ξ′(t)) = −∂Uint (ξ(t), ξ′(t))/∂ξ(t)

Neglecting the term involving the gradient of Uint w.r.t ξ′ (this is justified
because the dominant reaction force of the coil field on the magnet is due to the
spatial gradient of the dipole interaction energy), we can discretize (3) with a
time step of δ to get

M (ξ[n + 1] − 2ξ[n] + ξ[n − 1])/δ 2 + Fint (ξ[n], (ξ[n] − ξ[n − 1])/δ) + M g = 0

or equivalently in recursive form,

ξ[n + 1] = 2ξ[n] − ξ[n − 1] − (δ 2 /M )Fint (ξ[n], (ξ[n] − ξ[n − 1])/δ) − δ 2 g
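A minimal sketch of this recursion is given below; the force routine F_int used here is a hypothetical stand-in (a localized eddy-current style braking term) for the quantity obtained from (8)-(9), and all numerical constants are illustrative only.

import numpy as np

M, g, delta = 0.05, 9.81, 1e-3          # magnet mass [kg], gravity, time step (assumed values)
N = 5000

def F_int(xi, xi_dot):
    # Hypothetical stand-in for the coil reaction force of equation (9): a braking force
    # localized near the coil (h <= z <= h+b); a real computation would evaluate B_c there.
    k, z0, width = 5.0, 0.5, 0.05
    return k * xi_dot * np.exp(-((xi - z0) / width) ** 2)

xi = np.zeros(N)
xi[0] = xi[1] = 1.0                      # released from rest at height d = 1 m
for n in range(1, N - 1):
    v = (xi[n] - xi[n-1]) / delta
    xi[n+1] = 2*xi[n] - xi[n-1] - (delta**2 / M) * F_int(xi[n], v) - delta**2 * g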

Fig.1 shows plot of ξ[n] obtained using a ”for loop”. The current through the
coil as a function of time has also been plotted in fig.2 using the formula

I(t) = R0−1 (d/dt)(n0 ∫_{X²+Y²≤R_T², h≤Z≤h+b} Bz (t, X, Y, Z)dXdY dZ)

= I(t|ξ(t), ξ′(t))
where

Bz (t, X, Y, Z) = ẑ.curlA(t, X, Y, Z − ξ(t)) = ∂Ay (X, Y, Z)/∂X − ∂Ax (X, Y, Z)/∂Y

with
A(X, Y, Z) = µ0 .m × R/4πR3 , R = (X, Y, Z)
Here relativistic corrections have been neglected. The formula for the interaction
energy between the magnet and the field generated by the coil discussed in the
previous section can be simplified after making appropriate approximations to
−m.Bc = mBcz where Bcz , the z­component of the reaction magnetic field
produced by the coil is evaluated at the site of the magnet, ie, at (0, 0, ξ(t)).
Thus, the formula for the force of the field on the coil is approximated as

Fint (ξ, ξ � ) = Fint (ξ) = −m∂Bcz (t, 0, 0, ξ|ξ, ξ � )/∂ξ



The coil magnetic field is evaluated using the approximate non­relativistic Biot­
Savart formula
Bc (t, X, Y, Z|ξ, ξ′) = (µ0 I(t|ξ, ξ′)/4π) ∫_0^{2n0 π} RT dφ(−x̂.sin(φ) + ŷ.cos(φ))×

((X − RT .cos(φ))x̂ + (Y − RT .sin(φ))ŷ + (Z − h − bφ/2n0 π)ẑ)/[(X − RT .cos(φ))2 + (Y − RT .sin(φ))2 + (Z − h − bφ/2n0 π)2 ]3/2
These approximations are shown to produce excellent agreement with hardware
experiments as a comparison of figs.() shows.
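The φ-integral in the Biot­Savart expression above is straightforward to evaluate by quadrature; the sketch below does this for a single field point, with purely illustrative coil parameters (RT, h, b, n0 are assumptions, not values from the text).

import numpy as np

def B_coil(R, I, RT=0.02, h=0.4, b=0.1, n0=500, mu0=4e-7*np.pi, nphi=20000):
    """Biot-Savart field of the helical coil at the point R=(X,Y,Z), following the
    quadrature written above; parameter values are illustrative only."""
    X, Y, Z = R
    phi = np.linspace(0.0, 2*np.pi*n0, nphi)
    tx, ty = -np.sin(phi), np.cos(phi)                 # wire tangent direction
    dx = X - RT*np.cos(phi)
    dy = Y - RT*np.sin(phi)
    dz = Z - h - b*phi/(2*np.pi*n0)
    r3 = (dx**2 + dy**2 + dz**2)**1.5
    # components of (tangent x displacement)/|displacement|^3
    cx = (ty*dz) / r3
    cy = (-tx*dz) / r3
    cz = (tx*dy - ty*dx) / r3
    pref = mu0*I*RT/(4*np.pi)
    return pref*np.trapz(np.array([cx, cy, cz]), phi, axis=1)

print(B_coil((0.0, 0.0, 0.45), I=1.0))   # field on the axis, inside the winding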

Taking into account relativistic corrections: The magnetic field generated


by the coil has the form Bc (t, X, Y, Z|ξ(t), ξ � (t)). More precisely, if I0 (t) is
the current of the electrons within the magnet, the magnetic vector potential
generated by this current is given by

A(t, R) = (µ/4π) ∫_magnet I0 (t − |R − ξ(t) − R′|/c)dR′/|R − ξ(t) − R′|

In this expression, ξ(t) is the position vector of the centre of the magnet at time
t and R� is the position on the magnet electron loop relative to its centre. Since
the magnet current is a constant in the case of a permanent magnet, the above
simplifies to

A(t, R) = (µI0 /4π) ∫_magnet dR′/|R − ξ(t) − R′|

and hence by making the infinitesimal loop approximation, this becomes

A(t, R) = (µ/4π)m × (R − ξ(t))/|R − ξ(t)|3


In other words, relativistic corrections to the magnetic field produced by the
magnet will arise only if the magnet is an electromagnet so that the current
through it varies with time. However, the current through the coil wound on
the tube surface varies with time and the magnetic field produced by it will have
to be calculated using the retarded potential. Hence relativistic corrections will
arise in this case.
Estimating the magnetic moment m of the magnet using the following scheme:
The magnetic field through the coil and hence the magnetic flux and hence the
induced current in the coil are all functions of m. We’ve already noted that the
induced current can be expressed in the form given by equations (2b),(4),(5)
and (6). We write this as

I(t) = I(m, ξ(t), ξ � (t))

in the special case m is a constant. It is also clear from the above formulas that
for given ξ(t), ξ � (t), the coil current is a linear function of m. This is because

the magnetic field produced by the magnet is a linear function of m. Hence, we


can write
I(t) = P (ξ(t), ξ � (t))m
where P (ξ(t), ξ � (t)) is an easily calculable function. Thus, if we take measure­
ments of the coil current and the displacement of the magnet at a sequence of
times t1 < t2 < .. < tN , then m may be estimated by the least squares method:
m̂ = argmin_m Σ_{k=1}^N (I(tk ) − P (ξ(tk ), (ξ(tk+1 ) − ξ(tk ))/(tk+1 − tk ))m)2

Carrying out this minimization results in the estimate

m̂ = Σ_k Ik Pk / Σ_k Pk²

where
Ik = I(tk ), Pk = P (ξ(tk ), (ξ(tk+1 ) − ξ(tk ))/(tk+1 − tk ))
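In code the estimator is a one-liner; the sketch below assumes the measured currents, the sampling instants and the regressor function P are already available (P here is a user-supplied function, not derived in the snippet).

import numpy as np

def estimate_moment(I_meas, t, xi, P):
    """Least squares estimate of m from coil-current measurements I(t_k)."""
    xi_dot = np.diff(xi) / np.diff(t)                       # (xi(t_{k+1}) - xi(t_k))/(t_{k+1} - t_k)
    Pk = np.array([P(xi[k], xi_dot[k]) for k in range(len(t) - 1)])
    Ik = np.asarray(I_meas)[:len(Pk)]
    return float(np.sum(Ik * Pk) / np.sum(Pk**2))           # m_hat = sum(I_k P_k)/sum(P_k^2)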
The estimates of m obtained in this way for different measurements are shown
in Table(.).
Remark: m is the z component of the magnetic moment which equals in mag­
nitude the total magnetic moment since the magnet is assumed to be oriented
along the z direction. The flux through the coil can be expressed as

Φ(t) = m.(µn0 /4π) ∫_{X²+Y²≤R_T², h≤Z≤h+b} (curl(ẑ × (R − ξ(t)ẑ)/|R − ξ(t)ẑ|3 )).ẑ dXdY dZ

so this provides the immediate identification

P (ξ(t), ξ′(t)) =

(µn0 /4πR0 )(d/dt) ∫_{X²+Y²≤R_T², h≤Z≤h+b} (curl(ẑ × (R − ξ(t)ẑ)/|R − ξ(t)ẑ|3 )).ẑ dXdY dZ

Quantization of the motion of a magnet in the vicinity of the coil. The


magnetic dipole is replaced by a constrained sea of Dirac electrons and positrons
within the boundary of the magnet. This Dirac field can be expressed in the
form
ψ(x) = Σ_k (a(k)fk (x) + a(k)∗ gk (x))

where the functions fk (x), gk (x) are eigen functions of the Dirac equation with
boundary conditions determined by the vanishing of the Dirac wave function
on the boundary of the magnet. These are four component wave functions. In
other words, they satisfy

(iγ µ ∂µ − m)(fk (x), gk (x)) = 0



for all x within the bounding walls of the magnet and they vanish on the bound­
ary, or rather the charge density corresponding to these functions as well as the
normal component of the spatial current density vanishes on the boundary. The
magnetic dipole moment of the magnet after second quantization is given by

m(t) = ∫_M ψ(x)∗ (eJ/2m)ψ(x)d3 x

where
J = L + gσ/2
is the total orbital plus spin angular momentum of an electron taking the g
factor into account, normally we put g = 1. It is a conserved vector operator
for the Dirac Hamiltonian. The a(k), a(k)∗ are the annihilation and creation
operators of the electrons and positrons. They satisfy the CAR

[a(k), a(m)∗ ]+ = δ(k, m)

It is easily noted that m(t) is a quadratic function of the electron­positron


creation and annihilation operators.
The magnetic field produced by this magnet at the site of the coil is given
by the usual formula
B(t, R) = (µ/4π)curl(m(t) × (R − ξ(t))/|R − ξ(t)|3 )
When integrated over the cross­section of the coil and differentiated w.r.t time
and then divided by the coil resistance, this gives the induced current as a
function of the position ξ(t) and velocity ξ � (t) of the magnet and this expression
for the current is a quadratic form in the Fermion creation and annihilation
operators. The magnetic field produced by the coil current at the site of the
magnet can be computed and hence the force exerted by the coil’s magnetic field
on the magnet obtained by taking the dot product of m(t) with this magnetic
field followed by the spatial gradient will be a fourth degree polynomial in
the Fermionic creation and annihilation operators. The Heisenberg equation of
motion of the coil will therefore have the form
ξ �� (t) = F (t, ξ(t), ξ � (t), {a(k), a(k)∗ }) − g
where g is the acceleration due to gravity and the rhs is a homogeneous fourth
degree polynomial in a, a∗ . We can thus express this operator equation as

ξ′′(t) = Σ F1 (t, ξ(t), ξ′(t)|kmpq)a(k)a(m)a(p)a(q) + Σ F2 (t, ξ(t), ξ′(t)|kmpq)a(k)∗ a(m)a(p)a(q)

+ Σ F3 (t, ξ(t), ξ′(t)|kmpq)a(k)∗ a(m)∗ a(p)a(q) + Σ F4 (t, ξ(t), ξ′(t)|km)a(k)∗ a(m) + ... (sixteen terms in all)
These equations are solved approximately using perturbation theory as follows.
We express the solution as
ξ(t) = d − gt2 /2 + δξ(t) = ξ0 (t) + δξ(t), ξ0 (t) = d − gt2 /2
where δξ(t) satisfies the first order perturbed equation

δξ �� (t) = F (t, ξ0 (t), ξ0� (t), a, a∗ )



giving
δξ(t) = ∫_0^t (t − s)F (s, ξ0 (s), ξ0′(s), a, a∗ )ds

Statistical properties of the quantum fluctuations δξ(t) of the magnet’s position


in a coherent state of the Fermions within the magnet can now be evaluated.
For example if |ψ(γ) > is such a coherent state, then

a(k)|ψ(γ) >= γ(k)|ψ(γ) >

a(k)∗ |ψ(γ) >= ∂/∂γ|ψ(γ) >


where γ(k), γ(k)∗ are Grassmannian numbers, ie, Fermionic parameters. They
are assumed to all mutually anticommute, ie,

[γ(k), γ(l)]+ = 0 = [γ(k)∗ , γ(l)∗ ]+ ,

[γ(k), γ(l)∗ ]+ = 0
and further they anticommute with the a(m), a(m)∗ . To check the consistency
of these operations, we have

γ(m)a(k)|ψ(γ) >= γ(m)γ(k)|ψ(γ) >

on the one hand and on the other,

γ(k)γ(m)|ψ(γ) >= −γ(m)γ(k)|ψ(γ) >= −γ(m)a(k)|ψ(γ) >

in agreement with
γ(m)γ(k) = −γ(k)γ(m)
Further,
a(k)a(m)∗ |ψ(γ) >= (δ(k, m) − a(m)∗ a(k))|ψ(γ) >=
(δ(k, m) − a(m)∗ γ(k))|ψ(γ) >= (δ(k, m) + γ(k)a(m)∗ )|ψ(γ) >
while on the other hand,

a(m)∗ a(k)|ψ(γ) >= a(m)∗ γ(k)|ψ(γ) >

= −γ(k)a(m)∗ |ψ(γ) >


which is a consistent picture.
Let us for example take a single CAR pair a, a∗ so that a2 = a∗2 = 0, aa∗ +
a∗ a = 1. Let |0 > be the vacuum state and |1 > the single particle state, so
that
a|0 >= 0, a∗ |1 >= 0, a∗ |0 >= |1 >, a|1 >= |0 >
Then, the number operator N = a∗ a obviously satisfies

N |0 >= 0, N |1 >= |1 >



Suppose we assume that

|ψ(γ) >= (c1 γ + c2 γ ∗ + c3 )|0 > +(c4 γ + c5 γ ∗ + c6 )|1 >

Then since aγ = −γa, aγ ∗ = −γ ∗ a, we get

a|ψ(γ) >= (−c4 γ − c5 γ ∗ + c6 )|0 >

and for this to equal γ|ψ(γ) >, we require that c2 = c5 = c6 = 0 and c3 = −c4 .
In other words,
|ψ(γ) >= (c1 γ + c3 )|0 > −c3 γ|1 >
We then get
a∗ |ψ(γ) >= (−c1 γ + c3 )|1 >
Note that we require that
γγ ∗ + γ ∗ γ = 1
We have
∂/∂γ|ψ(γ) >= c1 |0 > −c3 |1 >

which cannot equal a∗ |ψ(γ) >. So to get agreement with this equation also, we
must suppose that

|ψ(γ) >= (c1 γ + c2 γ ∗ + c3 γ ∗ γ + c4 )|0 > +(d1 γ + d2 γ ∗ + d3 γ ∗ γ + d4 )|1 >

Note that by hypothesis,


γ ∗ γ + γγ ∗ = 0
Then,
a|ψ(γ) >= (−d1 γ − d2 γ ∗ + d3 γ ∗ γ + d4 )|0 >
γ|ψ(γ) >= (c2 γγ ∗ + c4 γ)|0 > +(d2 γγ ∗ + d4 γ)|1 >
For equality of the two expressions, we must thus assume that

d2 = d4 = 0, d3 = −c2 , d1 = −c4

This gives

|ψ(γ) >= (c1 γ + c2 γ ∗ + c3 γ ∗ γ + c4 )|0 > +(−c4 γ − c2 γ ∗ γ)|1 >

Now we also require that

a∗ |ψ(γ) >= −∂|ψ(γ) > /∂γ

and this gives

−(c1 − c3 γ ∗ )|0 > +(c4 − c2 γ ∗ )|1 >= (−c1 γ − c2 γ ∗ + c3 γ ∗ γ + c4 )|1 >

Thus,
c1 = c3 = 0

and we get
|ψ(γ) >= (c2 γ ∗ + c4 )|0 > +(−c4 γ − c2 γ ∗ γ)|1 >
As a check, we compute

a|ψ(γ) >= (c4 γ − c2 γ ∗ γ)|0 >,

γ|ψ(γ) >= (c2 γγ ∗ + c4 γ)|0 >,


a∗ |ψ(γ) >= (−c2 γ ∗ + c4 )|1 >,
∂|ψ(γ) > /∂γ = (−c4 + c2 γ ∗ )|1 >

Now suppose that we have a term for example as a(k)∗ a(m)a(n)∗ a(r) and
we wish to evaluate its expectation value in the coherent state |ψ(γ) >. It will
be given by
< ψ(γ)|a(k)∗ a(m)a(n)∗ a(r)|ψ(γ) >=
−γ(r) < ψ(γ)|a(k)∗ a(m)a(n)∗ |ψ(γ) >
= −γ(r)γ(k)∗ < ψ(γ)|a(m)a(n)∗ |ψ(γ) >=
= −γ(r)γ(k)∗ < ψ(γ)|δ(m, n) − a(n)∗ a(m)|ψ(γ) >
= −δ(m, n)γ(r)γ(k)∗ + γ(r)γ(k)∗ γ(n)∗ γ(m)
The problem is how to interpret this in terms of real numbers ?

15.3 Quantum filtering theory applied to quan­


tum field theory for tracking a given prob­
ability density function
Consider for example the Klein­Gordon field φ(x) in the presence of an external
potential V (φ) whose dynamics after it gets coupled to the electron­positron
Dirac field is given by the Lagrangian
L = ψ̄(iγ µ ∂µ − eφ − me )ψ + (1/2)∂µ φ∂ µ φ − m_k² φ2 /2 − V (φ)

= Le + Lkg + Lint
where Le , the electronic Lagrangian, is given by
Le = ψ̄(iγ µ ∂µ − me )ψ,

Lkg , the KG Lagrangian, is given by

Lkg = (1/2)∂µ φ.∂ µ φ − m2k φ2 /2 − V (φ)



and Lint , the interaction Lagrangian between the KG field and the electron field
is given by
Lint = −eψ̄ψ.φ
Note that we could also include interaction terms of the form given by the real
and imaginary parts of
ψ̄γ µ1 ...γ µk ψ.∂µ1 ...∂µk φ
The Schrodinger evolution equation for such a system of quantum fields is ob­
tained by introducing Fermionic and Bosonic creation and annihilation operators
to describe the non­interacting fields in the presence of zero potential V , then
applying perturbation theory to approximately solve for the interacting fields
in the presence of the potential V , then use this approximate solution to write
down an expression for the total second quantized field Hamiltonian in terms of
the creation and annihilation operators.
The electron­positron field interacting with a KG field and the photon field:

L = (−1/4)Fµν F µν + ψ̄(γ µ (i∂µ + e1 Aµ ) + e2 φ − m1 )ψ

+(1/2)∂µ φ.∂ µ φ − m2 φ2 /2
The interaction Hamiltonian derived from this expression is

Hint (t) = e1 ∫ ψ̄(x)γ µ ψ(x)Aµ (x)d3 x + e2 ∫ ψ̄(x)ψ(x)φ(x)d3 x

One of the terms for the scattering amplitude of two electrons in the Dyson
series formulation is

e1² < p′1 , p′2 | ∫ T (ψ̄(x1 )γ µ ψ(x1 )ψ̄(x2 )γ ν ψ(x2 ))DAµν (x1 − x2 )d4 x1 d4 x2 |p1 , p2 >

Another term is

e2² < p′1 , p′2 | ∫ T (ψ̄(x1 )ψ(x1 )ψ̄(x2 )ψ(x2 ))Dφ (x1 − x2 )d4 x1 d4 x2 |p1 , p2 >

Yet another term is obtained from



e1² e2² ∫ ψ̄(x1 )γ µ ψ(x1 )ψ̄(x2 )γ ν ψ(x2 )ψ̄(x3 )ψ(x3 )ψ̄(x4 )ψ(x4 )DAµν (x1 − x2 )Dφ (x3 − x4 )dx1 dx2 dx3 dx4

This last term leads to a term of the form



e1² e2² ∫ < p′1 , p′2 |ψ̄(x4 )Se (x4 − x1 )γ µ Se (x1 − x2 )γ ν ψ(x2 )ψ̄(x3 )ψ(x3 )|p1 , p2 > DAµν (x1 − x2 )Dφ (x3 − x4 )dx1 dx2 dx3 dx4

This term can be evaluated easily by expanding in terms of matrix elements or


equivalently by using the Kronecker tensor product:

e1² e2² ∫ < p′1 , p′2 |(ψ̄(x4 )⊗ψ̄(x3 ))(Se (x4 − x1 )γ µ Se (x1 − x2 )⊗I)(ψ(x2 )⊗ψ(x3 ))|p1 , p2 >

.DAµν (x1 − x2 )Dφ (x3 − x4 )dx1 dx2 dx3 dx4

One of the terms in the scattering amplitude of two electrons in the presence
of an external electromagnetic potential is

A(p′1 , p′2 |p1 , p2 )

= < p′1 , p′2 | ∫ Acµ (x1 )T (ψ̄(x1 )γ µ T (ψ(x1 )ψ̄(x2 ))γ ν ψ(x2 )ψ̄(x3 )γ ρ ψ(x3 ))Dνρ (x2 − x3 )dx1 dx2 dx3 |p1 , p2 >

= ∫ Acµ (x1 ) < p′1 , p′2 |T (ψ̄(x1 )γ µ Se (x1 − x2 )γ ν ψ(x2 )ψ̄(x3 )γ ρ ψ(x3 ))|p1 , p2 > Dνρ (x2 − x3 )dx1 dx2 dx3

This can be expressed as



∫ Acµ (x1 ) < p′1 , p′2 |T ((ψ̄(x1 )⊗ψ̄(x3 ))(γ µ Se (x1 − x2 )γ ν ⊗γ ρ )(ψ(x2 )⊗ψ(x3 )))|p1 , p2 > Dνρ (x2 − x3 )dx1 dx2 dx3

which evaluates to

∫ Acµ (x1 )exp(i(p′1 .x1 + p′2 .x3 − p1 .x2 − p2 .x3 ))ū(p′1 )γ µ Se (x1 − x2 )γ ν u(p1 ).ū(p′2 )γ ρ u(p2 )Dνρ (x2 − x3 )dx1 dx2 dx3

By the change of variables

x1 − x2 = x, x2 − x3 = y

and by expressing Acµ (x1 ) in terms of its momentum domain Fourier integral,
this integral can be expressed as

15.4 Lecture on the role of statistical signal pro­


cessing in general relativity and quantum
mechanics
[1] Using the extended Kalman filter to calculate the estimate of the fluid ve­
locity field when the fluid dynamical equations are described by the general
relativisitic versions of the Navier­Stokes equation and the equation of continu­
ity and external noise is present. The fluid equations are

((ρ + p)v µ v ν − pg µν ):ν = f µ



where f µ is a random field. From these equations, we can derive

((ρ + p)v ν ):ν v µ + (ρ + p)v ν v µ:ν − p,µ = f µ

which, on using vµ v µ = 1, results in

((ρ + p)v ν ):ν − p,µ v µ − fµ v µ = 0

This represents the general relativistic equation of continuity and on substituting


this into the previous equation, we get

(ρ + p)v ν v µ:ν − p,µ − f µ + v µ (p,ν v ν + fν v ν ) = 0

This equation is the general relativistic version of the Navier­Stokes equation.


This system of equations can be expressed as a system of four differential equa­
tions for v r , r = 1, 2, 3, ρ assuming that the equation of state p = p(ρ) is known.
Then, the measurement process consists of measuring the velocity field v r at
a set of N discrete spatial points, taking into account measurement noise and
then applying the EKF to estimate the fields v r , ρ at all spatial points on a real
time basis.
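A generic discrete-time EKF step of the kind referred to here might look as follows; this is only a sketch, in which f, h and their Jacobians stand for the discretized fluid dynamics and the point-velocity measurement map and are assumed to be supplied by the user.

import numpy as np

def ekf_step(x, P, y, f, F_jac, h, H_jac, Q, R):
    """One predict/update cycle of the extended Kalman filter.
    x, P   : current state estimate (e.g. stacked v^r, rho on the spatial grid) and covariance
    y      : measurement vector (noisy velocities at N discrete spatial points)
    f, h   : state transition and measurement maps; F_jac, H_jac their Jacobians
    Q, R   : process and measurement noise covariances (assumed known)."""
    # predict
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # update
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new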

Applications of quantum statistical signal processing to quantum fluid dy­


namics.
Consider the Navier­Stokes equation along with the equation of continuity
in the form

v,t (t, r) = F1 (t, r, v(t, r), ∇v(t, r), ρ(t, r), ∇ρ(t, r)) + w(t, r)

ρ,t (t, r) = F2 (t, r, v(t, r), ∇ρ(t, r), ∇v(t, r))


Denoting the four component time varying vector field [ρ(t, r), v(t, r)T ]T
by ξ(t, r), these equations can be expressed in the form

ξ,t (t, r) = F(t, r, ξ(t, r), ∇ξ(t, r)) + w(t, r)

where w is the noise field. Now to quantize this system of equations, there are
two available routes. The first is to introduce an auxiliary four vector field η(t, r)
which acts as a Lagrange multiplier and write down the Lagrangian density for
the pair of fields ξ, η as

L(ξ, η, ξ,t , ∇ξ, ∇η) = η(t, r)T (ξ,t (t, r) − F (t, r, ξ(t, r), ∇ξ(t, r)))

The equations of motion of η are derived from

δξ L = 0

which result in

η,t (t, r) + η(t, r)T (∂F/∂ξ) − div(η(t, r)T ∂F/∂∇ξ) = 0



Now we can express the Hamiltonian density of this system of two fields as
follows. The Hamiltonian density of this system of fields is first computed: The
canonical momenta are
Pη = ∂L/∂η,t = 0
Pξ = ∂L/∂ξ,t = η
H = ηξ,t − L
It is impossible to bring this to Hamiltonian form owing to the strange nature
of the constraints involved. Indeed, to bring this to Hamiltonian form, ξ,t
must be expressed in terms of the canonical momenta, ξ, η and their spatial
gradients which is not possible. Therefore, we add a regularizing term to give
the Lagrangian density the following form:

L = η T (ξ,t − F ) + ε.ξ,t² /2 + µ.η,t² /2

The canonical momenta now are

Pξ = ∂L/∂ξ,t = η + ε.ξ,t ,

Pη = ∂L/∂η,t = µη,t
Thus,
H = Pξ .ξ,t + Pη .η,t − L =

15.5 Problems in classical and quantum stochas­


tic calculus
[1] Let P : 0 = t0 < t1 < ... < tn = T be a partition of [0, T ]. Let B be standard
one dimensional Brownian motion. Let f be adapted to the Brownian filtration
and bounded, continuous over [0, T ]. Let Q : 0 = s0 < s1 < ... < sm = T be
another partition of [0, T ]. Define
I(P, f ) = Σ_{k=0}^{n−1} f (tk )(B(tk+1 ) − B(tk ))

Show that if R = P ∪ Q, then

E(I(P, f ) − I(P ∪ Q, f ))2 ≤ T.maxk suptk <s<tk+1 E(f (s) − f (tk ))2

and that this quantity converges to zero as |P | → 0. Deduce using the inequality

E(I(P, f ) − I(Q, f ))2 ≤ 2E(I(P, f ) − I(P ∪ Q, f ))2 + 2E(I(Q, f ) − I(P ∪ Q, f ))2


224 Stochastic Processes in Classical and Quantum Physics and Engineering
234CHAPTER 15. QUANTUM GATE DESIGN, MAGNET MOTION,QUANTUM FILTE

that as |P |, |Q| → 0, we get

E(I(P, f ) − I(Q, f ))2 → 0

and hence that

lim_{|P|→0} I(P, f ) = ∫_0^T f (t)dB(t)
exists.
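The statement can also be checked by simulation: for an adapted integrand such as f(t) = B(t), the Riemann­Ito sums over a partition and its refinement converge to each other in mean square as the mesh shrinks. The following Monte Carlo sketch (not part of the original problem) illustrates this.

import numpy as np

rng = np.random.default_rng(0)
T, n_paths = 1.0, 20000

def ito_sum(B, idx):
    """I(P, f) with f(t) = B(t): sum of B(t_k)(B(t_{k+1}) - B(t_k)) over the partition idx."""
    Bp = B[:, idx]
    return np.sum(Bp[:, :-1] * np.diff(Bp, axis=1), axis=1)

# Brownian paths on a fine grid of 4096 steps
t = np.linspace(0, T, 4097)
dB = rng.normal(0, np.sqrt(np.diff(t)), size=(n_paths, len(t) - 1))
B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

for n, m in [(8, 16), (32, 64), (128, 256)]:
    iP = np.linspace(0, 4096, n + 1).astype(int)      # partition P with n intervals
    iQ = np.linspace(0, 4096, m + 1).astype(int)      # refinement Q = P union Q
    err = np.mean((ito_sum(B, iP) - ito_sum(B, iQ)) ** 2)
    print(n, m, err)                                  # decreases roughly like T/n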

[2] Derive the quantum Fokker­Planck or Master equation or GKSL equation


from Hamiltonian quantum mechanics along the following lines: Let H(t) =
H0 + V (t) be the Hamiltonian where H0 is non­random and V (t) is an operator
valued Hermitian stochastic process with dt2 .E(V (t) ⊗ V (t)) being of O(dt)
just as in Ito’s formula for Brownian motion. Here V (t) is to be regarded as
operator valued white noise so that V (t)dt becomes an operator valued Brownian
differential. Then consider the Schrodinger equation

iρ� (t) = [H0 + V (t), ρ(t)] = [H(t), ρ(t)]

Show that this implies approximately for small τ that


i(ρ(t + τ ) − ρ(t)) = ∫_t^{t+τ} [H(s), ρ(s)]ds = ∫_t^{t+τ} [H(s), ρ(t) + ρ′(t)(s − t)]ds + O(τ 3 )

= [H(t), ρ(t)]τ − i[H(t), [H(t), ρ(t)]]τ 2 /2


which gives on taking average over the probability distribution of V (t),

ρ(t + τ ) − ρ(t) = −i[H0 , ρ(t)]τ − (τ 2 /2) < [V (t), [V (t), ρ(t)]] >

= −i[H0 , ρ(t)]τ − (τ 2 /2)(< V (t)2 ρ(t) + ρ(t)V (t)2 − 2V (t)ρ(t)V (t) >)
Now, write
V (t) = Σ_k ξ(k)Lk

where ξ(k) are iid r.v’s with zero mean and variance 1/τ in accordance with the
assumption that V (t) is operator valued white noise. The Lk 's are assumed to be
self­adjoint operators in the Hilbert space. Then show that the above equation
gives

ρ′(t) = −i[H0 , ρ(t)] − (1/2) Σ_k (Lk² ρ(t) + ρ(t)Lk² − 2Lk ρ(t)Lk )

which is a special case of the master equation for open quantum systems.
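The averaging argument can be verified numerically for a single qubit: evolving ρ with random Hermitian kicks of variance 1/τ per step and averaging over realizations reproduces the Lindblad-type equation above. The two-level example and the choice L = σ_x below are assumptions made only for illustration.

import numpy as np

rng = np.random.default_rng(1)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H0, L = 0.5 * sz, sx                                  # assumed two-level example
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
T, dt, n_real = 2.0, 0.01, 2000
steps = int(T / dt)

# (1) average of random-Hamiltonian evolutions
rho_avg = np.zeros_like(rho0)
for _ in range(n_real):
    rho = rho0.copy()
    for _ in range(steps):
        H = H0 + rng.normal(0, 1/np.sqrt(dt)) * L     # white-noise coefficient, variance 1/dt
        U = np.eye(2) - 1j*H*dt - 0.5*(H @ H)*dt**2   # short-time expansion of exp(-iH dt)
        rho = U @ rho @ U.conj().T
        rho = rho / np.trace(rho).real                # renormalize the small trace error
    rho_avg += rho / n_real

# (2) the master equation rho' = -i[H0,rho] - (1/2)(L^2 rho + rho L^2 - 2 L rho L)
rho = rho0.copy()
for _ in range(steps):
    drho = -1j*(H0 @ rho - rho @ H0) - 0.5*(L @ L @ rho + rho @ L @ L - 2*L @ rho @ L)
    rho = rho + dt*drho
print(np.round(rho_avg, 3))
print(np.round(rho, 3))                               # the two density matrices agree closely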

[3] Quantum fluctuation dissipation theorem: The general Lindblad term in


the master equation has the form

(L∗k Lk ρ + ρL∗k Lk − 2Lk ρL∗k ]
k

For ρ = C(β)exp(−βH0 ) to satisfy the GKSL as an equilibrium state, we require


the quantum fluctuation­dissipation theorem to hold, ie,
Σ_k (L∗k Lk exp(−βH) + exp(−βH)L∗k Lk − 2Lk .exp(−βH).L∗k ) = 0

Problem: let
H = P 2 /2m + U (Q)
in L2 (R) where P = −id/dQ and put

Lk = gk (Q, P ) = αk Q + βk P

Evaluate the conditions of the quantum fluctuation­dissipation theorem.


hint:
exp(−βH).F (Q, P ).exp(βH) = exp(−β.adH)(F )
= F + Σ_{n≥1} ((−β)n /n!)(adH)n (F )

Now,
[H, Q] = −iP/m, [H, P ] = iU ' (Q)
Make the induction hypothesis
ad(H)n (Q) = Σ_{k=0}^n Fnk (Q)P k ,   ad(H)n (P ) = Σ_{k=0}^n Gnk (Q)P k

Deduce recursion formulae for {Fnk (Q)}k and {Gnk (Q)}k .

[4] Evaluate exp(−βP 2 /2m).exp(−βU (Q)) in terms of exp(−β(P 2 /2m +
U (Q))) and vice versa and multiple commutators between P and Q.
hint: Let A, B be two operators. Put

exp(tA).exp(tB) = exp(Z(t))

Then,
exp(Z)g(adZ)(Z ' ) = Aexp(Z) + exp(Z)B
where
g(z) = (1 − exp(−z))/z
Thus,
g(adZ)(Z ' ) = exp(−adZ)(A) + B
or equivalently
Z ' = f1 (adZ)(A) + f2 (adZ)(B)
where
f1 (z) = g(z)−1 , f2 (z) = exp(−z)g(z)−1

15.6 Some remarks on the equilibrium Fokker­


Planck equation for the Gibbs distribution
The quantum fluctuation­dissipation theorem for a single Lindlbad operator is
L∗ Lexp(−βH) + exp(−βH)L∗ L − 2Lexp(−βH)L∗ = 0
This is the condition on the Lindblad operator L so that
ρ = exp(−βH)/T r(exp(−βH))
satisfies the steady state GKSL equation:
0 = i∂t ρ = [H, ρ] − (1/2)(L∗ Lρ + ρL∗ L − 2LρL∗ )
Here
H = P 2 /2m + U (Q), P = −i∂/∂Q
The classical version of this is
0 = ∂t f (Q, P ) = −(P/m)∂Q f + U � (Q)∂P f + γ∂P (P f ) + (σ 2 /2)∂P2 f
with
f = Cexp(−βH), H = P 2 /2m + U (Q)
Note that this classical Fokker­Planck equation is derived from the Ito sde
dQ = P dt/m, dP = −γP dt − U � (Q)dt + σdB(t)
Now,
∂P f = −βP f /m, ∂P2 f = −βf /m + β 2 P 2 f /m2
∂Q f = −βU � (Q)f
So the above classical equilibrium condition becomes
0 = (βP/m)U � (Q)+U � (Q)(−βP/m)+γ(1−βP 2 /m)+(σ 2 /2)(−β/m+β 2 P 2 /m2 )
This is satisfied iff
γ − σ 2 β/2m = 0
or equivalently,
2kT mγ = σ 2
which is the classical fluctuation­dissipation theorem as first derived by Ein­
stein. In the quantum context, the dissipation coefficient γ and the diffu­
sion/fluctuation coefficient σ 2 are both represented by the Lindblad opera­
tor L as follows from the derivation of the master equation from the Hudson­
Parthasarathy­Schrodinger equation
dU (t) = (−(iH + L∗ L/2)dt + LdA − L∗ dA∗ )U (t)
by partial tracing of the density evolution over the bath state. Now consider
the Heisenberg quantum equations of motion for Q, P for the master equation:
For any Heisenberg observable X, the dual of the master equation is
X � = i[H, X]−(1/2)(XL∗ L+L∗ LX−2L∗ XL) = i[H, X]−(1/2)([X, L∗ ]L+L∗ [L, X])

In particular for X = Q,
Q� = P/m − (1/2)([Q, L∗ ]L + L∗ [L, Q])
and for X = P ,
P � = −U � (Q) − (1/2)([P, L∗ ]L + L∗ [L, P ])
Now to get an equation that describes the motion of a particle in a potential
U (Q) with linear velocity damping, we assume that
L = aQ + bP, L∗ = āQ + ¯bP
Then,
[L, Q] = [aQ + bP, Q] = −ib, [Q, L∗ ] = [Q, āQ + b̄P ] = ib̄
[Q, L∗ ]L + L∗ [L, Q] = ib̄(aQ + bP ) + (āQ + b̄P )(−ib) = 2.Im(āb)Q
[P, L∗ ]L + L∗ [L, P ] = (−iā)(aQ + bP ) + (āQ + b̄P )(ia) = 2.Im(āb)P
Thus the Heisenberg­Lindblad equations read
Q� = P/m − Im(āb)Q,
P � = −U � (Q) − Im(āb)P
Eliminating P between these two equations gives us
Q�� = (−U � (Q) − m.Im(āb)(Q� + Im(āb)Q))/m − Im(āb)Q�
or
mQ′′ = −U′(Q) − m(Im(āb))2 Q − 2m.Im(āb)Q′
which is the velocity damped equation in a potential U (Q) + mγ 2 Q2 /8 and with
velocity damping coefficient γ = 2Im(āb). The harmonic correction to the po­
tential U (Q) is actually a quantum correction as can be verified by not assuming
Planck’s constant h = 1 as we have done here. (Here h stands for Planck’s con­
stant divided by 2π). Now, the quantum fluctuation­dissipation theorem can
also be expressed as
L∗ L + exp(−βadH)(L∗ L) − 2Lexp(−βadH)(L∗ ) = 0
For high temperatures or equivalently small β, this reads
[L∗ , L] − β([H, L∗ L] − L[H, L∗ ]) = 0
where O(β 2 ) terms have been neglected. Now for L = aQ + bP , we get
[L∗ , L] = [āQ + b̄P, aQ + bP ] = i(āb − ab̄) = −2.Im(āb)
[H, L∗ L] = [H, L∗ ]L + L∗ [H, L]
so
[H, L∗ L] − L[H, L∗ ] = [[H, L∗ ], L] + L∗ [H, L]
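As a quick numerical check of the classical relation 2kT mγ = σ² derived above, one can integrate the Langevin equations dQ = (P/m)dt, dP = −γP dt − U′(Q)dt + σdB and verify that the stationary variance of P equals mkT = σ²/(2γ). The harmonic potential and the parameter values in the sketch below are assumptions made only for illustration.

import numpy as np

rng = np.random.default_rng(2)
m, gamma, sigma = 1.0, 0.5, 0.7             # illustrative parameters
U_prime = lambda Q: Q                        # harmonic potential U(Q) = Q^2/2 (an assumption)

dt, n_steps, n_burn = 1e-3, 400_000, 50_000
Q, P = 0.0, 0.0
P_samples = []
for k in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt))
    Q += (P / m) * dt
    P += -gamma * P * dt - U_prime(Q) * dt + sigma * dB
    if k >= n_burn:
        P_samples.append(P)

kT = sigma**2 / (2 * m * gamma)              # fluctuation-dissipation: 2 kT m gamma = sigma^2
print(np.var(P_samples), m * kT)             # the two numbers agree (equipartition of P^2/2m)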

15.7 Wigner­distributions in quantum mechan­


ics
Since Q, P do not commute in quantum mechanics, we cannot talk of a joint
probability density of these two at any time unlike the classical case. So how to
interpret the Lindblad equation

i∂t ρ = [H, ρ] − (1/2)(L∗ Lρ + ρL∗ L − 2LρL∗ )

as a quantum Fokker­Planck equation for the joint density of (Q, P ) ? Wigner


suggested the following method: Let ρ(t, Q, Q′) =< Q|ρ(t)|Q′ > denote ρ(t) in
the position representation. Then ρ(t) in the momentum representation is given
by

ρ̂(t, P, P ′) =< P |ρ(t)|P ′ >= ∫ < P |Q >< Q|ρ(t)|Q′ >< Q′|P ′ > dQdQ′

= ∫ ρ(t, Q, Q′)exp(−iQP + iQ′P ′)dQdQ′

ρ(t, Q, Q) is the probability density of Q(t) and ρ̂(t, P, P ) is the probability


density of P (t). Both are non­negative and integrate to unity. Now Wigner
defined the function

W (t, Q, P ) = ∫ ρ(t, Q + q/2, Q − q/2)exp(−iP q)dq/(2π)

This is in general a complex function of Q, P and therefore cannot be a proba­


bility density. However,

∫ W (t, Q, P )dP = ρ(t, Q, Q),

is the marginal density of Q(t). Further, we can write

W (t, Q, P ) = ∫ ρ̂(t, P1 , P2 )exp(iP1 (Q + q/2) − iP2 (Q − q/2))exp(−iP q)dP1 dP2 dq/(2π)2

= ∫ ρ̂(t, P1 , P2 )exp(i(P1 − P2 )Q)δ((P1 + P2 )/2 − P )dP1 dP2 /(2π)

so that
∫ W (t, Q, P )dQ = ∫ ρ̂(t, P1 , P2 )δ(P1 − P2 )δ((P1 + P2 )/2 − P )dP1 dP2

= ρ̂(t, P, P )
is the marginal density of P (t). This suggests strongly that the complex valued
function W (t, Q, P ) should be regarded as the quantum analogue of the joint
probability density of (Q(t), P (t)). Our aim is to derive the pde satisfied by W

when ρ(t) satisfies the quantum Fokker­Planck equation. To this end, observe
that
[P 2 , ρ]ψ(Q) = −∫ ρ,11 (Q, Q′)ψ(Q′)dQ′ + ∫ ρ(Q, Q′)ψ′′(Q′)dQ′

= ∫ (−ρ,11 + ρ,22 )(Q, Q′)ψ(Q′)dQ′

which means that the position space kernel of [P 2 , ρ] is

[P 2 , ρ](Q, Q� ) = (−ρ,11 + ρ,22 )(Q, Q� )

Also
[U, ρ](Q, Q� ) = (U (Q) − U (Q� ))ρ(Q, Q� )
L∗ Lρ(Q, Q′) = (āQ + b̄P )(aQ + bP )ρ(Q, Q′)
= |a|2 Q2 ρ(Q, Q′) − |b|2 ρ,11 (Q, Q′) − iāb.Qρ,1 (Q, Q′) − iab̄(ρ(Q, Q′) + Qρ,1 (Q, Q′))
How to transform from W (Q, P ) to ρ(Q, Q� )?

W (Q, P ) = ∫ ρ(Q + q/2, Q − q/2)exp(−iP q)dq

gives
∫ W (Q, P )exp(iP q)dP/(2π) = ρ(Q + q/2, Q − q/2)

or equivalently,

∫ W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP/(2π) = ρ(Q, Q′)

So

ρ,1 (Q, Q′) = ∫ ((1/2)W,1 ((Q + Q′)/2, P ) + iP W ((Q + Q′)/2, P ))exp(iP (Q − Q′))dP

Also note that



Qρ(Q, Q′) = ∫ QW ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

= ∫ ((Q + Q′)/2 + (Q − Q′)/2)W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

= ∫ (((Q + Q′)/2)W ((Q + Q′)/2, P ) + (i/2)W,2 ((Q + Q′)/2, P ))exp(iP (Q − Q′))dP

on using integration by parts. Consider now the term

(U (Q) − U (Q� ))ρ(Q, Q� )



In the classical case, ρ(Q, Q� ) is diagonal and so such a term contributes only
when Q� is in the vicinity of Q. For example, if we have

ρ(Q, Q� ) = ρ1 (Q)δ(Q� − Q) + ρ2 (Q)δ � (Q� − Q)

then,
(U (Q� ) − U (Q))ρ(Q, Q� ) = ρ2 (Q)(U (Q� ) − U (Q))δ � (Q� − Q)

or equivalently,
∫ (U (Q′) − U (Q))ρ(Q, Q′)f (Q′)dQ′ =

−∂Q′ (ρ2 (Q)(U (Q′) − U (Q))f (Q′))|Q′=Q

= −ρ2 (Q)U′(Q)f (Q)
which is equivalent to saying that

(U (Q� ) − U (Q))ρ(Q, Q� ) = −ρ2 (Q)U � (Q)δ(Q� − Q)

just as in the classical case. In the general quantum case, we can write
∫ ρ(Q, Q′)f (Q′)dQ′ = Σ_{n≥0} ρn (Q)∂Q^n f (Q)

This can be seen by writing



f (Q′) = Σ_n f (n) (Q)(Q′ − Q)n /n!

and then defining

∫ ρ(Q, Q′)(Q′ − Q)n dQ′/n! = ρn (Q)

Now, this is equivalent to saying that the kernel of ρ in the position represen­
tation is given by
ρ(Q, Q′) = Σ_n ρn (Q)δ (n) (Q − Q′)

and hence

∫ (U (Q′) − U (Q))ρ(Q, Q′)f (Q′)dQ′
= Σ_n ρn (Q) ∫ δ (n) (Q − Q′)(U (Q′) − U (Q))f (Q′)dQ′

= Σ_n ρn (Q)(−1)n ∂Q′^n ((U (Q′) − U (Q))f (Q′))|Q′=Q

= Σ_{n≥0,1≤k≤n} ρn (Q)(−1)n C(n, k)U (k) (Q)f (n−k) (Q)

Thus,
(U (Q′) − U (Q))ρ(Q, Q′) =
Σ_{n≥0,1≤k≤n} ρn (Q)(−1)k C(n, k)U (k) (Q)δ (n−k) (Q′ − Q)

This is the quantum generalization of the classical term

−U (Q)∂P f (Q, P )

which appears in the classical Fokker­Planck equation. Another way to express


this kernel is to directly write

(U (Q′) − U (Q))ρ(Q, Q′) = Σ_{n≥1} U (n) (Q)(Q′ − Q)n ρ(Q, Q′)/n!

From the Wigner­distribution viewpoint, it is more convenient to write

U (Q′) − U (Q) = U ((Q + Q′)/2 − (Q − Q′)/2) − U ((Q + Q′)/2 + (Q − Q′)/2)

= −Σ_{k≥0} U (2k+1) ((Q + Q′)/2)(Q − Q′)^{2k+1} /((2k + 1)!2^{2k})

Consider a particular term U (n) ((Q + Q′)/2)(Q − Q′)n ρ(Q, Q′) in this expansion:

U (n) ((Q + Q′)/2)(Q − Q′)n ρ(Q, Q′)

= ∫ U (n) ((Q + Q′)/2)(Q − Q′)n W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

= i^n ∫ U (n) ((Q + Q′)/2)∂P^n (W ((Q + Q′)/2, P ))exp(iP (Q − Q′))dP

and thus
(U (Q′) − U (Q))ρ(Q, Q′) =
−2 ∫ exp(iP (Q − Q′)) Σ_{k≥0} (i/2)^{2k+1} ((2k + 1)!)^{−1} (U (2k+1) ((Q + Q′)/2)∂P^{2k+1} W ((Q + Q′)/2, P ))dP

Also note that


ρ,11 (Q, Q′) = ∂Q^2 ∫ W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

= ∫ ((1/4)W,11 ((Q + Q′)/2, P ) − P 2 W ((Q + Q′)/2, P ) + iP W,1 ((Q + Q′)/2, P ))exp(iP (Q − Q′))dP

and likewise,

ρ,22 (Q, Q′) = ∫ ((1/4)W,11 ((Q + Q′)/2, P ) − P 2 W ((Q + Q′)/2, P ) − iP W,1 ((Q + Q′)/2, P ))exp(iP (Q − Q′))dP

Thus, the noiseless quantum Fokker­Planck equation (or equivalently, the Liouville­
Schrodinger­Von­Neumann­equation)

iρ� (t) = [(P 2 /2m + U (Q)), ρ(t)]

is equivalent to
i∂t (W (t, (Q + Q′)/2, P )) =
(−iP/m)W,1 (t, (Q + Q′)/2, P )
+ 2 Σ_{k≥0} (i/2)^{2k+1} ((2k + 1)!)^{−1} (U (2k+1) ((Q + Q′)/2)∂P^{2k+1} W (t, (Q + Q′)/2, P ))

or equivalently,
i∂t W (t, Q, P ) =
(−iP/m)W,1 (t, Q, P ) + iU′(Q)∂P W (t, Q, P )
+ 2 Σ_{k≥1} (i/2)^{2k+1} ((2k + 1)!)^{−1} (U (2k+1) (Q)∂P^{2k+1} W (t, Q, P ))

This can be expressed in the form of a classical Liouville equation with quantum
correction terms:

∂t W (t, Q, P ) = (−P/m)∂Q W (t, Q, P ) + U′(Q)∂P W (t, Q, P )

+ Σ_{k≥1} ((−1)^k /(2^{2k} (2k + 1)!))U (2k+1) (Q)∂P^{2k+1} W (t, Q, P )

The last term is the quantum correction term.
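Numerically, the Wigner function of a given position-space kernel ρ(Q, Q′) can be built directly from its definition, and the marginal identities above provide a convenient sanity check. The sketch below does this on a finite grid for an example Gaussian wavepacket (the wavepacket and grid are assumptions, chosen only to make the check concrete).

import numpy as np

# Example: Wigner function of a Gaussian wavepacket on a grid (illustrative only)
n, L = 256, 20.0
Q = np.linspace(-L/2, L/2, n, endpoint=False)
dQ = Q[1] - Q[0]
psi = np.exp(-(Q - 1.0)**2/2) * np.exp(1j*0.7*Q)
psi /= np.sqrt(np.sum(np.abs(psi)**2)*dQ)
rho = np.outer(psi, psi.conj())                        # rho(Q, Q') = psi(Q) psi*(Q')

# W(Q,P) = (1/2pi) int rho(Q+q/2, Q-q/2) exp(-iPq) dq, with q sampled as q = 2*j*dQ
j = np.arange(-n//2, n//2)
A = np.zeros((n, n), dtype=complex)
for i in range(n):
    ok = (i + j >= 0) & (i + j < n) & (i - j >= 0) & (i - j < n)
    A[i, ok] = rho[i + j[ok], i - j[ok]]
P = np.pi*np.fft.fftshift(np.fft.fftfreq(n, d=dQ))     # momentum grid conjugate to step 2*dQ
phase = np.exp(-1j*np.outer(P, 2*j*dQ))
W = (phase @ A.T).T.real * (2*dQ) / (2*np.pi)          # W[i, k] ~ W(Q_i, P_k)

dP = P[1] - P[0]
print(np.allclose(W.sum(axis=1)*dP, np.abs(psi)**2))   # marginal over P recovers rho(Q, Q)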


Remark: To see that the last term is indeed a quantum correction term, we
have to use general units in which Planck’s constant is not set to unity. Thus,
we define

W (Q, P ) = C ∫ ρ(Q + q/2, Q − q/2).exp(−iP q/h)dq

For ∫ W (Q, P )dP = ρ(Q, Q), we must set

2πhC = 1

Then,
ρ(Q + q/2, Q − q/2) = ∫ W (Q, P )exp(iP q/h)dP

Thus, with P = −ih∂Q , we get

[P 2 /2m, ρ](Q, Q′) = (h2 /2m)(−ρ,11 (Q, Q′) + ρ,22 (Q, Q′))

= (h2 /2m)(−∂Q^2 + ∂Q′^2 ) ∫ W ((Q + Q′)/2, P )exp(iP (Q − Q′)/h)dP

= (−ih/m) ∫ P W,1 ((Q + Q′)/2, P )exp(iP (Q − Q′)/h)dP

Further,

(U (Q) − U (Q′))ρ(Q, Q′)

= Σ_{k≥0} 2^{−2k} ((2k + 1)!)^{−1} U (2k+1) ((Q + Q′)/2)(Q − Q′)^{2k+1} ρ(Q, Q′)

= Σ_k 2^{−2k} ((2k + 1)!)^{−1} (ih)^{2k+1} ∫ U (2k+1) ((Q + Q′)/2)∂P^{2k+1} W ((Q + Q′)/2, P )exp(iP (Q − Q′)/h)dP

and hence we get from the Schrodinger equation

ih∂t ρ = [H, ρ] = [P 2 /2m, ρ] + [U (Q), ρ],

the equation

ih∂t W (t, Q, P ) = (−ih/m)P W,1 (t, Q, P )

+ ihU′(Q)∂P W (t, Q, P ) + ih Σ_{k≥1} (−1)^k 2^{−2k} h^{2k} ((2k + 1)!)^{−1} U (2k+1) (Q)∂P^{2k+1} W (t, Q, P )

or equivalently,

∂t W (t, Q, P ) = (−P/m)∂Q W (t, Q, P ) + U′(Q)∂P W (t, Q, P )

+ Σ_{k≥1} (−h2 /4)^k ((2k + 1)!)^{−1} U (2k+1) (Q)∂P^{2k+1} W (t, Q, P )

which clearly shows that the quantum correction to the Liouville equation is a
power series in h2 beginning with the first power of h2 . Note that the quantum
corrections involve ∂P2k+1 W (t, Q, P ), k = 1, 2, ... but a diffusion term of the form
∂P2 W is absent. It will be present only when we include the Lindblad terms for
it is these terms that describe the effect of a noisy bath on our quantum system.

Construction of the quantum Fokker­Planck equation for a specific


choice of the Lindblad operator
Here,
H = P 2 /2m + U (Q), L = g(Q)
Assume that g is a real function. Then, the Schrodinger equation for the mixed
state ρ is given by
∂t ρ = −i[H, ρ] − (1/2)θ(ρ)
where
θ(ρ) = L∗ Lρ + ρL∗ L − 2LρL∗
Thus, the kernel of this in the position representation is given by

θ(ρ)(Q, Q� ) = (g(Q)2 + g(Q� )2 − 2g(Q)g(Q� ))ρ(Q, Q� )

= (g(Q) − g(Q� ))2 ρ(Q, Q� )



More generally, if

θ(ρ) = (L∗k Lk ρ + ρL∗k Lk − 2Lk ρLk )
k

where
Lk = gk (Q)
are real functions, then

θ(ρ)(Q, Q� ) = F (Q, Q� )ρ(Q, Q� )

where
F (Q, Q′) = Σ_k (gk (Q) − gk (Q′))2

We define
G(Q, q) = F (Q + q/2, Q − q/2)
or equivalently,
F (Q, Q� ) = G((Q + Q� )/2, Q − Q� )
Then,
θ(ρ)(Q, Q′) = G((Q + Q′)/2, Q − Q′)ρ(Q, Q′) =
Σ_{n≥0} Gn ((Q + Q′)/2) ∫ (Q − Q′)n W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

= Σ_{n≥0} Gn ((Q + Q′)/2)i^n ∫ ∂P^n W ((Q + Q′)/2, P )exp(iP (Q − Q′))dP

and this contributes a term

−(1/2) Σ_{n≥0} i^n Gn (Q)∂P^n W (Q, P )

= (−1/2)G0 (Q)W (Q, P ) − (i/2)G1 (Q)∂P W (Q, P ) + (1/2)G2 (Q)∂P^2 W (Q, P )

−(1/2) Σ_{n>2} i^n Gn (Q)∂P^n W (Q, P )

The term (1/2)G2 (Q)∂P^2 W (Q, P ) is the quantum analogue of the classical dif­
fusion term in the Fokker­Planck equation. Here, we are assuming an expansion

G(Q, q) = Σ_n Gn (Q)q^n

ie,
Gn (Q) = n!−1 ∂qn G(Q, q)|q=0

Problems in quantum corrections to classical theories in probabil­


ity theory and in mechanics
Consider the Lindblad term in the density evolution problem:

θ(ρ) = L∗ Lρ + ρL∗ L − 2LρL∗

where
L = g0 (Q) + g1 (Q)P
We have
L∗ = ḡ0 (Q) + P ḡ1 (Q)
L∗ Lψ(Q) = (ḡ0 + P ḡ1 )(g0 + g1 P )ψ(Q)
= |g0 (Q)|2 ψ(Q) + P |g1 (Q)|2 P ψ(Q) + ḡ0 g1 P ψ(Q) + P ḡ1 g0 ψ(Q)
Now, defining
|g1 (Q)|2 = f1 (Q), |g0 (Q)|2 = f0 (Q),
we get

P f1 (Q)P ψ(Q) = −∂Q (f1 (Q)ψ′(Q)) = −f1′(Q)ψ′(Q) − f1 (Q)ψ′′(Q)

P ḡ1 (Q)g0 (Q)ψ(Q) = −i(ḡ1 g0 )′(Q)ψ(Q) − iḡ1 (Q)g0 (Q)ψ′(Q)


Likewise for the other terms. Thus, L∗ L has the following kernel in the position
representation:

L∗ L(Q, Q� ) = −f1 (Q)δ �� (Q − Q� ) + f2 (Q)δ � (Q − Q� ) + f3 (Q)δ(Q − Q� )

where f1 (Q) is real positive and f2 , f3 are complex functions. Thus,

L∗ Lρ(Q, Q′) = −f1 (Q)∂Q^2 ρ(Q, Q′) + f2 (Q)∂Q ρ(Q, Q′) + f3 (Q)ρ(Q, Q′)

ρL∗ L(Q, Q′) = ∫ ρ(Q, Q′′)L∗ L(Q′′, Q′)dQ′′ = −∂Q′^2 (ρ(Q, Q′)f1 (Q′)) −

∂Q′ (ρ(Q, Q′)f2 (Q′)) + ρ(Q, Q′)f3 (Q′)


Likewise,
LρL∗ (Q, Q′) = g1 (Q)ḡ1 (Q′)∂Q ∂Q′ ρ(Q, Q′) + T
where T contains terms of the form a function of Q, Q′ times first order partial
derivatives in Q, Q′. Putting all the terms together, we find that the Lindblad
contribution to the master equation has the form
(−1/2)θ(ρ)(Q, Q′)
= (f1 (Q)∂Q^2 + f1 (Q′)∂Q′^2 )ρ(Q, Q′) − g1 (Q)ḡ1 (Q′)∂Q ∂Q′ ρ(Q, Q′) + h1 (Q, Q′)∂Q ρ(Q, Q′) +

h2 (Q, Q′)∂Q′ ρ(Q, Q′) + h3 (Q, Q′)ρ(Q, Q′)



15.8 Motion of an element having both an elec­


tric and a magnetic dipole moment in an
external electromagnetic field plus a coil
Let m, p denote respectively the magnetic and electric dipole moments of the
element at time t = 0. Let E(t, r), B(t, r) denote the external electric and
magnetic fields. After time t, the element undergoes a rotation S(t) ∈ SO(3)
so that its magnetic moment becomes
m(t) = S(t)m,
and its electric dipole moment becomes
p(t) = S(t)p

Assume that after time t, the centre of the element is located at the position
R(t). It should be noted that if n(t) denotes the unit vector along the length of
the element at time t, then we can write
m(t) = K1 n(t), p(t) = K2 n(t)
where K1 , K2 are constants. Thus, we can write
m(t) = K1 S(t)n(0), p(t) = K2 S(t)n(0)
The angular velocity tensor Ω(t) in the lab frame can be expressed using
S � (t) = Ω(t)S(t)

or equivalently,
Ω(t) = S � (t)S(t)−1 = S � (t)S(t)T
The angular velocity tensor in the rest frame of the element is

Ωr (t) = S(t)−1 Ω(t)S(t) = S(t)−1 S � (t) = S(t)T S � (t)


The total kinetic energy of the element due to translational motion of its cm
and rotational motion around its cm is given by
K(t) = (1/2)T r(S � (t)JS � (t)T ) + M R� (t)2 /2
the interaction potential energy between the magnetic moment of the element
and the magnetic field plus that between the electric dipole moment and the
electric field is given by

V (t) = −(m(t), B(t, R(t))) − (p(t), E(t, R(t)))

= −mT S(t)T B(t, R(t))−pT S(t)T E(t, R(t))


and hence taking into account the gravitational potential energy, the total La­
grangian of the element is given by

L(R(t), R� (t), S(t), S � (t))

= (1/2)T r(S � (t)T JS � (t))+M R� (t)2 /2+mT S(t)T B(t, R(t))+

pT S(t)T E(t, R(t))−M geT3 R(t)



where e3 is the unit vector along the z direction. While writing the Euler­
Lagrange equations of motion from this Lagrangian, we must take into ac­
count the constraint S(t)T S(t) = I by adding a Lagrange multiplier term
−T r(Λ(t)(S(t)T S(t) − I)) to the above Lagrangian.

Remark: The magnetic vector potential produced by a current loop Γ car­


rying a current I(t) is given by the retarded potential formula in the form of a
line integral

A(t, r) = (µ/4π) ∫_Γ I(t − |r − r′|/c)dr′/|r − r′|

This approximates in the far field zone |r| >> |r� | to



∫_Γ I(t − r/c + r̂.r′/c)(1 + r̂.r′/r)dr′/r

which further approximates to

(I(t − r/c)/r2 ) ∫_Γ (r̂.r′)dr′ + (I′(t − r/c)/cr) ∫_Γ (r̂.r′)dr′

= (I(t − r/c)/r3 ) ∫_Γ (r.r′)dr′ + (I′(t − r/c)/cr2 ) ∫_Γ (r.r′)dr′

which can be expressed in terms of the magnetic moment



m(t) = I(t) ∫_Γ r′ × dr′

as
A(t, r) = (µ/4π)[m(t − r/c) × r/r3 + m� (t − r/c) × r/cr2 ]
Remark: The interaction energy between a system of charged particles in
motion and a magnetic field is derived from the Hamiltonian term

Σ_i (pi − ei A(t, ri ))2 /2mi

The interaction term in this expression is

−Σ_i ei (pi , A(t, ri ))/mi

However, noting the relationship between momentum and velocity in a magnetic


field: pi = mi vi +ei A(t, ri ), it follows that the interaction energy should be taken
as
−Σ_i ei (vi , A(t, ri ))

Now if the system of charged particles spans a small region of space around the
point r and in this region, the magnetic field does not vary too rapidly, then we
can approximate the above interaction energy term by
U = −Σ_i ei (vi , A(t, r + ri′)) = −Σ_i ei (vi , (ri′, ∇)A(t, r))

where ri′ = ri − r. Now assuming A to be time independent,

d/dt(ri′, (ri′, ∇)A) = (vi , (ri′, ∇)A) + (ri′, (vi , ∇)A)

For bounded motion, the time average of the lhs is zero and we get, denoting
time averages by < . >,

< (vi , (ri′, ∇)A) >= − < (ri′, (vi , ∇)A) >

Hence,

< (vi , (ri′, ∇)A) >= (1/2)(< (vi , (ri′, ∇)A) − (ri′, (vi , ∇)A) >)

On the other hand,

(ri′ × vi ).(∇ × A) = (vi , (ri′, ∇)A) − (ri′, (vi , ∇)A)

and hence we deduce that

< U >= −Σ_i (ei ri′ × vi /2).curlA(t, r)

provided that A does not vary too rapidly with time. This formula can be
expressed as
U = −m.B
where
m = (1/2) Σ_i ei ri′ × vi
i

is the magnetic moment of the system of particles.


Chapter 16

Large Deviations for Markov Chains, Quantum Master Equation, Wigner Distribution for the Belavkin Filter

16.1 Large deviation rate function for the em­


pirical distribution of a Markov chain
Assume that X(n), n ≥ 0 is a discrete time Markov chain with finite state space
E = {1, 2, ..., K} and with transition probabilities
P (X(n + 1) = y|X(n) = x) = π(x, y), x, y ∈ E
let µ be the distribution of X(1). The empirical distribution of the chain based
on data collected upto time N is given by
νN = N −1 Σ_{n=1}^N δX(n)

Then the logarithmic generating function of νN is given by


ΛN (f ) = logE exp(∫ f (x)dνN (x)) = logE[exp(N −1 Σ_{n=1}^N f (X(n)))]

and hence,

ΛN (N f ) = logE[exp(Σ_{n=1}^N f (X(n)))] =


log(µT πf^N (1))

where
πf = ((exp(f (x))π(x, y)))1≤x,y≤K
and
πf u(x) = Σ_y πf (x, y)u(y)

It follows from the spectral theorem for matrices that

limN →∞ N −1 ΛN (N f ) = log(λ(f ))

where λ(f ) is the maximum eigenvalue of πf . By the Perron­Frobenius theory,


this is real. Equivalently, this fact follows from the above relation since the
limit is real because each term is real and the limit equals the eigenvalue of πf
having maximum magnitude. Thus if q is a probability distribution on E, then
the LDP rate function of the process evaluated at q is given by

I(q) = sup_f (Σ_x f (x)q(x) − logλ(f ))

We claim that

I(q) = J(q) = sup_{u>0} Σ_x q(x)log(u(x)/πu(x))

where
πu(x) = Σ_y π(x, y)u(y)

To see this, we first observe that for any f, u, we have


Σ_x f (x)q(x) − Σ_x q(x)log(u(x)/πu(x)) = −Σ_x q(x)log(u(x)/πf u(x))

Now fix f and choose u = u0 so that

πf u0 (x) = λ(f )u0 (x), x ∈ E

ie u0 is an eigenvector of πf corresponding to its maximum eigenvalue. Then,


Σ_x f (x)q(x) − J(q) = inf_u (Σ_x f (x)q(x) − Σ_x q(x)log(u(x)/πu(x)))

= inf_u (−Σ_x q(x)log(u(x)/πf u(x))) ≤ −Σ_x q(x)log(u0 (x)/πf u0 (x)) = logλ(f )

Thus, for any f,
Σ_x f (x)q(x) − logλ(f ) ≤ J(q)

and taking supremum over all f , gives us

I(q) ≤ J(q)

In order to show that


J(q) ≤ I(q)
it suffices to show that for every u, there exists an f such that
Σ_x q(x)log(u(x)/πu(x)) ≤ Σ_x f (x)q(x) − logλ(f )

or equivalently, such that

−Σ_x q(x)log(u(x)/πf u(x)) = logλ(f ) − − − (a)

To show this fix u and define

f (x) = log(u(x)/πu(x))

Then, clearly
πf u(x) = u(x)
and hence for every n ≥ 1,
πfn u = u
and hence
log(λ(f )) = limn n−1 log(µT πfn u) = 0
ie
λ(f ) = 1
But then, both sides of (a) equal zero and the proof is complete.
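For a small chain the two variational formulas can be compared numerically: I(q) by maximizing Σ f(x)q(x) − log λ(f) over f, and J(q) by maximizing Σ q(x)log(u(x)/πu(x)) over positive u. The two-state transition matrix and the distribution q in the sketch below are examples, not data from the text; both objectives are invariant under adding a constant (to f) or rescaling (u), so a one-parameter search suffices.

import numpy as np

pi = np.array([[0.9, 0.1],
               [0.2, 0.8]])                      # example transition matrix
q = np.array([0.5, 0.5])                         # distribution at which to evaluate the rate

def log_lambda(f):
    pif = np.exp(f)[:, None] * pi                # tilted matrix pi_f(x,y) = e^{f(x)} pi(x,y)
    return np.log(max(abs(np.linalg.eigvals(pif))))

s = np.linspace(-10, 10, 4001)
I = max(q @ np.array([si, 0.0]) - log_lambda(np.array([si, 0.0])) for si in s)
t = np.exp(np.linspace(-10, 10, 4001))
J = max(q @ np.log(np.array([ti, 1.0]) / (pi @ np.array([ti, 1.0]))) for ti in t)
print(I, J)                                      # the two values coincide: I(q) = J(q)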

16.2 Quantum Master Equation in general rel­


ativity
Assume that the metric of space­time is restricted to have the form
dτ 2 = (1 + δ.V0 (x))dt2 − Σ_{k=1}^3 (1 + δ.Vk (x))(dxk )2

where V0 , Vk , k = 1, 2, 3 are some functions of x = (t, xk , k = 1, 2, 3). The


corresponding metric is given by

g00 = 1 + δ.V0 , gkk = −(1 + δ.Vk ), k = 1, 2, 3



and
gµν = 0, µ ≠ ν
We substitute this into the Einstein­Hilbert Lagrangian density

Lg = g µν √(−g)(Γα_µν Γβ_αβ − Γα_µβ Γβ_να )

and retain terms only upto O(δ 2 ). Likewise, we compute the Lagrangian density
of the electromagnetic field

Lem = (−1/4)Fµν F µν

and retain terms only upto O(δ), ie, terms linear in the Vµ . The resulting
Lagrangian of the gravitational field interacting with the external em field will
have the form

L = Lg + Lem,int = δ 2 C1 (µναβ)Vµ,α Vν,β +

+δ.C2 (µαβρσ)Vµ Fαβ Fρσ


where we regard the covariant components Aµ of the electromagnetic four po­
tential as being fundamental and the contravariant components being derived
from the covariant components after interaction with the gravitational field.
Specifically, with neglect of O(δ 2 ) terms,

A0 = g 0µ Aµ = g 00 A0 = (1 − δ.V0 )A0

Ar = g rr Ar = −(1 − δ.Vr )Ar


We regard the field
Bµ = C2 (µαβρσ)Fαβ Fρσ
as a new c­number external field with which the gravitational field interacts and
then write down our gravitational field Lagrangian as

L = δ 2 C1 (µναβ)Vµ,α Vν,β + δ.Vµ Bµ

Keeping this model in mind, it can be abstracted into the Lagrangian of a set of
Klein­Gordon field interacting with an external c­number field. The canonical
momenta are
Pµ = ∂L/∂Vµ,0 = 2δ 2 C1 (µν0β)Vν,β
= 2δ 2 C1 (µν00)Vν,0 + 2δ 2 C1 (µν0r)Vν,r
which can be easily solved to express Vµ,0 as a linear combination of Pν and
Vν,r . In this way our Hamiltonian density has the form

H = C1 (µν)Pµ Pν + C2 (µνr)Pµ Vν,r

+C3 (µνrs)Vµ,r Vν,s + C4 (µν)Vµ Bν



Abstracting this to one dimension, we get a Hamiltonian of the form

H = P 2 /2 + ω 2 Q2 /2 + β(P Q + QP ) + γE(t)Q

where E(t) is an external c­number function of time. Application of a canonical


transformation then gives us a Hamiltonian of the form

H = P 2 /2 + ω 2 Q2 /2 + E(t)(aP + bQ)

If E(t) is taken as quantum noise coming from a bath, then we can derive the
following Hudson­Parthasarathy noisy Schrodinger equation for the evolution
operator U (t):

dU (t) = (−(iH + LL∗ /2)dt + LdA(t) − L∗ dA(t)∗ )U (t)

where
L = aQ + bP

16.3 Belavkin filter for the Wigner distribution


function
The HPS dynamics is

dU (t) = (−(iH + LL∗ /2)dt + LdA(t) − L∗ dA(t)∗ )U (t)

where
L = aP + bQ, H = P 2 /2 + U (Q)
The coherent state in which the Belavkin filter is designed to operate is

|φ(u) >= exp(−|u|2 /2)|e(u) >, u ∈ L2 (R+ )

The measurement process is

Yo (t) = U (t)∗ Yi (t)U (t), Yi (t) = cA(t) + c̄A(t)∗

Then,
dYo (t) = dYi (t) − jt (c̄L + cL∗ )dt
where
jt (X) = U (t)∗ XU (t)
Note that

c̄L + cL∗ = c̄(aP + bQ) + c(āQ + b̄P ) = 2Re(āc)Q + 2Re(b̄c)P

Thus, our measurement model corresponds to measuring the observable 2Re(āc)Q(t)+


2Re(b̄c)P (t) plus white Gaussian noise. Let

πt (X) = E(jt (X)|ηo (t)), ηo (t) = σ(Yo (s) : s ≤ t)

Let
dπt (X) = Ft (X)dt + Gt (X)dYo (t)
with
Ft (X), Gt (X) ∈ ηo (t)
Let
dC(t) = f (t)C(t)dYo (t), C(0) = 1
The orthogonality principle gives

E[(jt (X) − πt (X))C(t)] = 0

and hence by quantum Ito’s formula and the arbitrariness of the complex valued
function f (t), we have

E[(djt (X) − dπt (X))|ηo (t)] = 0,

E[(jt (X) − πt (X))dYo (t)|ηo (t)] + E[(djt (X) − dπt (X))dYo (t)|ηo (t)] = 0
Note that

djt (X) = jt (θ0 (X))dt + jt (θ1 (X))dA(t) + jt (θ2 (X))dA(t)∗

where
θ0 (X) = i[H, X] − (1/2)(LL∗ X + XLL∗ − 2LXL∗ ),
θ1 (X) = −LX + XL = [X, L], θ2 (X) = L∗ X − XL∗ = [L∗ , X]
Then the above conditions give

πt (θ0 (X)) + πt (θ1 (X))u(t) + πt (θ2 (X))ū(t) =

Ft (X) + Gt (X)(cu(t) + c̄ū(t) − πt (c̄L + cL∗ ))


and
−πt (X(c̄L + cL∗ )) + πt (X)πt (c̄L + cL∗ ) + πt (θ1 (X))c̄
−Gt (X)|c|2 = 0
This second equation simplifies to

Gt (X) = |c|−2 (−πt (cXL∗ + c̄LX) + πt (c̄L + cL∗ )πt (X))

Then,
dπt (X) = (πt (θ0 (X)) + πt (θ1 (X))u(t) + πt (θ2 (X))ū(t))dt
+Gt (X)(dYo (t) + πt (c̄L + cL∗ − cu(t) − c̄ū(t))dt)
Stochastic Processes in Classical and Quantum Physics and Engineering 245
16.3. BELAVKIN FILTER FOR THE WIGNER DISTIRBUTION FUNCTION255

In the special case when c = −1, we get

dπt (X) = (πt (θ0 (X)) + πt (θ1 (X))u(t) + πt (θ2 (X))ū(t))dt

+(πt (XL∗ + LX) − πt (L + L∗ )πt (X))(dYo (t) − (πt (L + L∗ ) − 2Re(u(t)))dt)

Consider the simplified case when u = 0, ie, filtering is carried out in the
vacuum coherent state. Then, the above Belavkin filter simplifies to

dπt (X) = πt (θ0 (X))dt+(πt (XL∗ +LX)−πt (L+L∗ )πt (X))(dYo (t)−πt (L+L∗ )dt)

Equivalently in the conditional density domain, the dual of the above equation
gives the Belavkin filter for the conditional density

dρ(t) = θ0∗ (ρ(t))dt + (L∗ ρ(t) + ρ(t)L − T r(ρ(t)(L + L∗ ))ρ(t))(dYo (t) − T r(ρ(t)(L + L∗ ))dt)

where
θ0∗ (ρ) = −i[H, ρ] − (1/2)(LL∗ ρ + ρLL∗ − 2L∗ ρL)
Now let us take
L = aQ + bP, a, b ∈ C
and
H = P 2 /2 + U (Q)
and translate this Belavkin equation to the Wigner­distribution domain and
then compare the resulting differential equation for W with the Kushner­Kallianpur
filter for the classical conditional density. We write

ρ(t, Q, Q′) = ∫ W (t, (Q + Q′)/2, P )exp(iP (Q − Q′))dP

Note that the Kushner­Kallianpur filter for the corresponding classical proba­
bilistic problem

dQ = P dt, dP = −U � (Q)dt − γP dt + σdB(t),

dY (t) = (2Re(a)Q(t) + 2Re(b)P (t))dt + σdV (t)


where B, V are independent identical Brownian motion processes is given by

dp(t, Q, P ) = L∗ p(t, Q, P )dt + ((αQ + βP )p(t, Q, P )

−(∫ (αQ + βP )p(t, Q, P )dQdP )p(t, Q, P ))(dY (t) − dt ∫ (αQ + βP )p(t, Q, P )dQdP )

where
α = 2Re(a), β = 2Re(b)
where
L∗ p = −P ∂Q p + U′(Q)∂P p + γ.∂P (P p) + (σ 2 /2)∂P^2 p
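A minimal simulation of this classical model (the state SDE together with the noisy observation record Y) is sketched below; it produces the trajectories on which a Kushner­Kallianpur or particle filter would operate. The double-well potential and the parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
gamma, sigma, dt, T = 0.5, 0.3, 1e-3, 10.0
alpha, beta = 1.0, 0.5                        # alpha = 2Re(a), beta = 2Re(b): illustrative values
U_prime = lambda Q: Q**3 - Q                  # example potential U(Q) = Q^4/4 - Q^2/2

n = int(T / dt)
Q = np.zeros(n); Pm = np.zeros(n); Y = np.zeros(n)
for k in range(n - 1):
    dB, dV = rng.normal(0, np.sqrt(dt), size=2)          # independent Brownian increments
    Q[k+1] = Q[k] + Pm[k]*dt
    Pm[k+1] = Pm[k] - U_prime(Q[k])*dt - gamma*Pm[k]*dt + sigma*dB
    Y[k+1] = Y[k] + (alpha*Q[k] + beta*Pm[k])*dt + sigma*dV
# Y[k] is the measurement record fed to the Kushner-Kallianpur (or Belavkin) filter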

Let us see how this Kushner­Kallianpur equation translates in the quantum case
using the Wigner distribution W in place of p. As noted earlier, the equation

ρ� (t) = −i[H, ρ(t)]

is equivalent to

∂t W (t, Q, P ) = (−P/m)∂Q W (t, Q, P ) + U′(Q)∂P W (t, Q, P )

+ Σ_{k≥1} (−h2 /4)^k ((2k + 1)!)^{−1} U (2k+1) (Q)∂P^{2k+1} W (t, Q, P ) − − − (a)

while the Lindblad correction term to (a) is (−1/2) times

LL∗ ρ + ρLL∗ − 2L∗ ρL

= (aQ + bP )(āQ + ¯bP )ρ + ρ(aQ + bP )(āQ + ¯bP ) − 2(āQ + ¯bP )ρ(aQ + bP )


which in the position representation has a kernel of the form
[−|b|2 (∂Q^2 + ∂Q′^2 + 2∂Q ∂Q′ ) − |a|2 (Q − Q′)2

+(c1 Q + c2 Q′ + c3 )∂Q + (c4 Q + c5 Q′ + c6 )∂Q′ ]ρ(t, Q, Q′)


This gives a Lindblad correction to (a) of the form
(σ1^2 /2)∂Q^2 W (t, Q, P ) + (σ2^2 /2)∂P^2 W (t, Q, P )

+d1 ∂Q ∂P W (t, Q, P ) + (d3 Q + d4 P + d5 )∂Q W (t, Q, P )


Thus, the master/Lindblad equation (in the absence of measurements) has the
form
∂t W (t, Q, P ) = (−P/m)∂Q W (t, Q, P ) + U′(Q)∂P W (t, Q, P )

+ Σ_{k≥1} (−h2 /4)^k ((2k + 1)!)^{−1} U (2k+1) (Q)∂P^{2k+1} W (t, Q, P )

+(σ1^2 /2)∂Q^2 W (t, Q, P ) + (σ2^2 /2)∂P^2 W (t, Q, P )
+d1 ∂Q ∂P W (t, Q, P ) + (d3 Q + d4 P + d5 )∂Q W (t, Q, P )
Remark: Suppose b = 0 and a is replaced by a/h. Then, taking into account
the fact that in general units, the factor exp(iP (Q − Q� )) must be replaced by
exp(iP (Q − Q� ))/h) in the expression for ρ(t, Q, Q� ) in terms of W (t, Q, P ), we
get the Lindblad correction as

(|a|2 /2)(Q − Q� )2 ρ(t, Q, Q� )

or equivalently in the Wigner­distribution domain as

(|a|2 /2)∂P2 W (t, Q, P )



and the master equation simplifies to

∂t W (t, Q, P ) = (−P/m)∂Q W (t, Q, P ) + U′(Q)∂P W (t, Q, P ) + (|a|2 /2)∂P^2 W (t, Q, P )

+ Σ_{k≥1} (−h2 /4)^k ((2k + 1)!)^{−1} U (2k+1) (Q)∂P^{2k+1} W (t, Q, P )

This equation clearly displays the classical terms and the quantum correction
terms.
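As a rough illustration (not from the text), the simplified master equation can be integrated on a phase-space grid by explicit finite differences; for a quadratic potential the quantum correction series vanishes, so only the classical drift terms and the |a|²/2 momentum diffusion remain. The grid sizes, the potential, the mass and the value of |a|² below are illustrative assumptions.

```python
# A minimal sketch: explicit finite-difference integration of
#   dW/dt = (-P/m) dW/dQ + U'(Q) dW/dP + (|a|^2/2) d^2W/dP^2 .
# For U(Q) = Q^2/2 the U^(2k+1) correction terms vanish, so this truncation is exact there.
import numpy as np

nQ, nP = 128, 128
Q = np.linspace(-6, 6, nQ); P = np.linspace(-6, 6, nP)
dQ, dP = Q[1] - Q[0], P[1] - P[0]
QQ, PP = np.meshgrid(Q, P, indexing="ij")
m, a2 = 1.0, 0.2                                    # mass and |a|^2 (assumptions)
Uprime = QQ                                          # U(Q) = Q^2/2  =>  U'(Q) = Q
W = np.exp(-((QQ - 1.0) ** 2 + PP ** 2))             # initial Wigner function (Gaussian blob)
W /= W.sum() * dQ * dP

def ddQ(F):   # central differences with zero boundary values (assumption)
    G = np.zeros_like(F); G[1:-1, :] = (F[2:, :] - F[:-2, :]) / (2 * dQ); return G
def ddP(F):
    G = np.zeros_like(F); G[:, 1:-1] = (F[:, 2:] - F[:, :-2]) / (2 * dP); return G
def d2dP2(F):
    G = np.zeros_like(F); G[:, 1:-1] = (F[:, 2:] - 2 * F[:, 1:-1] + F[:, :-2]) / dP ** 2; return G

dt = 2e-4
for _ in range(2000):
    W = W + dt * ((-PP / m) * ddQ(W) + Uprime * ddP(W) + 0.5 * a2 * d2dP2(W))
print("mean Q =", (QQ * W).sum() * dQ * dP)
```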

Exercise: Now express the terms L∗ρ(t), ρ(t)L and Tr(ρ(t)(L + L∗)) in terms of the Wigner distribution for ρ(t), assuming L = aQ + bP, and complete the formulation of the Belavkin filter in terms of the Wigner distribution. Compare with the corresponding classical Kushner-Kallianpur filter for the conditional density of (Q(t), P(t)) given noisy measurements of aQ(s) + bP(s), s ≤ t, by identifying the precise form of the quantum correction terms.
Chapter 17

String and Field Theory with Large Deviations, Stochastic Algorithm Convergence

17.1 String theoretic corrections to field theories
Let
$$X^\mu(\tau,\sigma) = x^\mu + p^\mu\tau - i\sum_{n\neq 0} a^\mu(n)\exp(in(\tau-\sigma))/n$$

be a string field where


aµ (−n) = aµ∗ (n)
and
[aµ (n), aν (m)] = η µν δ(n + m)
When this string interacts with an external c-number gauge field described by the antisymmetric potentials Bµν(X(τ, σ)), the total string Lagrangian is given by
$$L = (1/2)\,\eta^{\alpha\beta}X^\mu_{,\alpha}X_{\mu,\beta} - (1/2)\,B_{\mu\nu}\,\epsilon^{\alpha\beta}X^\mu_{,\alpha}X^\nu_{,\beta}$$
The string equations are given by
$$\eta^{\alpha\beta}X_{\mu,\alpha\beta} = -\epsilon^{\alpha\beta}\big(B_{\mu\nu}(X)X^\nu_{,\beta}\big)_{,\alpha} = -\epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)X^\rho_{,\alpha}X^\nu_{,\beta}$$
This is the string theoretic version of the following point particle theory for a charged particle in an electromagnetic field:
$$M X_{\mu,\tau\tau} = q\,F_{\mu\nu}(X)\,X^\nu_{,\tau}$$
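For comparison, the point-particle analogue can be integrated directly in proper time. The sketch below (not from the text) uses a classical RK4 step for a constant electromagnetic field tensor; the field configuration, signature convention and numerical values are illustrative assumptions.

```python
# A small sketch: proper-time integration of  M d^2X_mu/dtau^2 = q F_{mu nu} dX^nu/dtau
# for a constant F (pure magnetic field along z), signature (+,-,-,-).
import numpy as np

M, q, B = 1.0, 1.0, 1.0
eta = np.diag([1.0, -1.0, -1.0, -1.0])
F = np.zeros((4, 4))                 # F_{mu nu}: constant magnetic field along z (assumption)
F[1, 2], F[2, 1] = -B, B
F_mixed = eta @ F                    # F^mu_nu = eta^{mu rho} F_{rho nu} (eta is its own inverse)

def accel(V):
    # dV^mu/dtau = (q/M) F^mu_nu V^nu
    return (q / M) * F_mixed @ V

X = np.zeros(4)
V = np.array([np.sqrt(1.25), 0.5, 0.0, 0.0])   # initial 4-velocity, normalized so V.eta.V = 1
dtau = 1e-3
for _ in range(5000):                           # classical RK4 step in proper time
    k1 = accel(V); k2 = accel(V + 0.5 * dtau * k1)
    k3 = accel(V + 0.5 * dtau * k2); k4 = accel(V + dtau * k3)
    V_new = V + (dtau / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    X = X + 0.5 * dtau * (V + V_new)
    V = V_new
print("V.eta.V =", V @ eta @ V)  # stays close to 1 since F is antisymmetric
```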


In the case when the field Bµν,ρ is a constant, this gives corrections to the string
field that are quadratic functions of the {aµ (n)}:

Xµ = Xµ(0) + Xµ(1)

where
$$X^{(0)}_\mu = x_\mu + p_\mu\tau - i\sum_{n\neq 0} a_\mu(n)\exp(in(\tau - \sigma))/n - i\sum_{n\neq 0} b_\mu(n)\exp(in(\tau + \sigma))/n$$

and
$$\eta^{\alpha\beta}X^{(1)}_{\mu,\alpha\beta} = -\epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)\,X^{(0)\rho}_{,\alpha}X^{(0)\nu}_{,\beta}$$
$$= \epsilon^{\alpha\beta}B_{\mu\nu,\rho}\sum_{n,m}\big(a^\rho(n)b^\nu(m)\,u_\alpha v_\beta\,\exp(i(n(\tau-\sigma) + m(\tau+\sigma))) + b^\rho(n)a^\nu(m)\,v_\alpha u_\beta\,\exp(i(n(\tau+\sigma) + m(\tau-\sigma)))\big)$$
where
$$(u_\alpha)_{\alpha=0,1} = (1, -1),\qquad (v_\alpha)_{\alpha=0,1} = (1, 1)$$
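A small symbolic check (not from the text) of this mode-by-mode structure: applying the world-sheet wave operator to a source mode exp(i(n(τ − σ) + m(τ + σ))) produces the factor −4nm, so for constant B_{µν,ρ} the first-order correction X^(1) is obtained by dividing each source amplitude by −4nm. The sympy snippet below only verifies this piece of algebra, not the full string dynamics.

```python
# Verify that (d_tau^2 - d_sigma^2) exp(i(n(tau-sigma)+m(tau+sigma))) = -4 n m * (the mode),
# so a unit-amplitude source mode has particular solution  mode / (-4 n m).
import sympy as sp

tau, sigma = sp.symbols('tau sigma', real=True)
n, m = sp.symbols('n m', integer=True, nonzero=True)
mode = sp.exp(sp.I * (n * (tau - sigma) + m * (tau + sigma)))
box = lambda f: sp.diff(f, tau, 2) - sp.diff(f, sigma, 2)

factor = sp.simplify(box(mode) / mode)
print(factor)                              # -> -4*m*n
X1_mode = mode / factor                    # particular solution for a unit-amplitude source
print(sp.simplify(box(X1_mode) - mode))    # -> 0, confirming the particular solution
```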

17.2 Large deviations problems in string theory


Consider first the motion of a point particle with Poisson noise. Its equation of
motion is

dX(t) = V(t)dt, dV(t) = −γV(t)dt − U′(X(t))dt + σε dN(t)

The rate function of the Poisson process is computed as follows. Let λ(ε) = λ/ε be the mean arrival rate of the Poisson process. Then
$$M_\epsilon(f) = E\Big[\exp\Big(\epsilon\int_0^T f(t)\,dN(t)\Big)\Big] = \exp\Big((\lambda/\epsilon)\int_0^T(\exp(\epsilon f(t)) - 1)\,dt\Big)$$
Thus the Gartner-Ellis logarithmic moment generating function of the family of processes ε·N(·) over the time interval [0, T] is given by
$$\Lambda(f) = \epsilon\,\log(M_\epsilon(f/\epsilon)) = \lambda\int_0^T(\exp(f(t)) - 1)\,dt$$

So the rate function of this family of processes is given by
$$I_T(\xi) = \sup_f\Big(\int_0^T f(t)\,\xi'(t)\,dt - \Lambda(f)\Big)$$

$$= \int_0^T \sup_x\big(\xi'(t)x - \lambda(\exp(x) - 1)\big)\,dt = \int_0^T \eta(\xi'(t))\,dt$$

where
η(y) = supx (xy − λ(exp(x) − 1))
By the contraction principle, the rate function of the process {X(t) : 0 ≤ t ≤ T} is given by
$$J_T(X) = \int_0^T \eta\big(X''(t) + \gamma X'(t) + U'(X(t))\big)\,dt$$

assuming σ = 1.
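For the Poisson case the Legendre transform can be carried out in closed form: for y > 0 the supremum is attained at x = log(y/λ), giving η(y) = y log(y/λ) − y + λ, while η(y) = +∞ for y < 0. The snippet below (not from the text) checks this closed form against a brute-force maximization; the value of λ is an illustrative assumption.

```python
# Numerical check of  eta(y) = sup_x ( x*y - lambda*(exp(x) - 1) )  against its closed form.
import numpy as np

lam = 2.0                                     # illustrative arrival-rate value
xs = np.linspace(-20, 10, 400001)             # brute-force grid for the supremum
for y in [0.5, 1.0, 2.0, 5.0]:
    eta_numeric = np.max(xs * y - lam * (np.exp(xs) - 1.0))
    eta_closed = y * np.log(y / lam) - y + lam
    print(y, eta_numeric, eta_closed)         # the two values agree to grid accuracy
```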
Application to control: let the potential U(x|θ) have some control parameters θ. We choose θ so that
$$\inf_{\|X - X_d\| > \delta} J_T(X)$$
is as large as possible. This would guarantee that the probability of a deviation of X from the desired noiseless trajectory X_d (obtained with θ = θ0) by an amount more than δ is as small as possible. To proceed further, we write

e(t) = X(t) − Xd (t)

and then, with θ = θ0 + δθ,
$$\eta(X'' + \gamma X' + U'(X|\theta)) = \eta(e'' + \gamma e' + U'(X|\theta) - U'(X_d|\theta_0))$$
$$\approx \eta\big(e'' + \gamma e' + U''(X_d|\theta_0)e + U_1(X_d|\theta_0)^T\delta\theta + \delta\theta^T U_2(X_d|\theta_0)\,\delta\theta\big)$$
where
$$U_1(X_d|\theta_0) = \partial_\theta U'(X_d|\theta_0),\qquad U_2(X_d|\theta_0) = (1/2)\,\partial_\theta\partial_\theta^T U'(X_d|\theta_0)$$
In particular, consider the string field equations in the presence of a Poisson
source:
∂02 X µ − ∂12 X µ = fkµ (τ, σ)dN k (τ )/dτ
Here
σ 0 = τ, σ 1 = σ
By writing down the Fourier series expansion

$$X^\mu(\tau,\sigma) = \sum_n a^\mu(n)\exp(in(\tau - \sigma)) + \delta X^\mu(\tau,\sigma)$$

where δX µ is the Poisson contribution, evaluate the rate function of the field
δX µ and hence calculate the approximate probability that the quantum average
of the string field in a coherent state will deviate from its unforced value by an
amount more than a threshold over a given time duration with the spatial region
extending over the entire string.

17.3 Convergence analysis of the LMS algorithm


Let X(n) ∈ R^N and d(n) ∈ R, n ∈ Z, be jointly i.i.d. Gaussian, i.e., the pairs (X(n), d(n)), n ∈ Z, form an i.i.d. zero-mean jointly Gaussian sequence. Let
E(X(n)X(n)T ) = R, E(X(n)d(n)) = r
The LMS algorithm is

h(n + 1) = h(n) − µ.∂h(n) (d(n) − h(n)T X(n))2

= (I − 2µX(n)X(n)T )h(n) + 2µd(n)X(n)


h(n) is a function of h(0), (X(k), d(k)), k ≤ n − 1 and is thus independent of
(X(n), d(n)). Then taking means with λ(n) = E(h(n)), we get

λ(n + 1) = (I − 2µR)λ(n) + 2µr

and hence
n−1

λ(n) = (I − 2µR)n λ(0) + 2µ (I − 2µR)n−k−1 2µr
k=0

= (I − 2µR) λ(0) + (I − (I − 2µR)n )h0


n

where
h0 = R−1 r
is the optimal Wiener solution. It follows that if all the eigenvalues of I − 2µR
are smaller than unity in magnitude, then

limn→∞ λ(n) = h0

However, we shall show that Cov(h(n)) does not converge to zero. It converges to a non-zero matrix. We write

δh(n) = h(n) − λ(n)

and then get

λ(n + 1) + δh(n + 1) = (I − 2µX(n)X(n)T )(λ(n) + δh(n)) + 2µd(n)X(n)

or equivalently,

δh(n+1) = ((I−2µX(n)X(n)T )
−(I−2µR))λ(n)+(I−2µX(n)X(n)T )δh(n)+2µ(d(n)X(n)−r)
= −2µ(X(n)X(n)T − R)λ(n) + (I − 2µX(n)X(n)T )δh(n) + 2µ(d(n)X(n) − r)
and hence
E(δh(n + 1) ⊗ δh(n + 1)) =

4µ2 E[(X(n)X(n)T − R) ⊗ (X(n)X(n)T − R)](λ(n) ⊗ λ(n))


+E[(I − 2µX(n)X(n)T ) ⊗ (I − 2µX(n)X(n)T )]E(δh(n) ⊗ δh(n))
+4µ2 E[(d(n)X(n) − r) ⊗ (d(n)X(n) − r)]
−4µ2 (E[(X(n)X(n)T − R) ⊗ (d(n)X(n) − r)] + E[(d(n)X(n) − r) ⊗ (X(n)X(n)T − R)])λ(n)
Note that so far no assumption regarding Gaussianity of the process (d(n), X(n))
has been made. We’ve only assumed that this process is iid. Now making the
Gaussianity assumption gives us

E(Xi (n)Xj (n)Xk (n)Xm (n)) = Rij Rkm + Rik Rjm + Rim Rjk

Exercise: Using this assumption, evaluate

limn→∞ E(δh(n)δh(n)T )

Note that in the general non-Gaussian i.i.d. case, if E(δh(n) ⊗ δh(n)) converges, say to V, then the above equation implies that
$$V = \big(I - E[(I - 2\mu X(0)X(0)^T)\otimes(I - 2\mu X(0)X(0)^T)]\big)^{-1}\Big[4\mu^2 E[(X(0)X(0)^T - R)\otimes(X(0)X(0)^T - R)](h_0\otimes h_0)$$
$$+\,4\mu^2 E[(d(0)X(0) - r)\otimes(d(0)X(0) - r)] - 4\mu^2\big(E[(X(0)X(0)^T - R)\otimes(d(0)X(0) - r)] + E[(d(0)X(0) - r)\otimes(X(0)X(0)^T - R)]\big)h_0\Big]$$
It should be noted that a sufficient condition for this convergence to occur is that all the eigenvalues of the real symmetric matrix

E[(I − 2µX(0)X(0)T) ⊗ (I − 2µX(0)X(0)T)]

be smaller than unity in magnitude, which also guarantees that I minus this matrix is invertible.
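The following Monte Carlo sketch (not from the text) checks these statements on a small example: the empirical mean of h(n) approaches the Wiener solution h0 = R⁻¹r, while the steady-state covariance of h(n) stays non-zero and is close to the Kronecker fixed-point formula for V. The joint model for (X(n), d(n)), the step size µ and the sample sizes are illustrative assumptions, and the expectations entering the formula for V are estimated by sampling rather than via the Gaussian fourth-moment identity.

```python
# LMS mean convergence and steady-state covariance, compared with the fixed-point formula for V.
import numpy as np

rng = np.random.default_rng(1)
N, mu = 3, 0.01
A0 = rng.normal(size=(N, N)); R = A0 @ A0.T + N * np.eye(N)   # covariance of X (assumption)
w_true = rng.normal(size=N); sigma_d = 0.5                     # d = w_true.X + noise (assumption)
r = R @ w_true
h0 = np.linalg.solve(R, r)                                     # Wiener solution
Rchol = np.linalg.cholesky(R)

def sample(n):
    X = rng.normal(size=(n, N)) @ Rchol.T
    d = X @ w_true + sigma_d * rng.normal(size=n)
    return X, d

# estimate the Kronecker expectations entering the fixed-point formula for V
M = 100000
Xs, ds = sample(M)
E_AA = np.zeros((N*N, N*N)); E_BB = np.zeros((N*N, N*N))
E_cc = np.zeros(N*N); E_Bc = np.zeros((N*N, N)); E_cB = np.zeros((N*N, N))
for X, d in zip(Xs, ds):
    XXt = np.outer(X, X)
    Amat = np.eye(N) - 2*mu*XXt
    Bmat = XXt - R
    c = d*X - r
    E_AA += np.kron(Amat, Amat); E_BB += np.kron(Bmat, Bmat)
    E_cc += np.kron(c, c)
    E_Bc += np.kron(Bmat, c.reshape(-1, 1)); E_cB += np.kron(c.reshape(-1, 1), Bmat)
E_AA /= M; E_BB /= M; E_cc /= M; E_Bc /= M; E_cB /= M

rhs = 4*mu**2*(E_BB @ np.kron(h0, h0)) + 4*mu**2*E_cc - 4*mu**2*((E_Bc + E_cB) @ h0)
V_theory = np.linalg.solve(np.eye(N*N) - E_AA, rhs).reshape(N, N)

# long LMS run: empirical mean and covariance of h(n) in steady state
steps, burn = 400000, 100000
Xs, ds = sample(steps)
h = np.zeros(N); hs = []
for X, d in zip(Xs, ds):
    h = h + 2*mu*(d - h @ X)*X          # h(n+1) = (I - 2 mu X X^T) h(n) + 2 mu d X
    hs.append(h.copy())
hs = np.array(hs[burn:])
print("mean(h):", hs.mean(axis=0), " h0:", h0)
print("empirical Cov(h):\n", np.cov(hs.T, bias=True))
print("fixed-point V:\n", V_theory)
```

For a small step size µ the two covariance estimates should be close, while making µ larger drives them apart and eventually violates the eigenvalue condition above.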

17.4 String interacting with scalar field, gauge field and a gravitational field
The total action is
$$S[X,\Phi,g] = (1/2)\int \eta^{\alpha\beta}g_{\mu\nu}(X)X^\mu_{,\alpha}X^\nu_{,\beta}\,d^2\sigma + \int \Phi(X)\,d^2\sigma + \int \epsilon^{\alpha\beta}X^\mu_{,\alpha}X^\nu_{,\beta}B_{\mu\nu}(X)\,d^2\sigma$$

The string field equations are
$$-\eta^{\alpha\beta}\big(g_{\mu\nu}(X)X^\nu_{,\beta}\big)_{,\alpha} + \Phi_{,\mu}(X) - \epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)X^\rho_{,\alpha}X^\nu_{,\beta} = 0$$

In flat space-time, g_{µν} = η_{µν} and the approximate string equations simplify to
$$\eta^{\alpha\beta}X_{\mu,\alpha\beta} = -\Phi_{,\mu}(X) + \epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)X^\rho_{,\alpha}X^\nu_{,\beta}$$

The propagator of the string
$$D_{\mu\nu}(\sigma|\sigma') = \langle T(X_\mu(\sigma)X_\nu(\sigma'))\rangle$$

thus depends on Φ and B_{µν,ρ}. Let us evaluate this propagator approximately:
$$\partial_0 D_{\mu\nu}(\sigma|\sigma') = \delta(\sigma^0 - \sigma'^0)\langle[X_\mu(\sigma), X_\nu(\sigma')]\rangle + \langle T(X_{\mu,0}(\sigma)X_\nu(\sigma'))\rangle = \langle T(X_{\mu,0}(\sigma)X_\nu(\sigma'))\rangle,$$
since the equal-time commutator of the string fields vanishes.
Further,
$$\partial_0^2 D_{\mu\nu}(\sigma|\sigma') = \delta(\sigma^0 - \sigma'^0)\langle[X_{\mu,0}(\sigma), X_\nu(\sigma')]\rangle + \langle T(X_{\mu,00}(\sigma)X_\nu(\sigma'))\rangle$$
$$= -i\delta^2(\sigma - \sigma') + \partial_1^2\langle T(X_\mu(\sigma)X_\nu(\sigma'))\rangle + \Big\langle T\Big(\big(-\Phi_{,\mu}(X(\sigma)) + \epsilon^{\alpha\beta}B_{\mu\gamma,\rho}(X(\sigma))X^\rho_{,\alpha}(\sigma)X^\gamma_{,\beta}(\sigma)\big)X_\nu(\sigma')\Big)\Big\rangle$$
Thus our exact string propagator equation in the presence of the scalar and gauge fields becomes
$$\Box D_{\mu\nu}(\sigma|\sigma') = -i\delta^2(\sigma - \sigma') - \langle T(\Phi_{,\mu}(X(\sigma))X_\nu(\sigma'))\rangle + \epsilon^{\alpha\beta}\langle T(B_{\mu\gamma,\rho}(X(\sigma))X^\rho_{,\alpha}(\sigma)X^\gamma_{,\beta}(\sigma)X_\nu(\sigma'))\rangle$$
We now make approximations by expanding the scalar and gauge fields around the centre x^µ of the string position. This approximation gives
$$\langle T(\Phi_{,\mu}(X(\sigma))X_\nu(\sigma'))\rangle = \Phi_{,\mu\rho}(x)\,\langle T(X^\rho(\sigma)X_\nu(\sigma'))\rangle = \Phi_{,\mu\rho}(x)\,D^{(0)\rho}{}_{\nu}(\sigma|\sigma')$$
where D^{(0)µν} is the unperturbed string propagator, given by η^{µν} ln(|σ − σ′|), where
$$|\sigma - \sigma'| = \sqrt{(\sigma^0 - \sigma'^0)^2 - (\sigma^1 - \sigma'^1)^2}$$
Likewise,
$$\langle T(B_{\mu\gamma,\rho}(X(\sigma))X^\rho_{,\alpha}(\sigma)X^\gamma_{,\beta}(\sigma)X_\nu(\sigma'))\rangle = B_{\mu\gamma,\rho\delta}(x)\,\langle T(X^\delta(\sigma)X^\rho_{,\alpha}(\sigma)X^\gamma_{,\beta}(\sigma)X_\nu(\sigma'))\rangle$$
Since three out of the four factors of X or its partial derivatives in this last expression are evaluated at the same point, it is clear that it cannot contribute

anything to the propagator except singularities. So we neglect this term. Thus


our corrected propagator equation becomes

$$\Box D_{\mu\nu}(\sigma|\sigma') = -i\delta^2(\sigma - \sigma') - \Phi_{,\mu\rho}(x)\,D^{(0)\rho}{}_{\nu}(\sigma|\sigma')$$


Note that this is a local propagator equation since it depends on the coordinates
x of the centre of the string. We solve this to get

$$D_{\mu\nu}(\sigma - \sigma') = D^{(0)}_{\mu\nu}(\sigma - \sigma') - \Phi_{,\mu\nu}(x)\int D^{(0)}(\sigma - \sigma'')\,D^{(0)}(\sigma'' - \sigma')\,d^2\sigma''$$
where
$$D^{(0)}(\sigma) = \ln|\sigma|$$
Now we raise the question of what field equations should be satisfied by B_{µν} and Φ so that the string action has conformal invariance. For solving this problem, we must replace the above action by
$$S[X,\Phi,g] = (1/2)\int \eta^{\alpha\beta}\exp(\phi(\sigma))\,g_{\mu\nu}(X)X^\mu_{,\alpha}X^\nu_{,\beta}\,d^2\sigma + \int \exp(\phi(\sigma))\Phi(X)\,d^2\sigma + \int \epsilon^{\alpha\beta}\exp(\phi(\sigma))X^\mu_{,\alpha}X^\nu_{,\beta}B_{\mu\nu}(X)\,d^2\sigma$$

since this corresponds to using the string sheet metric exp(φ(σ))η_{αβ} and correspondingly replacing the string sheet area element d²σ by the invariant area element exp(φ(σ))d²σ. For small φ, conformal invariance then requires the quantum average of the above action to be independent of φ. Taking into account the larger singularity involved in the product of four string field terms at the same point on the string world sheet as compared with the product of two string field terms, this means that the quantity

$$\eta^{\alpha\beta}g_{\mu\nu,\rho\gamma}(x)\,\langle X^\rho(\sigma)X^\gamma(\sigma)X^\mu_{,\alpha}(\sigma)X^\nu_{,\beta}(\sigma)\rangle + \Phi_{,\mu\nu}(x)\,\langle X^\mu(\sigma)X^\nu(\sigma)\rangle$$
$$+\,\epsilon^{\alpha\beta}B_{\mu\nu,\rho}(x)\,\langle X^\rho(\sigma)X^\mu_{,\alpha}(\sigma)X^\nu_{,\beta}(\sigma)\rangle + \epsilon^{\alpha\beta}B_{\mu\nu,\rho\gamma}(x)\,\langle X^\rho(\sigma)X^\gamma(\sigma)X^\mu_{,\alpha}(\sigma)X^\nu_{,\beta}(\sigma)\rangle$$
must vanish.

Another way to approach this problem is to start with the string equations of motion:
$$\Box X_\mu = -\Phi_{,\mu}(X) + \epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)X^\rho_{,\alpha}X^\nu_{,\beta}$$
and express it in terms of perturbation theory by writing
$$X^\mu = x^\mu + \delta X^\mu = x^\mu + Y^\mu$$

as
$$\Box Y^\mu = -\Phi_{,\mu\nu}(x)Y^\nu + \epsilon^{\alpha\beta}\big(B_{\mu\nu,\rho}(x) + B_{\mu\nu,\rho\gamma}(x)Y^\gamma\big)Y^\rho_{,\alpha}Y^\nu_{,\beta}$$
giving
$$Y_\mu = Z_\mu - \Phi_{,\mu\nu}(x)\,D^{(0)}\cdot Z^\nu + \epsilon^{\alpha\beta}B_{\mu\nu,\rho}(x)\,D^{(0)}\cdot(Z^\rho_{,\alpha}Z^\nu_{,\beta}) + \epsilon^{\alpha\beta}B_{\mu\nu,\rho\gamma}(x)\,D^{(0)}\cdot(Z^\gamma Z^\rho_{,\alpha}Z^\nu_{,\beta})$$
where x + Z is the string field in the absence of any interactions. With this
expansion in mind, we then require that

$$\eta^{\alpha\beta}g_{\mu\nu,\rho\gamma}(x)\,\langle Y^\rho(\sigma)Y^\gamma(\sigma)Y^\mu_{,\alpha}(\sigma)Y^\nu_{,\beta}(\sigma)\rangle + \Phi_{,\mu\nu}(x)\,\langle Y^\mu(\sigma)Y^\nu(\sigma)\rangle$$
$$+\,\epsilon^{\alpha\beta}B_{\mu\nu,\rho}(x)\,\langle Y^\rho(\sigma)Y^\mu_{,\alpha}(\sigma)Y^\nu_{,\beta}(\sigma)\rangle + \epsilon^{\alpha\beta}B_{\mu\nu,\rho\gamma}(x)\,\langle Y^\rho(\sigma)Y^\gamma(\sigma)Y^\mu_{,\alpha}(\sigma)Y^\nu_{,\beta}(\sigma)\rangle$$
should vanish.

Another problem in information theory: Let p_i, i = 1, 2, ..., N, be a probability distribution, and let p↓_i, i = 1, 2, ..., N, denote the permutation of these probabilities in decreasing order. Consider the set
$$E = \{i : p_i > e^b\}$$

Clearly, for 0 ≤ s ≤ 1, the size |E| of this set satisfies
$$|E| \le \sum_i (p_i/e^b)^{1-s} = \exp(-(1-s)b + \psi(s)) = \exp(sR) \le \exp(R)$$

where
$$\psi(s) = \log\sum_i p_i^{1-s}$$

and where
b = b(s, R) = (ψ(s) − sR)/(1 − s)
and it is clear that
$$\lim_{s\to 0}\psi(s)/s = H(p) = -\sum_i p_i\log(p_i)$$

Hence, if R > H(p), then by choosing s positive and sufficiently close to zero we have ψ(s) − sR < 0. If p^(n) denotes the product probability measure of the p_i on {1, 2, ..., N}^n, then with
$$b_n(s, R) = n(\psi(s) - sR)/(1 - s)$$
it follows that b_n(s, R) → −∞, hence exp(b_n(s, R)) → 0, and further
$$|\{p^{(n)} > \exp(b_n(s, R))\}| \le \exp(nsR) \le \exp(nR)$$



Now define
$$P(p, L) = \sum_{i=1}^{L} p^{\downarrow}_i$$

Then, it is clear that since

|{p > exp(b(s, R))}| ≤ exp(R)

for all s ∈ [0, 1], we have

1 − P (p, exp(R)) ≤ exp(b(s, R))

and this implies for the product probability distribution,

1 − P (p(n) , exp(nR)) ≤ exp(nb(s, R)) = exp(n(ψ(s) − sR)/(1 − s)) → 0, n → ∞

provided that s is chosen sufficiently close to zero and that R > H(p). In other
words, we have proved that if R > H(p), then

P (p(n) , exp(nR)) → 1
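A small numerical illustration (not from the text): for a three-letter source, summing the largest ⌈exp(nR)⌉ probabilities of the product distribution already shows the increase towards 1 for moderate n when R > H(p). The particular distribution, and the rate R chosen between H(p) and log 3 so that exp(nR) remains a vanishing fraction of the 3^n outcomes, are illustrative assumptions.

```python
# P(p^(n), exp(nR)) increases towards 1 when R exceeds the entropy H(p).
import numpy as np
from math import ceil, exp

p = np.array([0.7, 0.2, 0.1])
H = float(-np.sum(p * np.log(p)))    # entropy in nats, about 0.80
R = 1.0                              # H(p) < R < log(3), so exp(nR)/3^n still tends to 0
probs = np.array([1.0])
for n in range(1, 13):
    probs = np.kron(probs, p)        # product distribution p^(n) on {1,2,3}^n
    if n % 2 == 0:
        top = np.sort(probs)[::-1]
        L = min(ceil(exp(n * R)), top.size)
        print(n, L, 3**n, round(float(top[:L].sum()), 4))   # last column is P(p^(n), exp(nR))
```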
Motion of a string field in the presence of an external gauge field with potentials that are linear in the space-time coordinates. Evaluation of the low energy field action in terms of the statistics of the string field.

$$L = (1/2)\,\partial^\alpha X^\mu\,\partial_\alpha X_\mu + \epsilon^{\alpha\beta}B_{\mu\nu}(X)\,\partial_\alpha X^\mu\,\partial_\beta X^\nu$$

The equations of motion are

$$\Box X_\mu = \partial^\alpha\partial_\alpha X_\mu = \epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)\,\partial_\beta X^\nu\,\partial_\alpha X^\rho$$

Writing
$$B_{\mu\nu,\rho}(X) = \sum_n A_n(\mu\nu\rho|\mu_1, ..., \mu_n)X^{\mu_1}X^{\mu_2}\cdots X^{\mu_n}$$
we find that for the expansion
$$X^\mu(\tau,\sigma) = x^\mu - i\sum_{n\neq 0} a(n)^\mu\exp(inu_a\sigma^a)/n - i\sum_{n\neq 0} b(n)^\mu\exp(inv_a\sigma^a)/n$$
where
$$(u^a) = (1, -1),\qquad (v^a) = (1, 1)$$
we find that
$$\epsilon^{\alpha\beta}B_{\mu\nu,\rho}(X)\,\partial_\beta X^\nu\,\partial_\alpha X^\rho = \sum_{r,s,n_1,...,n_r,m_1,...,m_s,l_1,l_2}(-i)^{n_1+...+n_r+m_1+...+m_s}\,C_{rs}(\mu\nu\rho|\mu_1,...,\mu_r,\nu_1,...,\nu_s)$$
$$\times\Big[\prod_{k=1}^{r}\frac{a^{\mu_k}(n_k)}{n_k}\prod_{k=1}^{s}\frac{b^{\nu_k}(m_k)}{m_k}\Big]\big(a^\nu(l_1)b^\rho(l_2)\exp(i(l_1u + l_2v)\cdot\sigma) - b^\nu(l_1)a^\rho(l_2)\exp(i(l_1v + l_2u)\cdot\sigma)\big)\,u^T\epsilon v\,\exp(i((n_1+...+n_r)u + (m_1+...+m_s)v)\cdot\sigma)$$

where the coefficients C_{rs}(·) are determined in terms of A_n(µνρ|µ1, ..., µn). In particular, if the B_{µν,ρ} are constants, then this gauge term contributes only quadratic terms in the creation and annihilation operators and hence can be expressed as
$$\sum_{l_1,l_2} B_{\mu\nu,\rho}\,(u^T\epsilon v)\,\big(a^\nu(l_1)b^\rho(l_2)\exp(i(l_1u + l_2v)\cdot\sigma) - b^\nu(l_1)a^\rho(l_2)\exp(i(l_1v + l_2u)\cdot\sigma)\big)$$

In short, in this special case, we can express the gauge-field-interaction-corrected string field up to first order perturbation terms as
$$X^\mu(\sigma) = x^\mu + \sum_n a^\mu(n)f_n(\sigma) + F^\mu_{\nu\rho}\sum_{n,m} a^\nu(n)a^\rho(m)g_{nm}(\sigma) = x^\mu + \delta X^\mu(\sigma)$$
where
[aµ (n), aν (m)] = η µν δ(n + m)
Now consider a point field φ(x). Assume that its Lagrangian density is given by L(φ(x), φ_{,µ}(x)). The average action of this field, when string theoretic corrections are taken into account, evaluated in a coherent state |φ(u)⟩ of the string, is given by
$$\langle\phi(u)|\int L\big(\phi(x + \delta X(\sigma)),\,\phi_{,\mu}(x + \delta X(\sigma))\big)\,d\sigma\,d^4x\,|\phi(u)\rangle$$

We can evaluate this action up to quadratic terms in the string perturbation by using the Taylor approximations
$$\phi(x + \delta X(\sigma)) = \phi(x) + \phi_{,\mu}(x)\delta X^\mu(\sigma) + (1/2)\phi_{,\mu\nu}(x)\delta X^\mu(\sigma)\delta X^\nu(\sigma)$$
$$\phi_{,\mu}(x + \delta X(\sigma)) = \phi_{,\mu}(x) + \phi_{,\mu\nu}(x)\delta X^\nu(\sigma) + (1/2)\phi_{,\mu\nu\rho}(x)\delta X^\nu(\sigma)\delta X^\rho(\sigma)$$
and then use the following identities:

$$\langle\phi(u)|\delta X^\mu(\sigma)|\phi(u)\rangle = \sum_n \langle\phi(u)|a^\mu(n)|\phi(u)\rangle f_n(\sigma) + F^\mu_{\nu\rho}\sum_{n,m}\langle\phi(u)|a^\nu(n)a^\rho(m)|\phi(u)\rangle g_{nm}(\sigma)$$
while $\langle\phi(u)|\delta X^\mu(\sigma)\delta X^\nu(\sigma)|\phi(u)\rangle$ is expressible as a linear combination of

< φ(u)|aµ1 (n1 )aµ2 (n2 )|φ(u) >,

< φ(u)|aµ1 (n1 )aµ2 (n2 )aµ3 (n3 )|φ(u) >,


and
< φ(u)|aµ1 (n1 )aµ2 (n2 )aµ3 (n3 )aµ4 (n4 )|φ(u) >
Finally,
< φ(u)|aµ (n)|φ(u) >= uµ (n), uµ (−n) = ūµ (n)

For n + m ≠ 0,
$$\langle\phi(u)|a^\mu(n)a^\nu(m)|\phi(u)\rangle = u^\mu(n)u^\nu(m)$$
For n > 0,
$$\langle\phi(u)|a^\mu(n)a^\nu(-n)|\phi(u)\rangle = \eta^{\mu\nu} + u^\mu(n)u^\nu(-n)$$
and so on.

Problems on the Fokker-Planck equation

[1] Derive the quantum Fokker-Planck equation for the Wigner distribution of the position-momentum pair of a particle moving in a potential with quantum noise. The HPS evolution equation is

dU(t) = (−(iH + LL∗/2)dt + LdA − L∗dA∗)U(t)

so that the master equation, obtained as the equation for ρ_s(t) where

ρ_s(t) = Tr_2(U(t)(ρ_s(0) ⊗ |φ(0)⟩⟨φ(0)|)U(t)∗),

is given by

ρ_s′(t) = −i[H, ρ_s(t)] − (1/2)(LL∗ρ_s(t) + ρ_s(t)LL∗ − 2L∗ρ_s(t)L)

[2] Given a master equation of the above type, how would you adapt the parameters θ_k(t), k = 1, 2, ..., d, where
$$H = H_0 + \sum_{k=1}^{d}\theta_k(t)V_k,\qquad L = L_0 + \sum_{k=1}^{d}\theta_k(t)L_k,$$
so that the probability density ρ(t, q, q) in the position representation tracks a given pdf p_0(t, q)?

Hint: Express the master equation as
$$\rho'(t) = T_0(\rho(t)) + \sum_k\theta_k(t)T_k(\rho(t)) + \sum_{k,m}\theta_k(t)\theta_m(t)T_{km}(\rho(t))$$
where T_0, T_k, T_{km} are linear operators on the space of mixed states in the underlying Hilbert space. It then follows that ρ(t, q, q) evolves according to an equation that is affine plus quadratic in the θ_k(t), so these parameters can be chosen at each time to drive ρ(t, q, q) towards p_0(t, q), for instance by minimizing a squared tracking error at each step.
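The following is a minimal sketch (not from the text) of one way to use this decomposition, under the simplifying assumption that only H is controlled (L = L_0 fixed), so that the master equation is affine in θ and the quadratic operators T_{km} drop out. At each Euler step the parameters θ_k are chosen by linear least squares so that the position-space diagonal of ρ moves a small fraction of the way towards the target pdf p_0. The operators, dimensions and target pdf are all illustrative assumptions.

```python
# Least-squares tracking control of the diagonal of rho under an affine-in-theta master equation.
import numpy as np

rng = np.random.default_rng(0)
N, d = 8, 3
herm = lambda M: (M + M.conj().T) / 2
H0 = herm(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
L0 = 0.2 * (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
Vk = [herm(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) for _ in range(d)]
L0d = L0.conj().T

def T0(rho):    # uncontrolled generator, with the operator ordering used in the text
    return -1j * (H0 @ rho - rho @ H0) \
           - 0.5 * (L0 @ L0d @ rho + rho @ L0 @ L0d - 2 * L0d @ rho @ L0)

def Tk(V, rho):  # contribution of the control Hamiltonian theta_k V_k
    return -1j * (V @ rho - rho @ V)

p0 = np.ones(N) / N                      # target pdf: uniform occupation (assumption)
rho = np.zeros((N, N), complex); rho[0, 0] = 1.0
dt = 1e-3
for _ in range(4000):
    gap = p0 - np.real(np.diag(rho))
    drift = np.real(np.diag(T0(rho)))
    G = np.stack([np.real(np.diag(Tk(V, rho))) for V in Vk], axis=1)   # N x d sensitivities
    target = 0.05 * gap / dt - drift     # ask the controls to close 5% of the gap per step
    theta = np.clip(np.linalg.lstsq(G, target, rcond=None)[0], -20, 20)
    rho = rho + dt * (T0(rho) + sum(t * Tk(V, rho) for t, V in zip(theta, Vk)))
    rho = herm(rho); rho /= float(np.real(np.trace(rho)))
print("final max |diag(rho) - p0| =", float(np.max(np.abs(np.real(np.diag(rho)) - p0))))
```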

A remark about supersymmetry, large deviations and stochastic control:


Suppose we are given a set of Bosonic and Fermionic fields and a Lagrangian functional of these fields such that this Lagrangian functional is invariant under global supersymmetry transformations. Suppose we now break the supersymmetry of this Lagrangian by adding terms to it that can be represented as a coupling between stochastic c-number fields and the Bosonic and Fermionic

fields appearing in the original Lagrangian. Then, we write down the equations of motion corresponding to this stochastically perturbed Lagrangian. The equations of motion will be stochastic partial differential equations which are non-supersymmetric because of the stochastic perturbations. Now we calculate the change in the modified action functional under an infinitesimal supersymmetry transformation and require that this change be bounded by a certain small number in the sense of the expected value of the modulus square, calculated by assuming that the Bosonic and Fermionic fields are solutions to the unperturbed supersymmetric equations of motion. This condition will impose a restriction on the amplitude of the c-number stochastic fields. Keeping this restriction, we then try to calculate, using the theory of large deviations, the probability that the solution of the stochastically perturbed field equations, which are non-supersymmetric, will deviate from the supersymmetric unperturbed solutions by an amount more than a certain threshold. We then repeat this calculation by adding error feedback control terms to the perturbed dynamical equations and choose the control coefficients so that the resulting deviation probability is minimized. This, in effect, amounts to attempting to restore supersymmetry by dynamic feedback control when the supersymmetry has been broken.
