
Complex Random Variables

Leigh J. Halliwell, FCAS, MAAA


______________________________________________________________________________

Abstract: Rarely have casualty actuaries needed, much less wanted, to work with complex numbers. One readily
could wisecrack about imaginary dollars and creative accounting. However, complex numbers are well
established in mathematics; they even provided the impetus for abstract algebra. Moreover, they are essential in
several scientific fields, most notably in electromagnetism and quantum mechanics, the two fields to which most
of the sparse material about complex random variables is tied. This paper will introduce complex random
variables to an actuarial audience, arguing that complex random variables will eventually prove useful in the field
of actuarial science. First, it will describe the two ways in which statistical work with complex numbers differs
from that with real numbers, viz., in transjugation versus transposition and in rank versus dimension. Next, it
will introduce the mean and the variance of the complex random vector, and derive the distribution function of
the standard complex normal random vector. Then it will derive the general distribution of the complex normal
multivariate and discuss the behavior and moments of complex lognormal variables, a limiting case of which is the unit-circle random variable $W = e^{i\Theta}$ for real $\Theta$ uniformly distributed. Finally, it will suggest several foreseeable actuarial applications of the preceding theory, especially its application to linear statistical modeling.
Though the paper will be algebraically intense, it will require little knowledge of complex-function theory. But
some of that theory, viz., Cauchy’s theorem and analytic continuation, will arise in an appendix on the complex
moment generating function of a normal random multivariate.

Keywords: Complex numbers, matrices, and random vectors; augmented variance; lognormal and unit-circle
distributions; determinism; Cauchy-Riemann; analytic continuation
______________________________________________________________________________

1. INTRODUCTION
Even though their education has touched on algebra and calculus with complex numbers, most casualty actuaries would be hard-pressed to cite an actuarial use for numbers of the form $x + iy$. Their use in the discrete Fourier transformation (Klugman [1998], §4.7.1) is notable; however, many would view this as a trick or convenience, rather than as indicating any further usefulness. In this paper we will develop a probability theory for complex random variables and vectors, arguing that such a theory will eventually find actuarial uses. The development, lengthy and sometimes arduous, will take the following steps. Sections 2-4 will base complex matrices in certain real-valued matrices called "double-real." This serves the aim of our presentation, namely, to analogize from real-valued random variables and vectors to complex ones. Transposition and dimension in the real-valued
Casualty Actuarial Society E-Forum, Fall 2015 1



realm become transjugation and rank in the complex. These differences figure into the standard quadratic form of Section 5, where also the distribution of the standard complex normal random vector is derived. Section 6 will elaborate on the variance of a complex random vector, as well as introduce "augmented variance," i.e., the variance of a dyad whose second part is the complex conjugate of the first. Section 7 derives the formula for the distribution of the general complex normal multivariate. Of special interest to many casualty actuaries should be the treatment of the complex lognormal random vector in Section 8, an intuition into whose behavior Section 9 provides on a univariate or scalar level. Even further simplification in the next two sections leads to the unit-circle random variable, which is the only random variable with widespread deterministic effects. In Section 12 we adapt the linear statistical model to complex multivariates. Finally, Section 13 lists foreseeable applications of complex random variables. However, we believe their greatest benefit resides not in their concrete applications, but rather in their fostering abstractions of thought and imagination. Three appendices delve into mathematical issues too complicated for the body of the paper. Those who work on an advanced level with lognormal random variables should read Appendix A ("Real-Valued Lognormal Random Vectors"), regardless of their interest in complex random variables.

2. INVERTING COMPLEX MATRICES


Let the m×n complex matrix Z be composed of real and imaginary parts X and Y, i.e., $Z = X + iY$. Of course, X and Y also must be m×n. Since only square matrices have inverses, our purpose here requires that $m = n$. The complex matrix $W = A + iB$ is an inverse of Z if and only if $ZW = WZ = I_n$, where $I_n$ is the n×n identity matrix. Because such an inverse must be unique, we may say that $Z^{-1} = W$. Under what conditions does W exist?


First, define the conjugate of Z as $\overline{Z} = X - iY$. Since the conjugate of a product equals the product of the conjugates,¹ if Z is non-singular, then $\overline{Z}\,\overline{Z^{-1}} = \overline{ZZ^{-1}} = \overline{I_n} = I_n$. Similarly, $\overline{Z^{-1}}\,\overline{Z} = I_n$. Therefore, $\overline{Z}$ too is non-singular, and $\overline{Z}^{-1} = \overline{Z^{-1}}$. Moreover, if Z is non-singular, so too are $i^k Z$ and $i^k \overline{Z}$ for $k = 0, 1, 2, 3$. Therefore, the invertibility of $X+iY$, $-Y+iX$, $-X-iY$, $Y-iX$, $X-iY$, $Y+iX$, $-X+iY$, and $-Y-iX$ is true for all eight or true for none. Invertibility is no respecter of the real and imaginary parts.

Now if the inverse of Z is $W = A + iB$, then $I_n = (X+iY)(A+iB) = (A+iB)(X+iY)$. Expanding the first equality, we have:

$$\begin{aligned}
I_n &= (X+iY)(A+iB) \\
&= XA + iYA + iXB + i^2 YB \\
&= XA + iYA + iXB - YB \\
&= (XA - YB) + i(YA + XB)
\end{aligned}$$

Therefore, $ZW = I_n$ if and only if $XA - YB = I_n$ and $YA + XB = 0_{n\times n}$. We may combine the last two equations into the partitioned-matrix form:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} I_n \\ 0 \end{bmatrix}$$

Since $YA + XB = 0$ if and only if $-YA - XB = 0$, another form just as valid is:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} -B \\ A \end{bmatrix} = \begin{bmatrix} 0 \\ I_n \end{bmatrix}$$

We may combine these two forms into the balanced form:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix} = I_{2n}$$

¹ If Z and W are conformable to multiplication: $\overline{ZW} = \overline{(X+iY)(A+iB)} = \overline{XA - YB + i(XB+YA)} = XA - YB - i(XB+YA) = (X-iY)(A-iB) = \overline{Z}\,\overline{W}$

Therefore, $ZW = I_n$ if and only if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$. By a similar expansion of the last equality above, $WZ = I_n$ if and only if $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$. Hence, we conclude that the n×n complex matrix $Z = X+iY$ is non-singular, or has an inverse, if and only if the 2n×2n real-valued matrix $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is non-singular. Moreover, if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}^{-1}$ exists, it will have the form $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}$, and $Z^{-1}$ will equal $A + iB$.
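The equivalence just derived can be checked numerically. The following sketch (Python with NumPy; an illustration, not part of the original paper) inverts a complex matrix by inverting its double-real analogue and reading off A and B:

```python
import numpy as np

def double_real(Z):
    """Map the complex matrix Z = X + iY to its double-real analogue [[X, -Y], [Y, X]]."""
    X, Y = Z.real, Z.imag
    return np.block([[X, -Y], [Y, X]])

def inverse_via_double_real(Z):
    """Invert Z by inverting its double-real analogue and reading A, B off [[A, -B], [B, A]]."""
    n = Z.shape[0]
    M = np.linalg.inv(double_real(Z))
    A, B = M[:n, :n], M[n:, :n]
    return A + 1j * B

rng = np.random.default_rng(0)
Z = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
W = inverse_via_double_real(Z)
print(np.allclose(Z @ W, np.eye(3)))  # True
```

The result agrees with a direct complex inversion, which is the "two for the price of one" point of the next section.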

3. COMPLEX MATRICES AS DOUBLE-REAL MATRICES


That the problem of inverting an n×n complex matrix resolves into the problem of inverting a 2n×2n real-valued matrix suggests that with complex numbers one somehow gets "two for the price of one." It even hints of a relation between the general m×n complex matrix $Z = X+iY$ and the 2m×2n real-valued matrix $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$. If X and Y are m×n real matrices, we will call $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ a double-real matrix. The matrix is double in two senses: first, in that it involves two (same-sized and real-valued) matrices X and Y, and second, in that its right half is redundant, or reproducible from its left.

Returning to the hint above, we easily see an addition analogy:

$$X_1 + iY_1 + X_2 + iY_2 \;\Leftrightarrow\; \begin{bmatrix} X_1 & -Y_1 \\ Y_1 & X_1 \end{bmatrix} + \begin{bmatrix} X_2 & -Y_2 \\ Y_2 & X_2 \end{bmatrix}$$


And if $Z_1$ is m×n and $Z_2$ is n×p, so that the matrices are conformable to multiplication, then $Z_1 Z_2 = (X_1+iY_1)(X_2+iY_2) = (X_1X_2 - Y_1Y_2) + i(X_1Y_2 + Y_1X_2)$. This is analogous with the double-real multiplication:

$$\begin{bmatrix} X_1 & -Y_1 \\ Y_1 & X_1 \end{bmatrix}\begin{bmatrix} X_2 & -Y_2 \\ Y_2 & X_2 \end{bmatrix} = \begin{bmatrix} X_1X_2 - Y_1Y_2 & -(X_1Y_2 + Y_1X_2) \\ X_1Y_2 + Y_1X_2 & X_1X_2 - Y_1Y_2 \end{bmatrix}$$

Rather trivial is the analogy between the m×n complex zero matrix and the 2m×2n double-real zero matrix $\begin{bmatrix} 0_{m\times n} & -0 \\ 0 & 0 \end{bmatrix}$, as well as that between the n×n complex identity matrix and the 2n×2n double-real identity matrix $\begin{bmatrix} I_n & -0 \\ 0 & I_n \end{bmatrix}$.
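The addition and multiplication analogies say that the double-real map respects the matrix operations. A brief NumPy sketch (not part of the paper) confirms this:

```python
import numpy as np

def double_real(Z):
    # [[X, -Y], [Y, X]] for Z = X + iY
    X, Y = Z.real, Z.imag
    return np.block([[X, -Y], [Y, X]])

rng = np.random.default_rng(1)
Z1 = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
Z2 = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
Z3 = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

# The double-real map preserves addition and multiplication.
print(np.allclose(double_real(Z1 + Z2), double_real(Z1) + double_real(Z2)))  # True
print(np.allclose(double_real(Z1 @ Z3), double_real(Z1) @ double_real(Z3)))  # True
```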

The general 2m×2n double-real matrix may itself be decomposed into quasi-real and quasi-imaginary parts: $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix}$. And in the case of square matrices ($m = n$) this extends to the form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} Y & 0 \\ 0 & Y \end{bmatrix}$, wherein the double-real matrix $\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}$ is analogous with the imaginary unit, inasmuch as:

$$\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}^2 = \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix} = \begin{bmatrix} -I_n & 0 \\ 0 & -I_n \end{bmatrix} = (-1)\,I_{2n}$$

Finally, one of the most important theorems of linear algebra is that every m×n complex matrix $Z = X+iY$ may be reduced by invertible transformations to "canonical form" (Healy [1986], 32-34). In symbols, for every Z there exist non-singular matrices U and V such that:

$$U_{m\times m}\, Z_{m\times n}\, V_{n\times n} = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m\times n}$$

The m×n real matrix on the right side of the equation consists entirely of zeroes except for r instances of one along its main diagonal. Since invertible matrix operations can reposition the ones, it is further stipulated that the ones appear as a block in the upper-left corner. Although many reductions of Z to canonical form exist, the canonical forms themselves must all contain the same number of ones, r, which is defined as the rank of Z. Providing the matrices with real and imaginary parts, we have:

$$\begin{aligned}
U_{m\times m}\, Z_{m\times n}\, V_{n\times n} &= (P+iQ)(X+iY)(R+iS) \\
&= (PXR - QYR - PYS - QXS) + i\,(PXS - QYS + PYR + QXR) \\
&= A + iB \\
&= \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} + i\,0_{m\times n}
\end{aligned}$$

The double-real analogue to this is:

$$\begin{bmatrix} P & -Q \\ Q & P \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} R & -S \\ S & R \end{bmatrix} = \begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} & 0_{m\times n} \\ 0_{m\times n} & \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \end{bmatrix}$$

As shown in the previous section, $\begin{bmatrix} P & -Q \\ Q & P \end{bmatrix}$ is non-singular, or invertible, if and only if $U = P + iQ$ is non-singular; the same is true for $\begin{bmatrix} R & -S \\ S & R \end{bmatrix}$. Therefore, the rank of the double-real analogue of a complex matrix is twice the rank of the complex matrix. Moreover, the 2r instances of one correspond to r quasi-real and r quasi-imaginary instances. It is not possible for the contribution to the rank of a matrix to be real without its being imaginary, and vice versa.
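The rank-doubling claim is easy to verify numerically. A NumPy sketch (an illustration, not from the paper):

```python
import numpy as np

def double_real(Z):
    X, Y = Z.real, Z.imag
    return np.block([[X, -Y], [Y, X]])

rng = np.random.default_rng(2)
# Build a 4x5 complex matrix of rank 2 as a product of rank-2 factors.
F = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
G = rng.standard_normal((2, 5)) + 1j * rng.standard_normal((2, 5))
Z = F @ G
print(np.linalg.matrix_rank(Z))               # 2
print(np.linalg.matrix_rank(double_real(Z)))  # 4, twice the complex rank
```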


To conclude this section, there are extensive analogies between complex and double-real matrices, analogies so extensive that one who lacked either the confidence or the software to work with complex numbers could probably do a work-around with double-real matrices.²

4. COMPLEX MATRICES AND VARIANCE



If x is a real-valued n×1 random vector with mean µ, then $\Sigma = \mathrm{Var}[x] = E\left[(x-\mu)(x-\mu)'\right]$ is a real-valued n×n matrix, whose jkth element is the covariance of the jth element of x with the kth element. Since the covariance of two real-valued random variables is symmetric, Σ must be a symmetric matrix. But a realistic Σ must have one other property, viz., non-negative definiteness (NND). This means that for every real-valued n×1 vector ξ, $\xi'\Sigma\xi \ge 0$.³ This must be true, because $\xi'\Sigma\xi$ is the variance of the real-valued random variable $\xi'x$:

$$\mathrm{Var}[\xi'x] = E\left[(\xi'x - \xi'\mu)(\xi'x - \xi'\mu)'\right] = E\left[\xi'(x-\mu)(x-\mu)'\xi\right] = \xi'\,E\left[(x-\mu)(x-\mu)'\right]\xi = \xi'\Sigma\xi$$

Although variances of real-valued random variables may be zero, they must not be negative. Now if $\xi'\Sigma\xi > 0$ for all $\xi \ne 0_{n\times 1}$, the variance Σ is said to be positive-definite (PD). Every invertible NND matrix must be PD. Moreover, every NND matrix may be expressed as the product of some real matrix and its transpose, the most common method for doing this being the Cholesky
² The representation of the complex scalar $z = x+iy$ as the real 2×2 matrix $\begin{bmatrix} x & -y \\ y & x \end{bmatrix}$ is a common theme in modern algebra (e.g., section 7.2 of the Wikipedia article "Complex number"). We have merely extended the representation to complex matrices. Our representation is even more meaningful when expressed in the Kronecker-product form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \otimes X + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \otimes Y$. Due to certain properties of the Kronecker product (cf. Judge [1988], Appendix A.15), all the analogies of this section would hold even in the commuted form $X \otimes \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + Y \otimes \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. In practical terms this means that it matters not whether the form is 2×2 of m×n or m×n of 2×2.
³ More accurately, $\xi'\Sigma\xi \ge [0]_{1\times 1}$, since the quadratic form $\xi'\Sigma\xi$ is a 1×1 matrix. The relevant point is that 1×1 real-valued matrices are as orderable as their real-valued elements are. Appendices A.13 and A.14 of Judge [1988] provide introductions to quadratic forms and definiteness that are sufficient to prove the theorems used herein.


decomposition (Healy [1986], §7.2). Accordingly, if Σ is NND, then $A'\Sigma A$ is NND for any conformable real-valued matrix A. Finally, if Σ is PD and the real-valued n×r matrix A is of full column rank, i.e., $\mathrm{rank}(A_{n\times r}) = r$, then the r×r matrix $A'\Sigma A$ is PD.

In the remainder of this section we will show how the analogy between $X+iY$ and $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ leads to a proper definition of the variance of a complex random vector. We start by considering $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ as a real-valued variance matrix. In order to be so, first it must be symmetric:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}' = \begin{bmatrix} X' & Y' \\ -Y' & X' \end{bmatrix}$$

Hence, $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is symmetric if and only if $X' = X$ and $Y' = -Y$. In words, X is symmetric and Y is skew-symmetric. Clearly, the main diagonal of a skew-symmetric matrix must be zero. But of greater significance, if a and b are real-valued n×1 vectors:

$$a'Yb = (a'Yb)_{1\times 1} = (a'Yb)' = b'Y'a = b'(-Y)a = -b'Ya$$

Consequently, if $b = a$:

$$a'Ya = \frac{a'Ya + a'Ya}{2} = \frac{a'Ya + (-a'Ya)}{2} = 0_{1\times 1}$$

Next, considering the specifications on X, Y, a, and b, we evaluate the 2×2 quadratic form:

$$\begin{aligned}
\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}
&= \begin{bmatrix} a' & b' \\ -b' & a' \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \\
&= \begin{bmatrix} a'X + b'Y & b'X - a'Y \\ -b'X + a'Y & a'X + b'Y \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & -a'Xb - b'Yb + b'Xa - a'Ya \\ -b'Xa + a'Ya + a'Xb + b'Yb & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & -a'Xb - 0 + b'Xa - 0 \\ -b'Xa + 0 + a'Xb + 0 & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & -a'Xb + b'Xa \\ -b'Xa + a'Xb & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & 0 \\ 0 & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} & 0 \\ 0 & \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} \end{bmatrix}
\end{aligned}$$

Therefore, $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is PD [or NND] if and only if $\begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix}$ is PD [or NND].
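The collapse of the 2×2 quadratic form to a scalar multiple of $I_2$ can be confirmed numerically. A NumPy sketch (an illustration, not from the paper), with X symmetric and Y skew-symmetric:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
S = rng.standard_normal((n, n))
X = S + S.T                      # symmetric real part
K = rng.standard_normal((n, n))
Y = K - K.T                      # skew-symmetric imaginary part
M = np.block([[X, -Y], [Y, X]])

a = rng.standard_normal((n, 1))
b = rng.standard_normal((n, 1))
AB = np.block([[a, -b], [b, a]])     # 2n x 2 double-real analogue of a + ib
Q = AB.T @ M @ AB                    # the 2 x 2 quadratic form

q = float(np.concatenate([a, b]).T @ M @ np.concatenate([a, b]))
print(np.allclose(Q, q * np.eye(2)))  # True: diagonal with equal entries
```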

Now the double-real 2n×2 matrix $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is analogous with the n×1 complex vector $a+ib$. Its transpose $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}' = \begin{bmatrix} a' & b' \\ -b' & a' \end{bmatrix}$ is analogous with the 1×n complex vector $a' - ib'$. Moreover, $a' - ib' = (a - ib)' = \overline{(a+ib)}\,' = (a+ib)^*$, where '*' is the combined operation of


transposition and conjugation (order irrelevant).⁴ And $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is analogous with the n×n complex matrix $X+iY$. Accordingly, the complex analogue of the double-real quadratic form $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is $(a+ib)^*(X+iY)(a+ib)$. Moreover, since $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is symmetric, $(X+iY)^* = X' - iY' = X - i(-Y) = X+iY$. A matrix equal to its transposed conjugate is said to be Hermitian: matrix Γ is Hermitian if and only if $\Gamma^* = \Gamma$. Therefore, $\Gamma_{n\times n} = X+iY$ is the variance matrix of some complex random vector $z_{n\times 1} = x+iy$ if and only if Γ is Hermitian and $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is non-negative-definite.⁵

Because $(a+ib)^*(X+iY)(a+ib) = \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} + i\cdot 0 = \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix}$, the definiteness of $\Gamma = X+iY$ is the same as the definiteness of $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$. Therefore, a matrix qualifies as the variance matrix of some complex random vector if and only if it is Hermitian and NND. Just as the variance matrix of a real-valued random vector factors as $\Sigma = A'A_{n\times n}$ for some real-valued A, so too the variance matrix of a complex random vector factors as $\Gamma = A^*A_{n\times n}$ for some complex A.

Likewise, every invertible Hermitian NND matrix must be PD. Due to the skew symmetry of their
⁴ The transposed conjugate is sometimes called the "transjugate," which in linear algebra is commonly symbolized with the asterisk. Physicists prefer the "dagger" notation A†, though the physicist Hermann Weyl [1950, p. 17] called it the "Hermitian conjugate" and symbolized it as Ã.
⁵ It is superfluous to add 'symmetric' here. For $\Gamma = X+iY$ is Hermitian if and only if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is symmetric.


imaginary parts, the main diagonals of Hermitian matrices must be real-valued. If the matrices are NND [or PD], all the elements of their main diagonals must be non-negative [or positive].

Let Γ represent the variance of the complex random vector z. Its jkth element represents the covariance of the jth element of z with the kth element. Since Γ is Hermitian, $\gamma_{kj} = [\Gamma]_{kj} = [\Gamma^*]_{kj} = [\overline{\Gamma}\,']_{kj} = [\overline{\Gamma}]_{jk} = \overline{[\Gamma]_{jk}} = \overline{\gamma_{jk}}$. Because of this, it is fitting and natural to define the variance of a complex random vector as:


$$\Gamma = \mathrm{Var}[z] = E\left[(z-\mu)\overline{(z-\mu)}\,'\right] = E\left[(z-\mu)(z-\mu)^*\right]$$

The complex formula is like the real formula except that the second factor in the expectation is transjugated, not simply transposed. This renders Γ Hermitian, since:

$$\Gamma^* = E\left[(z-\mu)(z-\mu)^*\right]^* = E\left[\left\{(z-\mu)(z-\mu)^*\right\}^*\right] = E\left[(z-\mu)(z-\mu)^*\right] = \Gamma$$

It also renders Γ NND. For since $(z-\mu)(z-\mu)^*$ is NND, its expectation over the probability distribution of z must also be so. Usually Γ is PD, in which case $\Gamma^{-1}$ exists.
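Both characterizations of a valid complex variance, Hermitian-and-NND and the NND double-real analogue, can be checked at once. A NumPy sketch (an illustration, not from the paper), building Γ from the factorization $\Gamma = A^*A$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Gamma = A.conj().T @ A                      # Gamma = A* A is Hermitian and NND

print(np.allclose(Gamma, Gamma.conj().T))   # True: Hermitian
print(np.linalg.eigvalsh(Gamma).min() >= -1e-9)     # True: non-negative eigenvalues

# Its double-real analogue [[X, -Y], [Y, X]] is symmetric and NND as well.
X, Y = Gamma.real, Gamma.imag
M = np.block([[X, -Y], [Y, X]])
print(np.allclose(M, M.T))                  # True
print(np.linalg.eigvalsh(M).min() >= -1e-9)         # True
```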

5. THE EXPECTATION OF THE STANDARD QUADRATIC FORM


The most common quadratic form in $z_{n\times 1}$ involves the variance of the complex random vector, viz., $(z-\mu)^*\Gamma^{-1}(z-\mu)$, where $\Gamma = \mathrm{Var}[z]$. The expectation of this quadratic form equals n, the rank of the variance. The following proof uses the trace function. The trace of a matrix is the sum of its main-diagonal elements, and if A and B are conformable, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Moreover, the trace of the expectation equals the expectation of the trace. Consequently:


$$\begin{aligned}
E\left[(z-\mu)^*\Gamma^{-1}(z-\mu)\right] &= E\left[\mathrm{tr}\left((z-\mu)^*\Gamma^{-1}(z-\mu)\right)\right] \\
&= E\left[\mathrm{tr}\left(\Gamma^{-1}(z-\mu)(z-\mu)^*\right)\right] \\
&= \mathrm{tr}\left(E\left[\Gamma^{-1}(z-\mu)(z-\mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}\,E\left[(z-\mu)(z-\mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}\Gamma\right) \\
&= \mathrm{tr}(I_n) \\
&= n
\end{aligned}$$

The analogies above between complex and double-real matrices might suggest the result to be 2n. However, for real-valued random variables $E\left[(x-\mu)'\Sigma^{-1}(x-\mu)\right] = n$, and the complex case is a superset of the real. So by extension, the complex case must be the same.
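A Monte Carlo sketch (NumPy; an illustration, not from the paper) supports the result. It draws proper complex normal vectors with an arbitrary Hermitian PD variance, a construction that anticipates Sections 5-7:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 3, 200_000
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Gamma = A.conj().T @ A + np.eye(n)       # a Hermitian PD variance

# Scale standard complex normals (real and imaginary parts N(0, 1/2))
# by a Cholesky factor L with L L* = Gamma.
L = np.linalg.cholesky(Gamma)
z0 = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
z = L @ z0

invGamma = np.linalg.inv(Gamma)
q = np.real(np.sum(z.conj() * (invGamma @ z), axis=0))  # (z* Gamma^{-1} z) per draw
print(float(q.mean()))   # approximately 3, the value of n
```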

But an insight is available into why the value is n, rather than 2n. Let x and y be n×1 real-valued random vectors. Assume their means to be zero, and their variances to be identity matrices (so zero covariance):

$$\begin{bmatrix} x \\ y \end{bmatrix} \sim \left( \mu = 0_{2n\times 1},\; \Sigma = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix} \right)$$

The quadratic form is:

$$\begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x' & y' \end{bmatrix}\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}^{-1}\begin{bmatrix} x \\ y \end{bmatrix} = x'x + y'y = \sum_{j=1}^{n}\left( \frac{x_j^2}{1} + \frac{y_j^2}{1} \right)$$

Since the elements have unit variances, the expectation is:

$$E\left[ \begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix} \right] = E\left[ \sum_{j=1}^{n}\left( \frac{x_j^2}{1} + \frac{y_j^2}{1} \right) \right] = \sum_{j=1}^{n}\left( \frac{E\left[x_j^2\right]}{1} + \frac{E\left[y_j^2\right]}{1} \right) = 2n$$

Now let z be the n×1 complex random vector $x+iy$. Since $E[x+iy] = 0_{n\times 1}$, the variance of z is:

$$\begin{aligned}
\Gamma = \mathrm{Var}[z] &= \mathrm{Var}[x+iy] \\
&= \mathrm{Var}\left[ \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \right] \\
&= E\left[ \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}\left( \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \right)^{\!*} \right] \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix} E\left[ \begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}' \right]\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\mathrm{Var}\begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= I_n - i^2 I_n \\
&= 2I_n
\end{aligned}$$

The complex quadratic form is:

$$z^*\Gamma^{-1}z = z^*(2I_n)^{-1}z = \sum_{j=1}^{n}\frac{\bar{z}_j z_j}{2} = \sum_{j=1}^{n}\frac{(x_j - iy_j)(x_j + iy_j)}{2} = \sum_{j=1}^{n}\frac{x_j^2 + y_j^2}{1+1} = \frac{1}{2}\begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix}$$

The complex form is half the real-valued form; hence, its expectation equals n. The condensation of the 2n real dimensions into n complex ones inverts the order of operations:

$$\sum_{j=1}^{n}\left( \frac{x_j^2}{1} + \frac{y_j^2}{1} \right) \;\Rightarrow\; \sum_{j=1}^{n}\frac{x_j^2 + y_j^2}{1+1}$$


Within the sigma operator, the sum of two quotients becomes the quotient of two sums. A proof for general variance Γ involves diagonalizing Γ, i.e., the fact that Γ can be eigen-decomposed as $\Gamma = W\Lambda W^*$, where Λ is diagonal and $WW^* = W^*W = I_n$.⁶

At this point we can derive the standard complex normal distribution. The normal density is $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. The standard complex normal random variable is formed from two independent real normal variables whose means equal zero and whose variances equal one half:

$$f_Z(z) = \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(x-0)^2}{2(1/2)}} \cdot \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(y-0)^2}{2(1/2)}} = \frac{1}{\sqrt{\pi}}\,e^{-x^2} \cdot \frac{1}{\sqrt{\pi}}\,e^{-y^2} = \frac{1}{\pi}\,e^{-(x^2+y^2)} = \frac{1}{\pi}\,e^{-\bar{z}z}$$

The distribution of the n×1 standard complex normal random vector is $f_{\mathbf{z}}(z) = \frac{1}{\pi^n}\,e^{-z^*z}$. A vector so distributed has mean $E[z] = 0_{n\times 1}$ and variance $\mathrm{Var}[z] = E[zz^*] = I_n$.
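The half-unit construction can be checked by simulation. A NumPy sketch (an illustration, not from the paper) samples the standard complex normal vector and confirms its variance; it also previews Section 6 by confirming that $E[zz']$ vanishes:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 2, 500_000
# Standard complex normal: real and imaginary parts independent N(0, 1/2).
z = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)

Var = (z @ z.conj().T) / m          # sample E[z z*]
C = (z @ z.T) / m                   # sample E[z z']
print(np.allclose(Var, np.eye(n), atol=0.01))   # True: variance is I_n
print(np.allclose(C, 0, atol=0.01))             # True: E[z z'] vanishes
```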

6. COMPLEX VARIANCE, PSEUDOVARIANCE, AND AUGMENTED VARIANCE
Section 4 justified the definition of the variance of a complex random vector as:

$$\Gamma = \mathrm{Var}[z] = E\left[(z-\mu)\overline{(z-\mu)}\,'\right] = E\left[(z-\mu)(z-\mu)^*\right]$$

The naïve formula differs from this by one critical symbol (prime versus asterisk):

$$C = E\left[(z-\mu)(z-\mu)'\right]$$

⁶ Cf. Appendix C for eigen-decomposition and diagonalization. We believe the insight about commuting sums and quotients to be valuable as an abstraction. But of course, a vector of n independent complex random variables of unit variance translates into a vector of 2n independent real random variables of half-unit variance, and $\sum_{j=1}^{2n}\frac{1}{2} = n$. Because of the half-unit real variance, the formula in the next paragraph for the standard complex normal distribution, lacking any factors of two, is simpler than the formula for the standard real normal distribution.


This naïveté leads many to conclude that $\mathrm{Var}[iz] = i^2\mathrm{Var}[z] = -\mathrm{Var}[z]$, whereas it is actually:⁷

$$\mathrm{Var}[iz] = E\left[i(z-\mu)\left\{i(z-\mu)\right\}^*\right] = E\left[i(z-\mu)(z-\mu)^*\bar{i}\,\right] = i\bar{i}\,E\left[(z-\mu)(z-\mu)^*\right] = i(-i)\mathrm{Var}[z] = \mathrm{Var}[z]$$

Nevertheless, there is a role for the naïve formula, which reduces to:

$$C = E\left[(z-\mu)(z-\mu)'\right] = E\left[(z-\mu)\overline{(\bar{z}-\bar{\mu})}\,'\right] = E\left[(z-\mu)(\bar{z}-\bar{\mu})^*\right] = \mathrm{Cov}[z, \bar{z}]$$

Veeravalli [2006], whose notation we follow, calls C the "relation matrix." The Wikipedia article "Complex normal distribution" calls it the "pseudocovariance matrix." Because of the naïveté that leads many to a false conclusion, we prefer the 'pseudo' terminology (better, "pseudovariance") to something as bland as "relation matrix." However, a useful and non-pejorative concept is what we will call the "augmented variance."

The augmented variance is the variance of the complex random vector z augmented with its conjugate $\bar{z}$, i.e., the 2n×1 vector $\begin{bmatrix} z \\ \bar{z} \end{bmatrix}$. Its expectation is $E\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} E[z] \\ \overline{E[z]} \end{bmatrix} = \begin{bmatrix} \mu \\ \bar{\mu} \end{bmatrix}$. And its variance is (for brevity we ignore the mean):

$$\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = E\left[\begin{bmatrix} z \\ \bar{z} \end{bmatrix}\begin{bmatrix} z \\ \bar{z} \end{bmatrix}^{*}\right] = E\left[\begin{bmatrix} z \\ \bar{z} \end{bmatrix}\begin{bmatrix} z^{*} & z' \end{bmatrix}\right] = \begin{bmatrix} \mathrm{Cov}[z,z] & \mathrm{Cov}[z,\bar{z}] \\ \mathrm{Cov}[\bar{z},z] & \mathrm{Cov}[\bar{z},\bar{z}] \end{bmatrix}$$

In two ways this matrix is redundant. First, $\mathrm{Cov}[\bar{z},\bar{z}] = E[\bar{z}z'] = \overline{E[z\bar{z}']} = \overline{\mathrm{Cov}[z,z]}$; equivalently, $\mathrm{Var}[\bar{z}] = \overline{\mathrm{Var}[z]} = \overline{\Gamma}$. And second, $\mathrm{Cov}[\bar{z},z] = E[\bar{z}\bar{z}'] = \overline{E[zz']} = \overline{\mathrm{Cov}[z,\bar{z}]} = \overline{C}$. Therefore:

$$\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} \mathrm{Cov}[z,z] & \mathrm{Cov}[z,\bar{z}] \\ \mathrm{Cov}[\bar{z},z] & \mathrm{Cov}[\bar{z},\bar{z}] \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}$$

⁷ In general, for any complex scalar α, $\mathrm{Var}[\alpha z] = \alpha\bar{\alpha}\,\mathrm{Var}[z]$.


As with any valid variance matrix, the augmented variance must be Hermitian. Hence,

$$\begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}^{*} = \begin{bmatrix} \Gamma^{*} & C' \\ \overline{C}\,' & \overline{\Gamma}\,' \end{bmatrix}$$

from which follow $\Gamma^{*} = \Gamma$ and $C' = C$. Moreover, it must be at least NND, if not PD. It is important to note from this that the pseudovariance is an essential part of the augmented z; it is possible for two random variables to have the same variance and to covary differently with their conjugates. How a complex random vector covaries with its conjugate is useful information; it is even a parameter of the general complex normal distribution, which we will treat next.
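The structure of the augmented variance can be exhibited with samples. A NumPy sketch (an illustration, not from the paper) builds an improper vector, so that the pseudovariance C does not vanish, and checks the properties just derived:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 2, 400_000
# An improper complex vector: correlate the real and imaginary parts.
x = rng.standard_normal((n, m))
y = 0.5 * x + rng.standard_normal((n, m))
z = x + 1j * y

Gamma = (z @ z.conj().T) / m        # sample variance  E[z z*]
C = (z @ z.T) / m                   # sample pseudovariance E[z z']
aug = np.block([[Gamma, C], [C.conj(), Gamma.conj()]])

print(np.abs(C).max() > 0.1)                 # True: z is improper, C != 0
print(np.allclose(C, C.T))                   # True: C symmetric
print(np.allclose(aug, aug.conj().T))        # True: augmented variance Hermitian
print(np.linalg.eigvalsh(aug).min() > -1e-8) # True: and NND
```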

7. THE COMPLEX NORMAL DISTRIBUTION


All the information for deriving the complex normal distribution of $z_{n\times 1} = x+iy$ is contained in the parameters of the real-valued multivariate normal distribution:

$$\begin{bmatrix} x \\ y \end{bmatrix}_{2n\times 1} \sim N\left( \mu_{2n\times 1} = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \Sigma_{2n\times 2n} = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$

According to this variance structure, the real and imaginary parts of z may covary, as long as the covariance is symmetric: $\Sigma_{yx} = \Sigma_{xy}'$. The grand Σ matrix must be symmetric and PD. The probability density function of this multivariate normal is:⁸

$$f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x,y) = \frac{1}{\sqrt{(2\pi)^{2n}\,|\Sigma|}}\; e^{-\frac{1}{2}\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} = \frac{1}{\pi^n\sqrt{2^{2n}\,|\Sigma|}}\; e^{-\frac{1}{2}\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}}$$

⁸ As derived briefly by Judge [1988, pp 49f]. Chapter 4 of Johnson [1992] is thorough. To be precise, $|\Sigma|$ under the radical should be the absolute value of the determinant of Σ. However, the determinant of a PD matrix must be positive (cf. Judge [1988, A.14(1)]).

Since $z = x+iy$, the augmented vector is $\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} x+iy \\ x-iy \end{bmatrix} = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \Xi_n\begin{bmatrix} x \\ y \end{bmatrix}$. We will call $\Xi_n = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}$ the augmentation matrix; this linear function of the real-valued vectors produces the complex vector and its conjugate. An important equation is:

$$\Xi_n \Xi_n^{*} = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} = \begin{bmatrix} 2I_n & 0 \\ 0 & 2I_n \end{bmatrix} = 2I_{2n}$$

Therefore, $\Xi_n$ has an inverse, viz., one half of its transjugate.

The augmented mean is $E\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \Xi_n\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} = \begin{bmatrix} \mu_x + i\mu_y \\ \mu_x - i\mu_y \end{bmatrix} = \begin{bmatrix} \mu \\ \bar{\mu} \end{bmatrix}$. The augmented variance is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix} &= \Xi_n \Sigma\, \Xi_n^{*} \\
&= \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} \\
&= \begin{bmatrix} \Sigma_{xx} + i\Sigma_{yx} & \Sigma_{xy} + i\Sigma_{yy} \\ \Sigma_{xx} - i\Sigma_{yx} & \Sigma_{xy} - i\Sigma_{yy} \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} \\
&= \begin{bmatrix} \Sigma_{xx} + \Sigma_{yy} - i(\Sigma_{xy} - \Sigma_{yx}) & \Sigma_{xx} - \Sigma_{yy} + i(\Sigma_{xy} + \Sigma_{yx}) \\ \Sigma_{xx} - \Sigma_{yy} - i(\Sigma_{xy} + \Sigma_{yx}) & \Sigma_{xx} + \Sigma_{yy} + i(\Sigma_{xy} - \Sigma_{yx}) \end{bmatrix} \\
&= \begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}
\end{aligned}$$

And so:

$$\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix}^{-1} = \begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}^{-1} = \left(\Xi_n \Sigma\, \Xi_n^{*}\right)^{-1} = \left(\Xi_n^{*}\right)^{-1}\Sigma^{-1}\left(\Xi_n\right)^{-1}$$

This can be reformulated as $\Xi_n^{*}\,\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix}^{-1}\Xi_n = \Xi_n^{*}\begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}^{-1}\Xi_n = \Sigma^{-1}$.
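The block algebra above can be confirmed numerically. A NumPy sketch (an illustration, not from the paper) computes the augmented variance from a random grand Σ and compares its blocks against the displayed formulas:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2
# A random symmetric PD grand covariance for [x; y].
S = rng.standard_normal((2 * n, 2 * n))
Sigma = S @ S.T + np.eye(2 * n)
Sxx, Sxy = Sigma[:n, :n], Sigma[:n, n:]
Syx, Syy = Sigma[n:, :n], Sigma[n:, n:]

I = np.eye(n)
Xi = np.block([[I, 1j * I], [I, -1j * I]])      # augmentation matrix
aug = Xi @ Sigma @ Xi.conj().T                  # [[Gamma, C], [conj(C), conj(Gamma)]]

Gamma = Sxx + Syy - 1j * (Sxy - Syx)
C = Sxx - Syy + 1j * (Sxy + Syx)
print(np.allclose(aug[:n, :n], Gamma))          # True
print(np.allclose(aug[:n, n:], C))              # True
print(np.allclose(Xi @ Xi.conj().T, 2 * np.eye(2 * n)))  # True: Xi Xi* = 2 I
```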


We now work these augmented forms into the probability density function:

$$\begin{aligned}
f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x,y) &= \frac{1}{\pi^n\sqrt{2^{2n}\,|\Sigma|}}\; e^{-\frac{1}{2}\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|2I_{2n}|\,|\Sigma|}}\; e^{-\frac{1}{2}\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Xi_n^{*}\,\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|\Xi_n\Xi_n^{*}|\,|\Sigma|}}\; e^{-\frac{1}{2}\left(\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}\right)^{\!*}\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\left(\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}\right)} \\
&= \frac{1}{\pi^n\sqrt{|\Xi_n\Sigma\,\Xi_n^{*}|}}\; e^{-\frac{1}{2}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}^{*}\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]\right|}}\; e^{-\frac{1}{2}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}^{*}\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}}
\end{aligned}$$

However, this is not quite the density function of z, since the differential volume has not been considered. The correct formula is $f_{\mathbf{z}}(z)\,dV_z = f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x,y)\,dV_{xy}$. The differential volume in the xy coordinates is $dV_{xy} = \prod_{j=1}^{n} dx_j\,dy_j$. A change of $dx_j$ entails an equal change in the real part of $dz_j$, even as a change of $dy_j$ entails an equal change in the imaginary part of $dz_j$. Accordingly, $dV_z = \left|\prod_{j=1}^{n} dx_j\,(i\cdot dy_j)\right| = \left|i^n\right|\prod_{j=1}^{n} dx_j\,dy_j = 1\cdot\prod_{j=1}^{n} dx_j\,dy_j = dV_{xy}$. It so happens that $\Xi_n$ does not distort volume; but this had to be demonstrated.⁹

So finally, the probability density function of the complex random vector z is:

9 This will be abstruse to some actuaries. However, the integration theory is implicit in the change-of-variables
technique outlined in Hogg [1984, pp 42-46]. That the n×n determinant represents “the volume function of an n-
dimensional parallelepiped” is beautifully explained in Chapter 4 of Schneider [1973].


$$f_{\mathbf{z}}(z) = f_{\mathbf{z}}(z)\cdot 1 = f_{\mathbf{z}}(z)\,\frac{dV_z}{dV_{xy}} = f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x,y) = \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]\right|}}\; e^{-\frac{1}{2}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}^{*}\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}} = \frac{1}{\pi^n\sqrt{\left|\begin{smallmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{smallmatrix}\right|}}\; e^{-\frac{1}{2}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}^{*}\begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar{z}-\bar{\mu} \end{bmatrix}}$$

This formula is equivalent to the one found in the Wikipedia article "Complex normal distribution." Although $|\Gamma|\left|\overline{\Gamma} - \overline{C}\,\Gamma^{-1}C\right|$ appears within the radical of that article's formula, it can be shown that $\left|\begin{smallmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{smallmatrix}\right| = |\Gamma|\left|\overline{\Gamma} - \overline{C}\,\Gamma^{-1}C\right|$. As far as allowable parameters are concerned, µ may be any complex vector. $\begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}$ is allowed if and only if $\Xi_n^{*}\begin{bmatrix} \Gamma & C \\ \overline{C} & \overline{\Gamma} \end{bmatrix}\Xi_n = 4\Sigma$ is real-valued and PD.

Veeravalli [2006] defines a "proper" complex variable as one whose pseudo[co]variance matrix is $0_{n\times n}$. Inserting zero for C into the formula, we derive the probability density function of a proper complex random variable whose variance is Γ:


−1

1
[ ]
 Γ C   z −μ 
z′ − μ ′ z′ − μ ′    
f z (z ) =
1 2  C Γ   z −μ 
e
Γ C
πn
C Γ
−1

1 −
1
[
z′ − μ ′ z′ − μ ′  ]
 Γ 0   z −μ 
  
 0 Γ   z −μ 
= e 2

Γ 0
πn
0 Γ
1
1
( ) −1
−  z′ −μ ′ Γ −1 ( z −μ )+ ( z′ −μ ′ )Γ z −μ  ( )
= e 2 

πn Γ Γ
1 −
1
{( ) ( )
z′ −μ ′ Γ −1 ( z −μ )+ z′ −μ ′ Γ −1 ( z′ −μ ′ ) }
= e 2

πn Γ Γ
1 −
1
{( ) ( )
z′ −μ ′ Γ −1 ( z −μ )+ z′ −μ ′ Γ −1 ( z −μ ) }
= e 2

πn Γ
2

e −(z′−μ′ )Γ
1 −1
( z −μ )
=
π Γ
n

1
e − ( z −μ ) Γ ( z −μ )
* −1
=
π Γ
n

The transformations in the last several lines rely on the fact that Γ is Hermitian and PD. Now the

standard complex random vector is a proper complex random vector with mean zero and variance

$I_n$. Therefore, in confirmation of Section 5, its density function is $\frac{1}{\pi^n}e^{-z^*z}$.
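As a numerical aside (a sketch, not part of the paper), the standard complex normal scalar can be sampled as $z = x + iy$ with x and y independent N(0, ½), and the claimed moments checked by Monte Carlo: the variance $E[z\bar z]$ should be near one and the pseudo-variance $E[zz]$ near zero.

```python
import random

# Sketch (not from the paper): sample a scalar standard complex normal
# z = x + iy with x, y independent N(0, 1/2), so that E[z*conj(z)] = 1.
rng = random.Random(42)
n = 200_000
s = 0.5 ** 0.5                      # standard deviation of each real component

zs = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]

gamma = sum(z * z.conjugate() for z in zs).real / n   # variance Gamma: ~1
pseudo = sum(z * z for z in zs) / n                    # pseudo-variance C: ~0
```

The variable names `gamma` and `pseudo` are illustrative, chosen to echo the Γ and C of the text.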

8. THE COMPLEX LOGNORMAL RANDOM VECTOR AND ITS MOMENTS
A complex lognormal random vector is the elementwise exponentiation of a complex normal

random vector: $w_{n\times1} = e^{z_{n\times1}}$. Its conjugate also is lognormal, since $\bar w = \overline{e^z} = e^{\bar z}$. Deriving the

probability density function of w is precluded by the fact that $e : z \to w$ is many-to-one.

Specifically, $e^z = w = e^{z+i(2\pi k)}$ for any integer k. So unlike the real-valued lognormal random

variable, whose density function can be found in Klugman [1998, §A.4.11], an analytic form for the

complex lognormal density is not available. However, even for the real-valued lognormal the

density function is of little value; its moments are commonly derived from the moment generating

function of the normal variable on which it is based. So too, the moment generating function of the

complex normal random vector is available for deriving the lognormal moments.

We hereby define the moment generating function of the complex n×1 random vector z as

$M_z(s_{n\times1}, t_{n\times1}) = E\left[e^{s'z + t'\bar z}\right]$. Since this definition may differ from other definitions in the sparse

literature, we should justify it. First, because we will take derivatives of this function with respect to

s and t, the function must be differentiable. This demands simple transposition in the linear

combination, i.e., $s'z + t'\bar z$ rather than the transjugation $s^*z + t^*\bar z$. For transjugation would involve

derivatives of the form $\frac{d\bar s}{ds}$, which do not exist, as they violate the Cauchy-Riemann condition. 10

Second, even though moments of $\bar z$ are conjugates of moments of z, we will need second-order

moments involving both z and $\bar z$. For this reason both terms must be in the exponent of the

moment generating function.

10 Cf. Appendix D.1.3 of Havil [2003]. Express $f(z = x + iy)$ in terms of real-valued functions, i.e., as
$u(x, y) + i\cdot v(x, y)$. The derivative is based on the matrix of real-valued partial derivatives $\begin{bmatrix}\partial u/\partial x & \partial v/\partial x\\ \partial u/\partial y & \partial v/\partial y\end{bmatrix}$.
For the derivative to be the same in both directions, the Cauchy-Riemann condition must hold, viz., that
$\partial u/\partial x = \partial v/\partial y$ and $\partial v/\partial x = -\partial u/\partial y$. But for $f(z) = \bar z = x - iy$, the partial-derivative matrix is $\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}$;
hence $\partial u/\partial x \ne \partial v/\partial y$. The Cauchy-Riemann condition becomes intuitive when one regards a valid complex
derivative as a double-real 2×2 matrix (Section 3). Compare this with $f(z) = z = x + iy$, whose matrix is $\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}$,
which represents the complex number 1.
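The footnote’s point is easy to see numerically (an illustrative sketch): the difference quotient of $f(z) = \bar z$ gives +1 along the real axis but −1 along the imaginary axis, so no complex derivative exists, whereas $f(z) = z$ gives 1 in both directions.

```python
# Sketch of the footnote's point: the difference quotient of f(z) = conj(z)
# depends on the direction of the complex step h, so no derivative exists.
def diff_quotient(f, z, h):
    return (f(z + h) - f(z)) / h

z0 = 1 + 2j
h = 1e-6

conj_real_dir = diff_quotient(lambda z: z.conjugate(), z0, h)        # +1
conj_imag_dir = diff_quotient(lambda z: z.conjugate(), z0, h * 1j)   # -1

id_real_dir = diff_quotient(lambda z: z, z0, h)       # 1 in every direction,
id_imag_dir = diff_quotient(lambda z: z, z0, h * 1j)  # as for any analytic f
```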

 z   x + iy   I n iI n   x  x
We start with terminology from Section 7, viz., that   =  =    = Ξ n   and
 z   x − iy   I n − iI n   y  y 

x  μ x  Σ xx Σ xy  
that   ~ N  μ 2 n×1 =  , Σ 2 n×2 n =  . According to Appendix A, the moment
y  
 μ y  Σ yx Σ yy  

x
generating function of the real-valued normal random vector   is:
y 

μ x  1  Σ xx Σ xy   a 
 a    [a ′ x
b′ ]   [a ′ b′ ]  + [a ′ b′ ]
μ

Σ yy   b 
 Σ yx
M  x      = E e y  =e  y 2

   b  
y 
 

Consequently:

$$\begin{aligned}
M_z(s, t) &= E\left[e^{s'z + t'\bar z}\right] = E\!\left[e^{[s'\ \ t']\begin{bmatrix}I_n & iI_n\\ I_n & -iI_n\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}}\right] = E\!\left[e^{[s'+t'\ \ \ i(s'-t')]\begin{bmatrix}x\\ y\end{bmatrix}}\right] \\
&= M_{\begin{bmatrix}x\\ y\end{bmatrix}}\!\left(\begin{bmatrix}s+t\\ i(s-t)\end{bmatrix}\right) \\
&= e^{[(s+t)'\ \ i(s-t)']\begin{bmatrix}\mu_x\\ \mu_y\end{bmatrix} + \frac12[(s+t)'\ \ i(s-t)']\begin{bmatrix}\Sigma_{xx} & \Sigma_{xy}\\ \Sigma_{yx} & \Sigma_{yy}\end{bmatrix}\begin{bmatrix}s+t\\ i(s-t)\end{bmatrix}}
\end{aligned}$$

It is so that we could invoke it here that Appendix B went to the trouble of proving that complex

values are allowed in this moment generating function.

But in two ways we can simplify this expression. First:

$$[(s+t)'\ \ i(s-t)']\begin{bmatrix}\mu_x\\ \mu_y\end{bmatrix} = (s+t)'\mu_x + i(s-t)'\mu_y = s'(\mu_x + i\mu_y) + t'(\mu_x - i\mu_y) = s'\mu_z + t'\bar\mu_z$$
 

z   Γ C Σ xx Σ xy  *
And second, again from Section 7, Var   =  = Ξ n Σ 2 n×2 n Ξ *n = Ξ n  Ξ , or

z  C Γ  Σ yx Σ yy  n

Ξ *n  Γ C Ξ n Σ xx Σ xy 
equivalently,  C Γ  2 = Σ Σ yy 
. Hence:
2    yx

(s + t )′ ′ Σ xx Σ xy   s + t   ′ ′ Ξ  Γ C Ξ n  s + t 
*
i (s − t )      = (s + t ) i (s − t )  n  C Γ  2 i (s − t )
  Σ yx
 Σ yy  i (s − t )   2    

Ξn  s + t  1 I n iI n   s + t  1 (s + t ) − (s − t )  t 
On the right side, i (s − t ) = 2 I = =
− iI n  i (s − t ) 2 (s + t ) + (s − t ) s 
. And on the
2    n

′ ′ Ξ ′ ′  I In 
*
left side, (s + t ) i (s − t )  n = (s + t ) i (s − t )   n = [s ′ t ′] . So:
1
  2 2  − iI n iI n 

(s + t )′ ′ Σ xx Σ xy   s + t   Γ C  t 
i (s − t )      = [s ′ t ′]  
  Σ yx
 Σ yy  i (s − t )  C Γ  s 

And the simplified expression, based on the mean and the augmented variance of z, is:

$$M_z(s, t) = E\left[e^{s'z + t'\bar z}\right] = e^{s'\mu + t'\bar\mu + \frac12[s'\ \ t']\begin{bmatrix}\Gamma & C\\ \bar C & \bar\Gamma\end{bmatrix}\begin{bmatrix}t\\ s\end{bmatrix}} = e^{s'\mu + t'\bar\mu + \frac12\left(s'\Gamma t + s'Cs + t'\bar C t + t'\bar\Gamma s\right)}$$

As in Appendix A, let $e_j$ denote the jth unit vector of $\Re^n$, or even better, of $C^n$. Then:

$$E\left[e^{z_j}\right] = M_z(e_j, 0) = e^{\mu_j + \frac12 C_{jj}}$$

$$E\left[e^{\bar z_j}\right] = E\left[\overline{e^{z_j}}\right] = M_z(0, e_j) = e^{\bar\mu_j + \frac12\bar C_{jj}} = \overline{E\left[e^{z_j}\right]}$$

Moreover:

$$E\left[e^{z_j}e^{z_k}\right] = E\left[e^{z_j + z_k}\right] = M_z(e_j + e_k, 0) = e^{\mu_j + \mu_k + \frac12\left(C_{jj} + C_{jk} + C_{kj} + C_{kk}\right)}$$

According to Section 6, C is symmetric ($C' = C$). This and further simplification leads to:

$$E\left[e^{z_j}e^{z_k}\right] = e^{\mu_j + \mu_k + \frac12\left(C_{jj} + C_{jk} + C_{kj} + C_{kk}\right)} = e^{\left(\mu_j + \frac12 C_{jj}\right) + \left(\mu_k + \frac12 C_{kk}\right) + \frac12 C_{jk} + \frac12 C_{jk}} = E\left[e^{z_j}\right]E\left[e^{z_k}\right]\cdot e^{C_{jk}}$$
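For a scalar $z = X + iY$ with X and Y independent and zero means (so µ = 0 and $C = \sigma^2 - \tau^2$), setting j = k reduces this relation to $E[e^{2z}] = E[e^z]^2 e^C$; a Monte Carlo sketch with illustrative parameters (not from the paper) confirms it:

```python
import cmath, math, random

# Sketch with illustrative parameters: X ~ N(0, 0.25), Y ~ N(0, 0.09) independent,
# so C = 0.25 - 0.09 = 0.16 and E[e^(2z)] should equal (e^(C/2))^2 * e^C = e^(2C).
sigma2, tau2 = 0.25, 0.09
c = sigma2 - tau2
rng = random.Random(3)
n = 400_000

total = 0 + 0j
for _ in range(n):
    z = complex(rng.gauss(0, math.sqrt(sigma2)), rng.gauss(0, math.sqrt(tau2)))
    total += cmath.exp(2 * z)

mc = total / n
theory = math.exp(c / 2) ** 2 * math.exp(c)   # E[e^z]^2 * e^C = e^(2C)
```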

Hence, mindful of the transjugation (*) in the definition of complex covariance, we have:

$$\mathrm{Cov}\left[e^{z_j}, e^{\bar z_k}\right] = E\left[e^{z_j}\overline{e^{\bar z_k}}\right] - E\left[e^{z_j}\right]\overline{E\left[e^{\bar z_k}\right]} = E\left[e^{z_j}e^{z_k}\right] - E\left[e^{z_j}\right]E\left[e^{z_k}\right] = E\left[e^{z_j}\right]E\left[e^{z_k}\right]\left(e^{C_{jk}} - 1\right)$$

In terms of $w = e^z$ this translates as $\mathrm{Cov}[w, \bar w] = \left(E[w]E[w]'\right)\circ\left(e^C - 1_{n\times n}\right)$, in which the ‘◦’ operator

represents elementwise multiplication. 11 So too:

$$\mathrm{Cov}\left[e^{\bar z_j}, e^{z_k}\right] = \overline{\mathrm{Cov}\left[e^{z_j}, e^{\bar z_k}\right]} = E\left[e^{\bar z_j}\right]E\left[e^{\bar z_k}\right]\left(e^{\bar C_{jk}} - 1\right)$$

This translates as $\mathrm{Cov}[\bar w, w] = \left(E[\bar w]E[\bar w]'\right)\circ\left(e^{\bar C} - 1_{n\times n}\right)$.

The remaining combination is the mixed form $E\left[e^{z_j}\overline{e^{z_k}}\right]$:

$$E\left[e^{z_j}\overline{e^{z_k}}\right] = E\left[e^{z_j}e^{\bar z_k}\right] = E\left[e^{z_j + \bar z_k}\right] = M_z(e_j, e_k) = e^{\mu_j + \bar\mu_k + \frac12\left(\Gamma_{jk} + C_{jj} + \bar C_{kk} + \bar\Gamma_{kj}\right)}$$

Since Γ is Hermitian, $\bar\Gamma_{kj} = \left(\Gamma^*\right)_{jk} = \Gamma_{jk}$. Hence:

$$E\left[e^{z_j}e^{\bar z_k}\right] = e^{\mu_j + \bar\mu_k + \frac12\left(\Gamma_{jk} + C_{jj} + \bar C_{kk} + \bar\Gamma_{kj}\right)} = e^{\left(\mu_j + \frac12 C_{jj}\right) + \left(\bar\mu_k + \frac12\bar C_{kk}\right) + \frac12\Gamma_{jk} + \frac12\Gamma_{jk}} = E\left[e^{z_j}\right]E\left[e^{\bar z_k}\right]\cdot e^{\Gamma_{jk}}$$

Therefore, $\mathrm{Cov}\left[e^{z_j}, e^{z_k}\right] = E\left[e^{z_j}\overline{e^{z_k}}\right] - E\left[e^{z_j}\right]\overline{E\left[e^{z_k}\right]} = E\left[e^{z_j}\right]\overline{E\left[e^{z_k}\right]}\left(e^{\Gamma_{jk}} - 1\right)$, which translates as

$\mathrm{Cov}[w, w] = \left(E[w]E[\bar w]'\right)\circ\left(e^{\Gamma} - 1_{n\times n}\right)$. By conjugation, $\mathrm{Cov}\left[e^{\bar z_j}, e^{\bar z_k}\right] = E\left[e^{\bar z_j}\right]E\left[e^{z_k}\right]\left(e^{\bar\Gamma_{jk}} - 1\right)$,

which translates as $\mathrm{Cov}[\bar w, \bar w] = \left(E[\bar w]E[w]'\right)\circ\left(e^{\bar\Gamma} - 1_{n\times n}\right)$.

11Elementwise multiplication is formally known as the Hadamard, or Hadamard-Schur, product, of which we will make
use in Appendices A and C. Cf. Million [2007].

We conclude this section by expressing it all in terms of $w_{n\times1} = e^{z_{n\times1}}$. Let z be complex normal with

mean µ and augmented variance $\mathrm{Var}\begin{bmatrix}z\\ \bar z\end{bmatrix} = \begin{bmatrix}\Gamma & C\\ \bar C & \bar\Gamma\end{bmatrix}$. And let D be the n×1 vector consisting of the

main diagonal of C. Then $\bar D$ is the vectorization of the diagonal of $\bar C$. So the augmented mean of

w is $E\begin{bmatrix}w\\ \bar w\end{bmatrix} = \begin{bmatrix}e^{\mu + D/2}\\ e^{\bar\mu + \bar D/2}\end{bmatrix}$. And the augmented variance of w is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix}w\\ \bar w\end{bmatrix} &= \begin{bmatrix}\mathrm{Cov}[w, w] & \mathrm{Cov}[w, \bar w]\\ \mathrm{Cov}[\bar w, w] & \mathrm{Cov}[\bar w, \bar w]\end{bmatrix} \\
&= \begin{bmatrix}\left(E[w]E[\bar w]'\right)\circ\left(e^{\Gamma} - 1_{n\times n}\right) & \left(E[w]E[w]'\right)\circ\left(e^{C} - 1_{n\times n}\right)\\ \left(E[\bar w]E[\bar w]'\right)\circ\left(e^{\bar C} - 1_{n\times n}\right) & \left(E[\bar w]E[w]'\right)\circ\left(e^{\bar\Gamma} - 1_{n\times n}\right)\end{bmatrix} \\
&= \begin{bmatrix}E[w]E[\bar w]' & E[w]E[w]'\\ E[\bar w]E[\bar w]' & E[\bar w]E[w]'\end{bmatrix}\circ\left(e^{\begin{bmatrix}\Gamma & C\\ \bar C & \bar\Gamma\end{bmatrix}} - 1_{2n\times2n}\right) \\
&= \left(E\begin{bmatrix}w\\ \bar w\end{bmatrix}E\begin{bmatrix}w\\ \bar w\end{bmatrix}^*\right)\circ\left(e^{\mathrm{Var}\begin{bmatrix}z\\ \bar z\end{bmatrix}} - 1_{2n\times2n}\right)
\end{aligned}$$

Scaling all the lognormal means to unity (or setting $\mu = -D/2$), we can say that the coefficient-of-

lognormal-augmented-variation matrix equals $e^{\mathrm{Var}\begin{bmatrix}z\\ \bar z\end{bmatrix}} - 1_{2n\times2n}$, which is analogous with the well-

known coefficient of lognormal variation $\sqrt{e^{\sigma^2} - 1}$.

9. THE COMPLEX LOGNORMAL RANDOM VARIABLE


The previous section derived the augmented mean and variance of the lognormal random vector;

this section provides some intuition into them. The complex lognormal random variable, or scalar,

X   0 σ 2 ρστ  
derives from the real-valued normal bivariate   ~ N   ,   . Zero is not much of a
Y   0 ρστ τ 2  
   

restriction; since eCN (μ, V ) = eμ +CN (0, V ) = eμ  eCN (0, V ) , the normal mean affects only the scale of the

lognormal. The variance is written in correlation form, where − 1 ≤ ρ ≤ 1 . As usual, 0 < σ, τ < ∞ .

Z   X  1 i   X   X + iY 
Define   = Ξ1   =    =   . Its mean is zero, and according to Section 7 its
Z   Y  1 − i   Y   X − iY 

variance (the augmented variance) is:

 Z  1 i  σ 2 ρστ   1 1
Var   =   2  
 Z  1 − i  ρστ τ  − i i 
σ 2 + τ 2 − i (ρστ − ρστ ) σ 2 − τ 2 + i (ρστ + ρστ )
= 2 
σ − τ 2 − i (ρστ + ρστ ) σ 2 + τ 2 + i (ρστ − ρστ )
σ 2 + τ2 σ 2 − τ 2 + 2iρστ 
= 2 
σ − τ 2 − 2iρστ σ 2 + τ 2 

We will say little about non-zero correlation ($\rho \ne 0$); but at this point a digression on complex

correlation is apt. The coefficient of correlation between Z and its conjugate is:

$$\rho_{Z\bar Z} = \frac{\sigma^2 - \tau^2 + 2i\rho\sigma\tau}{\sigma^2 + \tau^2} = \overline{\left(\frac{\sigma^2 - \tau^2 - 2i\rho\sigma\tau}{\sigma^2 + \tau^2}\right)} = \overline{\rho_{\bar Z Z}}$$

As a form of covariance, correlation is Hermitian. Moreover:

$$0 \le \rho_{Z\bar Z}\,\overline{\rho_{Z\bar Z}} = \rho_{Z\bar Z}\,\rho_{\bar Z Z} = \frac{\left(\sigma^2 - \tau^2\right)^2 + 4\rho^2\sigma^2\tau^2}{\left(\sigma^2 + \tau^2\right)^2} \le \frac{\left(\sigma^2 - \tau^2\right)^2 + 4(1)\sigma^2\tau^2}{\left(\sigma^2 + \tau^2\right)^2} = \frac{\left(\sigma^2 + \tau^2\right)^2}{\left(\sigma^2 + \tau^2\right)^2} = 1$$

So, the magnitude of complex correlation is not greater than unity. The imaginary part of the

correlation is zero unless some correlation exists between the real and imaginary parts of the

underlying bivariate. More interesting are the two limits: $\lim_{\tau^2\to0^+}\rho_{Z\bar Z} = 1$ and $\lim_{\sigma^2\to0^+}\rho_{Z\bar Z} = -1$. In the

first case, $Z \to \bar Z$ in a statistical sense, and the correlation approaches one. In the second case,

$Z \to -\bar Z$, and the correlation approaches negative one.

Now if $W = e^Z$, by the formulas of Section 8, $E[W] = e^{0 + \left(\sigma^2 - \tau^2 + 2i\rho\sigma\tau\right)/2} = e^{\left(\sigma^2 - \tau^2\right)/2}\cdot e^{i\rho\sigma\tau}$ and

$E[\bar W] = e^{\left(\sigma^2 - \tau^2\right)/2}\cdot e^{-i\rho\sigma\tau}$. And the augmented variance is:

W   W  W    Var  Z  
* Z 

Var   = E   E    e − 12×2 
W   W  W    

 e (σ 2 − τ 2 ) 2 ⋅ e iρστ  2 2  σ  σ 2 − τ 2 + i ⋅2 ρστ  
[ ]
2
+τ2

=   (σ 2 − τ 2 ) 2 −iρστ  e (σ − τ ) 2 ⋅ e −iρστ e (σ
2
) ⋅ e iρστ    e  σ
−τ2 2
2
− τ 2 −i ⋅2 ρστ σ 2 + τ 2

 
− 12×2 
 
 e ⋅e    

1 e 2iρστ  e σ + τ − 1 e σ − τ + 2iρστ − 1
2 2 2 2

σ2 −τ2
=e  − 2iρστ    σ − τ − 2 iρστ 
− 1 e σ +τ − 1
2 2 2 2

e 1  e 
e σ + τ − 1 eσ − e 2iρστ 
− τ 2 + 4 iρστ
2 2 2

σ2 −τ2
=e  σ 2 − τ 2 − 4iρστ 
− e − 2iρστ eσ +τ2
−1
2
e 
e 2σ − e σ − τ e 2σ − 2 τ + 4iρστ − e σ −τ2
⋅ e 2iρστ 
2 2 2 2 2 2

=  2σ 2 − 2 τ 2 − 4iρστ 
− e σ − τ ⋅ e − 2iρστ e 2σ − e σ − τ
2 2 2 2 2
e 
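The mean formula $E[W] = e^{(\sigma^2-\tau^2)/2}\cdot e^{i\rho\sigma\tau}$ can be spot-checked by simulation (a sketch with illustrative σ, τ, ρ, drawing correlated normals by a Cholesky-style construction):

```python
import cmath, math, random

sigma, tau, rho = 0.6, 0.4, 0.5   # illustrative parameters, not from the paper
rng = random.Random(7)
n = 400_000

total = 0 + 0j
for _ in range(n):
    n1, n2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x = sigma * n1                                      # X ~ N(0, sigma^2)
    y = tau * (rho * n1 + math.sqrt(1 - rho**2) * n2)   # Y ~ N(0, tau^2), Corr(X, Y) = rho
    total += cmath.exp(complex(x, y))                   # W = e^Z = e^(X + iY)

mc_mean = total / n
theory = math.exp((sigma**2 - tau**2) / 2) * cmath.exp(1j * rho * sigma * tau)
```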

W  1 W  2 2 1 1
In the first case above, as τ 2 → 0 + , E   → e σ 2   and Var   → e σ e σ − 1 
2

 . Since ( )
W  1 W  1 1

the complex part Y becomes probability-limited to its mean of zero, the complex lognormal

degenerates to the real-valued W = e X . The limiting result is oblivious to the underlying correlation

ρ, since W → W .

W  1 2 e − 1 e −τ − 1
τ
W 
2 2

In the second case, as σ → 0 , E   → e − τ 2   and Var   → e −τ  −τ 2


+ 2

 . As
2
τ2
W  1 W  e − 1 e − 1 

in the first case, both E [W ] = E [W ] and the underlying correlation ρ has disappeared. Nevertheless,

the variance shows W and its conjugate to differ; in fact, their correlation is the real-valued

( 2
)( 2
)
ρ WW = e −τ − 1 eτ − 1 = −e −τ = − E [W ] .
2
Since τ2 > 0 , − 1 < ρ WW < 0 and

0 < E [W ] = E [W ] < 1 .
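This second-case correlation is likewise easy to verify by simulation (a sketch with an illustrative τ²; with σ² = 0 we have $W = e^{iY}$ and |W| = 1 exactly):

```python
import cmath, math, random

tau2 = 0.8                      # illustrative tau^2; sigma^2 = 0, so W = e^(iY)
rng = random.Random(11)
n = 400_000

ws = [cmath.exp(1j * rng.gauss(0, math.sqrt(tau2))) for _ in range(n)]
mean_w = sum(ws) / n

# Cov[W, Wbar] = E[W * conj(Wbar)] - E[W] * conj(E[Wbar]) = E[W^2] - E[W]^2
cov_w_wbar = sum(w * w for w in ws) / n - mean_w * mean_w
var_w = 1 - abs(mean_w) ** 2    # Var[W] = E[|W|^2] - |E[W]|^2, and |W| = 1

rho_w_wbar = cov_w_wbar / var_w
theory = -math.exp(-tau2)       # the claimed -e^(-tau^2) = -E[W]^2
```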

Both these cases are understandable from the “geometry” of $W = e^Z = e^{X + iY} = e^X e^{iY}$. The

complex exponential function is the basis of polar coordinates; $e^X$ is the magnitude of W, and Y is

the angle of W in radians counterclockwise from the real axis of the complex plane. Imagine a

cannon whose angle and range can be set. In the first case, the angle is fixed at zero, but the range is

variable. This makes for a lognormal distribution along the positive real axis. In the second case,

the cannon’s angle varies, but its range is fixed at $e^0 = 1$. This makes all the shots land on the

complex unit circle; hence, their mean lies within the circle, i.e., $|E[W]| < 1$. Moreover, the symmetry

of Y as $N\left(0, \tau^2\right)$-distributed guarantees $E[W]$ to fall on the real axis, or $-1 < E[W] < 1$.

Furthermore, since the normal density function strictly decreases in both directions from the mean,

more shots land to the right of the imaginary axis than to the left, so $0 < E[W] = e^{-\tau^2/2} < 1$. A “right-

handed” cannon, or a cannon whose angle is measured clockwise from the real axis, fires $\bar W = e^X e^{-iY}$

shots.

A shot from an unrestricted cannon will “almost surely” not land on the real axis. 12 If we desire

negative values from the complex lognormal random variable, as a practical matter we must extract

them from its real or imaginary parts, e.g., $U = \mathrm{Re}(W)$. One can see in the second case, that as $\tau^2$

grows larger, so too grows larger the probability that $U < 0$. As $\tau^2 \to \infty$, the probability

approaches one half. In the limit, the shots are uniformly distributed around the complex unit

circle. In this specialized case ($\sigma^2 \to 0^+$ and $\tau^2 \to \infty$), the distribution of $U = \mathrm{Re}(W)$ is

$$f_U(u) = \frac{1}{\pi\sqrt{1 - u^2}}, \quad \text{for } -1 \le u \le 1.\ ^{13}$$
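This arcsine density can be checked numerically (a sketch): moments computed from $f_U$ should match the corresponding moments of cos Θ for Θ uniform on [0, 2π], e.g., both second moments equal ½.

```python
import math

def arcsine_moment(k, steps=200_000):
    # Midpoint rule for the k-th moment of f_U(u) = 1/(pi*sqrt(1 - u^2)) on (-1, 1)
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h
        total += u**k / (math.pi * math.sqrt(1.0 - u * u)) * h
    return total

def cos_moment(k, steps=200_000):
    # Midpoint rule for the k-th moment of cos(Theta), Theta ~ U[0, 2*pi]
    h = 2.0 * math.pi / steps
    return sum(math.cos((i + 0.5) * h) ** k for i in range(steps)) * h / (2.0 * math.pi)

second_arcsine, second_cos = arcsine_moment(2), cos_moment(2)   # both ~ 1/2
first_arcsine, first_cos = arcsine_moment(1), cos_moment(1)     # both ~ 0
```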

This suggests a third case, in which $\tau^2 \to \infty$ while $\sigma^2$ remains at some positive amount. An

intriguing feature of complex variables is that infinite variance in Y leads to a uniform distribution of

$e^{iY}$. 14 So if $W = e^Z = e^X e^{iY}$, $U = \mathrm{Re}(W) = e^X\cos Y$ will be something of a reflected lognormal;

both its tails will be as heavy as the lognormal’s. 15 In this case:

$$E\begin{bmatrix}W\\ \bar W\end{bmatrix} = \lim_{\tau^2\to\infty}\begin{bmatrix}e^{\left(\sigma^2 - \tau^2 + 2i\rho\sigma\tau\right)/2}\\ e^{\left(\sigma^2 - \tau^2 - 2i\rho\sigma\tau\right)/2}\end{bmatrix} = \begin{bmatrix}0\\ 0\end{bmatrix}$$

$$\mathrm{Var}\begin{bmatrix}W\\ \bar W\end{bmatrix} = \lim_{\tau^2\to\infty}\begin{bmatrix}e^{2\sigma^2} - e^{\sigma^2 - \tau^2} & e^{2\sigma^2 - 2\tau^2 + 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2}\cdot e^{2i\rho\sigma\tau}\\ e^{2\sigma^2 - 2\tau^2 - 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2}\cdot e^{-2i\rho\sigma\tau} & e^{2\sigma^2} - e^{\sigma^2 - \tau^2}\end{bmatrix} = \begin{bmatrix}e^{2\sigma^2} & 0\\ 0 & e^{2\sigma^2}\end{bmatrix}$$

Again, ρ has disappeared from the limiting distribution; but in this case $\rho_{W\bar W} = 0$.

12 For an event almost surely to happen means that its probability is unity; for an event almost surely not to happen

means that its probability is zero. The latter case means not that the event will not happen, but rather that the event has
zero probability mass. For example, if X ~ Uniform[0, 1], Prob[X=½] = 0. So X almost surely does not equal ½, even
though ½ is as possible as any other number in the interval.
13 For more on this bimodal Arcsine(-1, 1) distribution see Wikipedia, “Arcsine distribution.”
14 The next section expands on this important subject. “Infinite variance in Y” means “as the variance of Y approaches
infinity.” It does not mean that $e^{iY}$ is uniform for a variable Y whose variance is infinite, e.g., for a Pareto random
variable whose shape parameter is less than or equal to two.
15 Cf. Halliwell [2013] for a discussion on the right tails of the lognormal and other loss distributions.

In practical work with $U = \mathrm{Re}(W)$, 16 the angular part $e^{iY}$ will be more important than the

lognormal range $e^X$. For example, one who wanted the tendency for the larger magnitudes of

$U = \mathrm{Re}(W)$ to be positive might set the mean of Y at $-\pi/2$ and the correlation ρ to some positive

value. Thus, greater than average values of Y, angling off into quadrants 4 and 1 of the complex

plane, would correlate with larger than average values of X and hence of $e^X$. Of course,

$\mathrm{Var}[Y] = \tau^2$ would have to be small enough that deviations of $\pm\pi$ from $E[Y] = -\pi/2$ would be

tolerably rare. Equivalently, one could set the mean of Y at $\pi/2$ and the correlation ρ to some

negative value. As a second example, one who wanted negative values of U to be less frequent than

positive, might set both the mean of Y and ρ to zero, and set the variance of Y so that

$\mathrm{Prob}[|Y| > \pi/2]$ is desirably small. Some distributions of U for $\tau^2 \gg \sigma^2$ are bimodal, as in the

specialized case $\sigma^2 \to 0^+$ and $\tau^2 \to \infty$. But less extreme parameters would result in unimodal

distributions for U over the entire real number line.

10. THE COMPLEX UNIT-CIRCLE RANDOM VARIABLE

In the previous section we claimed that as the variance τ 2 of the normal random variable Y

approaches infinity, e iY approaches a uniform distribution over the complex unit circle. The

explanation and justification of this claim in this section prepare for an important implication in the

next.

Let real-valued random variable Y be distributed as $N\left[\mu, \sigma^2\right]$, and let $W = e^{iY}$. According to the

moment-generating-function formula of Section 8, $M_Y(it) = E\left[e^{itY}\right] = e^{it\mu + (it)^2\sigma^2/2} = e^{it\mu - t^2\sigma^2/2}$. Although the

16 In the absence of an analytic distribution, practical work with the complex lognormal would seem to require
simulating its values from the underlying normal distribution.

formula applies to complex values of t, here we’ll restrict it to real values. With $t \in \Re$, $M_Y(it)$ is

known as the characteristic function of real variable Y. And so:

$$\lim_{\sigma^2\to\infty} M_Y(it) = \lim_{\sigma^2\to\infty} e^{it\mu - t^2\sigma^2/2} = e^{it\mu}\lim_{\sigma^2\to\infty} e^{-t^2\sigma^2/2} = \delta_{t0} = \begin{cases}1 & \text{if } t = 0\\ 0 & \text{if } t \ne 0\end{cases}$$

It is noteworthy, and indicative of a uniformity of some sort, that µ drops out of the result.

Next, let real-valued random variable Θ be uniformly distributed over $[a, a + 2\pi n]$, where n is a

positive integer; in symbols, $\Theta \sim U[a, a + 2\pi n]$. Then:

$$M_\Theta(it) = E\left[e^{it\Theta}\right] = \int_{\theta = a}^{a + 2\pi n} e^{it\theta}\,\frac{d\theta}{2\pi n} = \left.\frac{e^{it\theta}}{2\pi itn}\right|_a^{a + 2\pi n} = e^{ita}\,\frac{e^{2\pi itn} - 1}{2\pi itn} = \begin{cases}1 & \text{if } t = 0\\ 0 & \text{if } tn = \pm1, \pm2, \ldots\\ \ne 0 & \text{if } tn \text{ not integral}\end{cases}$$
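A direct numerical integration (a sketch; the values of a and n below are illustrative) confirms this piecewise behavior: the average vanishes when tn is a nonzero integer, equals one at t = 0, and is nonzero otherwise.

```python
import cmath, math

def m_theta(t, a=0.3, n=2, steps=100_000):
    # Midpoint-rule approximation of E[e^(i*t*Theta)], Theta ~ U[a, a + 2*pi*n]
    width = 2.0 * math.pi * n
    h = width / steps
    total = 0 + 0j
    for k in range(steps):
        total += cmath.exp(1j * t * (a + (k + 0.5) * h)) * h
    return total / width

at_zero = m_theta(0.0)          # t = 0: expect 1
tn_integral = m_theta(1.0)      # t*n = 2, an integer: expect 0
tn_fractional = m_theta(0.25)   # t*n = 0.5: expect magnitude |e^(i*pi) - 1|/pi = 2/pi
```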

Letting n approach infinity, we have:

$$\lim_{n\to\infty} M_\Theta(it) = e^{ita}\lim_{n\to\infty}\frac{e^{2\pi itn} - 1}{2\pi itn} = \delta_{t0} = \begin{cases}1 & \text{if } t = 0\\ 0 & \text{if } t \ne 0\end{cases}$$

Hence, $\lim_{n\to\infty} M_\Theta(it) = \lim_{\sigma^2\to\infty} M_Y(it) = \delta_{t0}$. The equality of the limits of the characteristic functions of

the random variables implies the identity of the limits of their distributions; hence, the diffuse

uniform $U[a, a + \infty]$ is “the same” as the diffuse normal $N[\mu, \infty]$. 17

17 Quotes are around ‘the same’ because the limiting distributions are not proper distributions. The notion of diffuse
distributions comes from Venter [1996, pp. 406-410], who shows there how different diffuse distributions result in

Indeed, for the limit to be $\delta_{t0}$ it is not required that n be an integer. But for $\Theta \sim U[a, a + 2\pi n]$, the

integral moments of $W = e^{i\Theta}$ are:

$$E\left[W^j\right] = E\left[e^{ij\Theta}\right] = \begin{cases}1 & \text{if } j = 0\\ 0 & \text{if } jn = \pm1, \pm2, \ldots\\ \ne 0 & \text{if } jn \text{ not integral}\end{cases}$$

So if n is an integer, jn will be an integer, and all the integral moments of W will be zero, except for

the zeroth. Therefore, the integral moments of $W = e^{i\Theta}$ are invariant to n, as long as the n in 2πn,

the width of the interval of Θ, is a whole number. Hence, although we hereby define the unit-circle

random variable as $e^{i\Theta}$ for $\Theta \sim U[0, 2\pi]$, the choice of a = 0 and n = 1 is out of convenience,

rather than out of necessity. The probability for $e^{i\Theta}$ to be in an arc of this circle of length $\ell$ equals

$\ell/2\pi$.

The integral moments of the conjugate $\bar W = e^{-i\Theta}$ are the same, for

$E\left[\bar W^j\right] = E\left[\overline{W^j}\right] = \overline{E\left[W^j\right]} = \bar\delta_{j0} = \delta_{j0} = E\left[W^j\right]$. Alternatively, $E\left[\bar W^j\right] = E\left[e^{-ij\Theta}\right] = \delta_{(-j)0} = \delta_{j0}$. And

the jkth mixed moment is $E\left[W^j\bar W^k\right] = E\left[e^{ij\Theta}e^{-ik\Theta}\right] = E\left[e^{i(j-k)\Theta}\right] = \delta_{(j-k)0} = \delta_{jk}$. Since

$E[W] = E[\bar W] = 0$, the augmented variance of the unit-circle random variable is:

$$\mathrm{Var}\begin{bmatrix}W\\ \bar W\end{bmatrix} = E\left[\begin{bmatrix}W\\ \bar W\end{bmatrix}\begin{bmatrix}\bar W & W\end{bmatrix}\right] = E\begin{bmatrix}W\bar W & WW\\ \bar W\bar W & \bar W W\end{bmatrix} = \begin{bmatrix}\delta_{11} & \delta_{20}\\ \delta_{02} & \delta_{11}\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = I_2$$

Hence, W = eiΘ for Θ ~ U [0, 2π ] is not just a unit-circle random variable; having zero mean and

unit variance, it is the standard unit-circle random variable.
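The Kronecker-delta moment structure $E[W^j\bar W^k] = \delta_{jk}$ can be checked by numerical integration over Θ ~ U[0, 2π] (a sketch):

```python
import cmath, math

def mixed_moment(j, k, steps=100_000):
    # Midpoint rule for E[W^j * conj(W)^k], W = e^(i*Theta), Theta ~ U[0, 2*pi]
    h = 2.0 * math.pi / steps
    total = 0 + 0j
    for m in range(steps):
        w = cmath.exp(1j * (m + 0.5) * h)
        total += w ** j * w.conjugate() ** k
    return total / steps

m11 = mixed_moment(1, 1)   # E[W * Wbar] = 1
m20 = mixed_moment(2, 0)   # E[W^2] = 0
m02 = mixed_moment(0, 2)   # E[Wbar^2] = 0
m10 = mixed_moment(1, 0)   # E[W] = 0
```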

different Bayesian estimates. But here every continuous random variable Y diffuses through the periodicity of eiY into
the same limiting distribution, viz., the Kronecker δ t 0 (note 31).

Multiplying W by a complex constant $\alpha \ne 0$ affects the radius of the random variable, whose jkth

mixed moment is:

$$E\left[(\alpha W)^j\overline{(\alpha W)}^k\right] = \alpha^j\bar\alpha^k E\left[W^j\bar W^k\right] = \alpha^j\bar\alpha^k\delta_{jk} = \begin{cases}(\alpha\bar\alpha)^j & \text{if } j = k\\ 0 & \text{if } j \ne k\end{cases}$$

The augmented variance is $\mathrm{Var}\begin{bmatrix}\alpha W\\ \overline{\alpha W}\end{bmatrix} = \alpha\bar\alpha\,\mathrm{Var}\begin{bmatrix}W\\ \bar W\end{bmatrix} = \alpha\bar\alpha I_2$. One may consider α as an instance of a

complex random variable Α. Due to the independence of Α from W, the jkth mixed moment of

ΑW is $E\left[(AW)^j\overline{(AW)}^k\right] = E\left[A^j\bar A^k\right]E\left[W^j\bar W^k\right] = E\left[A^j\bar A^k\right]\delta_{jk} = E\left[\left(A\bar A\right)^j\right]\delta_{jk}$. Its augmented

variance is $\mathrm{Var}\begin{bmatrix}AW\\ \overline{AW}\end{bmatrix} = E\left[A\bar A\right]\mathrm{Var}\begin{bmatrix}W\\ \bar W\end{bmatrix} = \left\{\mathrm{Var}[A] + E[A]E\left[\bar A\right]\right\}\mathrm{Var}\begin{bmatrix}W\\ \bar W\end{bmatrix}$. Unlike the one-dimensional

W, ΑW can cover the whole complex plane. However, like W, it too possesses the desirable

property that $E\left[(AW)^j\right] = \delta_{j0}$. 18

11. UNIT-CIRCULARITY AND DETERMINISM


The single most important quality of a random variable is its mean. In fact, just having reliable

estimates of mean values would satisfy many users of actuarial analyses. Stochastic advances in

actuarial science over the last few decades notwithstanding, much actuarial work remains

deterministic. Determinism is not the reduction of a stochastic answer Y = f ( X ) to its mean

E [Y ] = E [ f ( X )] . Rather, the deterministic assumption is that the expectation of a function of a

random variable equals the function of the expectation of the random variable; in symbols,

18 The existence of the moments $E\left[A^j\bar A^k\right]$ needs to be ascertained. In particular, moments for j and k as negative
integers will not exist unless $\mathrm{Prob}[A = 0] = 1 - \mathrm{Prob}[A \ne 0] = 1 - \mathrm{Prob}\left[A\bar A > 0\right] = 0$.



E [Y ] = E [ f ( X )] = f (E [X ]) . Because this assumption is true for linear f, it was felt to be a

reasonable or necessary approximation for non-linear f.

Advances in computing hardware and software, as well as increased technical sophistication, have

made determinism more avoidable and less acceptable. However, the complex unit-circular random

variable provides a habitat for the survival of determinism. To see this, let f be analytic over the

domain of complex random variable Z. From Cauchy’s Integral Formula (Havil [2003, Appendix

D.8 and D.9]) it follows that within the domain of Z, f can be expressed as a convergent series


$f(z) = a_0 + a_1 z + \cdots = a_0 + \sum_{j=1}^{\infty} a_j z^j$. Taking the expectation, we have:

$$E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E\left[Z^j\right]$$

But if for every positive integer j, $E\left[Z^j\right] = E[Z]^j$, then:

$$E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E\left[Z^j\right] = a_0 + \sum_{j=1}^{\infty} a_j E[Z]^j = f(E[Z])$$

Therefore, determinism conveniently works for analytic functions of random variables whose

moments are powers of their means.
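For the unit-circle variable of Section 10, whose positive integral moments are all zero, this determinism can be checked concretely (a numerical sketch with the analytic f(z) = e^z, so that f(E[W]) = e^0 = 1):

```python
import cmath, math

def unit_circle_expectation(f, steps=100_000):
    # Midpoint rule for E[f(W)], W = e^(i*Theta), Theta ~ U[0, 2*pi]
    h = 2.0 * math.pi / steps
    total = 0 + 0j
    for m in range(steps):
        total += f(cmath.exp(1j * (m + 0.5) * h))
    return total / steps

expected_f = unit_circle_expectation(cmath.exp)   # E[e^W]
f_of_mean = cmath.exp(0)                          # f(E[W]) = e^0 = 1
```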

Now a real-valued random variable whose moments are powers of its mean would have the

characteristic function:

$$M_X(it) = E\left[e^{itX}\right] = 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E\left[X^j\right] = 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E[X]^j = e^{itE[X]} = M_{E[X]}(it)$$

This is the characteristic function of the “deterministic” random variable, i.e., the random variable

whose probability is massed at one point, its mean. So determinism with real-valued random

variables requires “deterministic” random variables. But some complex random variables, such as

the unit-circle, have the property $E\left[Z^j\right] = E[Z]^j$ without being deterministic.

In fact, when $E\left[Z^j\right] = E[Z]^j$, not only is $E[f(Z)] = f(E[Z])$. For positive integer k, $f^k(z)$ is as

analytic as f itself; hence, $E\left[f^k(Z)\right] = f^k(E[Z])$. So the determinism with these complex random

variables is valid for all moments; nothing is lost.

In Section 10 we saw that for the unit-circle random variable $W = e^{i\Theta}$ and for $j = \pm1, \pm2, \ldots$,

$E\left[W^{-j}\right] = E\left[W^j\right] = 0 = E[W]^j$. Can determinism extend to non-analytic functions which involve

the negative moments? For example, let $g(z) = 1/(\eta - z)$, for some complex $\eta \ne 0$. The function

is singular at $z = \eta$; but within the disc $\{z : |z/\eta| < 1\} = \{z : |z| < |\eta|\}$ the function equals the

convergent series:

$$g(z) = \frac{1}{\eta - z} = \frac{1}{\eta}\cdot\frac{1}{1 - \frac{z}{\eta}} = \frac{1}{\eta}\left(1 + \frac{z}{\eta} + \left(\frac{z}{\eta}\right)^2 + \cdots\right) = \frac{1}{\eta} + \frac{z}{\eta^2} + \frac{z^2}{\eta^3} + \cdots$$

Outside the disc, or for $\{z : |z/\eta| > 1\} = \{z : |z| > |\eta|\}$, another convergent series represents the

function:

$$g(z) = \frac{1}{\eta - z} = -\frac{1}{z}\cdot\frac{1}{1 - \frac{\eta}{z}} = -\frac{1}{z}\left(1 + \frac{\eta}{z} + \left(\frac{\eta}{z}\right)^2 + \cdots\right) = -\frac{1}{z} - \frac{\eta}{z^2} - \frac{\eta^2}{z^3} - \cdots$$

So, if $|\eta| > 1$, then $|W| = 1 < |\eta|$. In this case:

$$E[g(W)] = E\left[\frac{1}{\eta} + \frac{W}{\eta^2} + \frac{W^2}{\eta^3} + \cdots\right] = \frac{1}{\eta} + \frac{E[W]}{\eta^2} + \frac{E\left[W^2\right]}{\eta^3} + \cdots = \frac{1}{\eta} + \frac{0}{\eta^2} + \frac{0}{\eta^3} + \cdots = \frac{1}{\eta}$$

However, if $|\eta| < 1$, then $|W| = 1 > |\eta|$. So in this case:

$$E[g(W)] = E\left[-\frac{1}{W} - \frac{\eta}{W^2} - \frac{\eta^2}{W^3} - \cdots\right] = -E\left[W^{-1}\right] - \eta E\left[W^{-2}\right] - \eta^2 E\left[W^{-3}\right] - \cdots = -0 - \eta\cdot0 - \eta^2\cdot0 - \cdots = 0$$

Both answers are correct; however, only the first satisfies the deterministic equation

$E[g(W)] = g(E[W]) = g(0) = 1/\eta$.

To understand why the answer depends on whether η is inside or outside the complex unit circle, let

us evaluate $E[g(W)]$ directly:

$$E[g(W)] = E[1/(\eta - W)] = E\left[1/\left(\eta - e^{i\Theta}\right)\right] = \int_{\theta=0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi}$$

The next step is to transform from θ into $z = e^{i\theta}$. So $dz = ie^{i\theta}d\theta = iz\,d\theta$, and the line integral

transforms into a contour integral over the unit circle C:


$$\begin{aligned}
E[g(W)] &= \int_{\theta=0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi} = \oint_C\frac{1}{\eta - z}\,\frac{iz\,d\theta}{iz\,2\pi} = \oint_C\frac{1}{\eta - z}\,\frac{1}{z}\,\frac{dz}{2\pi i} \\
&= \frac{1}{2\pi i}\oint_C\frac{dz}{z(\eta - z)} = \frac{1}{2\pi i}\oint_C\left(\frac{1}{z} + \frac{1}{\eta - z}\right)\frac{dz}{\eta} \\
&= \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C\frac{dz}{z} + \frac{1}{2\pi i}\oint_C\frac{dz}{\eta - z}\right) = \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C\frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta}\right)
\end{aligned}$$

Now the value of each of these integrals is one if its singularity is within the unit circle C, and zero if

it is not. 19 Of course, the singularity of the first integral at z = 0 lies within C; hence, its value is

one. The second integral’s singularity at $z = \eta$ lies within C if and only if $|\eta| < 1$. Therefore:

$$E[g(W)] = \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C\frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta}\right) = g(E[W])\cdot\begin{cases}1 & \text{if } |\eta| > 1\\ 0 & \text{if } |\eta| < 1\end{cases}$$

So the deterministic equation will hold for one of the Laurent series according to which the domain

of the non-analytic function is divided into regions of convergence. Fascinating enough is how the

function

$$\varphi(\eta) = \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta} = \begin{cases}1 & \text{if } |\eta| < 1\\ 0 & \text{if } |\eta| > 1\end{cases}$$

serves as the indicator of a state, viz., the state of being

inside or outside the complex unit circle.
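Both regimes of $E[g(W)]$ are easy to verify by numerical integration (a sketch; the η values below are illustrative):

```python
import cmath, math

def expected_g(eta, steps=100_000):
    # Midpoint rule for E[1/(eta - W)], W = e^(i*theta), theta ~ U[0, 2*pi]
    h = 2.0 * math.pi / steps
    total = 0 + 0j
    for m in range(steps):
        total += 1.0 / (eta - cmath.exp(1j * (m + 0.5) * h))
    return total / steps

outside_value = expected_g(2 + 1j)     # |eta| > 1: expect g(E[W]) = 1/eta
inside_value = expected_g(0.3 - 0.2j)  # |eta| < 1: expect 0
```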

19Technically, the integral has no value if the singularity lies on C; but there are some practical advantages for “splitting
the difference” in that case.

12. THE LINEAR STATISTICAL MODEL


Better known as “regression” models, linear statistical models extend readily into the realm of

complex numbers. A general real-valued form of such models is presented and derived in Halliwell

[1997, Appendix C]:

 y1   X1   e1   e1   Σ11 Σ12 
y  = X β + e , Var e  = Σ 
 2  2  2  2   21 Σ 22 

The subscript ‘1’ denotes observations, ‘2’ denotes predictions. Vector $y_1$ is observed; the whole

design matrix X is hypothesized, as well as the fourfold ‘Σ’ variance structure. Although the

variance structure may be non-negative-definite (NND), the variance of the observations $\Sigma_{11}$ must

be positive-definite (PD). Also, the observation design $X_1$ must be of full column rank. The last

two requirements ensure the existence of the inverses $\Sigma_{11}^{-1}$ and $\left(X_1'\Sigma_{11}^{-1}X_1\right)^{-1}$. The best linear

unbiased predictor of $y_2$ is $\hat y_2 = X_2\hat\beta + \Sigma_{21}\Sigma_{11}^{-1}\left(y_1 - X_1\hat\beta\right)$. The variance of prediction error

$y_2 - \hat y_2$ is $\mathrm{Var}[y_2 - \hat y_2] = \left(X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1\right)\mathrm{Var}\left[\hat\beta\right]\left(X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1\right)' + \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$. Embedded

in these formulas are the estimator of β and its variance:

$$\hat\beta = \left(X_1'\Sigma_{11}^{-1}X_1\right)^{-1}X_1'\Sigma_{11}^{-1}y_1 = \mathrm{Var}\left[\hat\beta\right]\cdot X_1'\Sigma_{11}^{-1}y_1$$

For the purpose of introducing complex numbers into the linear statistical model we will concern

ourselves here only with the estimation of the parameter β. So we drop the subscripts ‘1’ and ‘2’ and

simplify the observation as $y = X\beta + e$, where $\mathrm{Var}[e] = \Gamma$. Again, X must be of full column rank

and Γ must be Hermitian PD. According to Section 4, transjugation is to complex matrices what

transposition is to real-valued matrices. Therefore, the short answer for a complex model is:

$$\hat\beta = \left(X^*\Gamma^{-1}X\right)^{-1}X^*\Gamma^{-1}y = \mathrm{Var}\left[\hat\beta\right]\cdot X^*\Gamma^{-1}y$$
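For Γ = I and a single complex regressor, this estimator reduces to $\hat\beta = \sum_i \bar x_i y_i / \sum_i \bar x_i x_i$; a noiseless sketch with illustrative data (not from the paper) recovers the parameter exactly:

```python
# Single-regressor complex least squares with Gamma = I:
# beta_hat = (X* X)^(-1) X* y = sum(conj(x)*y) / sum(conj(x)*x)
def complex_ols(xs, ys):
    num = sum(x.conjugate() * y for x, y in zip(xs, ys))
    den = sum(x.conjugate() * x for x in xs)
    return num / den

beta_true = 2 - 3j
xs = [1 + 1j, 2 - 1j, -0.5 + 2j, 3 + 0j]
ys = [beta_true * x for x in xs]   # noiseless observations y = x * beta

beta_hat = complex_ols(xs, ys)     # recovers beta_true
```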

However, deriving the solution from the double-real representation in Section 3 will deepen the

understanding. The double-real form of the observation is:

$$\begin{bmatrix}y_r & -y_i\\ y_i & y_r\end{bmatrix} = \begin{bmatrix}X_r & -X_i\\ X_i & X_r\end{bmatrix}\begin{bmatrix}\beta_r & -\beta_i\\ \beta_i & \beta_r\end{bmatrix} + \begin{bmatrix}e_r & -e_i\\ e_i & e_r\end{bmatrix}$$

All the vectors and matrices in this form are real-valued. The subscripts ‘r’ and ‘i’ denote the real

and imaginary parts of y, X, β, and e. Due to the redundancy of double-real representation, we may

retain just the left column:

$$\begin{bmatrix}y_r\\ y_i\end{bmatrix} = \begin{bmatrix}X_r & -X_i\\ X_i & X_r\end{bmatrix}\begin{bmatrix}\beta_r\\ \beta_i\end{bmatrix} + \begin{bmatrix}e_r\\ e_i\end{bmatrix}$$

Note that if X is real-valued, then $X_i = 0$, and $y_r$ and $y_i$ become two “data panels,” each with its

own parameter $\beta_r$ and $\beta_i$. 20
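The double-real form encodes complex multiplication as a 2×2 real matrix; a scalar sketch confirms that the matrix product reproduces Xβ:

```python
def double_real_product(x, b):
    # Multiply complex x and b via the real representation [[xr, -xi], [xi, xr]] [br, bi]'
    xr, xi = x.real, x.imag
    br, bi = b.real, b.imag
    return complex(xr * br - xi * bi, xi * br + xr * bi)

x, b = 1.5 - 2j, -0.75 + 4j
via_matrix = double_real_product(x, b)
direct = x * b                      # the two products agree
```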

Now let $\Xi_t = \begin{bmatrix}I_t & iI_t\\ I_t & -iI_t\end{bmatrix}$, the augmentation matrix of Section 7, where t is the number of

observations. Since the augmentation matrix is non-singular, premultiplying the left-column form

by it yields equivalent but insightful forms:

20 This assumes zero covariance between the error vectors.

$$\begin{aligned}
\begin{bmatrix}I_t & iI_t\\ I_t & -iI_t\end{bmatrix}\begin{bmatrix}y_r\\ y_i\end{bmatrix} &= \begin{bmatrix}I_t & iI_t\\ I_t & -iI_t\end{bmatrix}\begin{bmatrix}X_r & -X_i\\ X_i & X_r\end{bmatrix}\begin{bmatrix}\beta_r\\ \beta_i\end{bmatrix} + \begin{bmatrix}I_t & iI_t\\ I_t & -iI_t\end{bmatrix}\begin{bmatrix}e_r\\ e_i\end{bmatrix} \\
\begin{bmatrix}y_r + iy_i\\ y_r - iy_i\end{bmatrix} &= \begin{bmatrix}X_r + iX_i & -X_i + iX_r\\ X_r - iX_i & -X_i - iX_r\end{bmatrix}\begin{bmatrix}\beta_r\\ \beta_i\end{bmatrix} + \begin{bmatrix}e_r + ie_i\\ e_r - ie_i\end{bmatrix} \\
\begin{bmatrix}y\\ \bar y\end{bmatrix} &= \begin{bmatrix}X & iX\\ \bar X & -i\bar X\end{bmatrix}\begin{bmatrix}\beta_r\\ \beta_i\end{bmatrix} + \begin{bmatrix}e\\ \bar e\end{bmatrix} \\
\begin{bmatrix}y\\ \bar y\end{bmatrix} &= \begin{bmatrix}X\beta_r + iX\beta_i\\ \bar X\beta_r - i\bar X\beta_i\end{bmatrix} + \begin{bmatrix}e\\ \bar e\end{bmatrix} \\
\begin{bmatrix}y\\ \bar y\end{bmatrix} &= \begin{bmatrix}X\beta\\ \bar X\bar\beta\end{bmatrix} + \begin{bmatrix}e\\ \bar e\end{bmatrix} \\
\begin{bmatrix}y\\ \bar y\end{bmatrix} &= \begin{bmatrix}X & 0\\ 0 & \bar X\end{bmatrix}\begin{bmatrix}\beta\\ \bar\beta\end{bmatrix} + \begin{bmatrix}e\\ \bar e\end{bmatrix}
\end{aligned}$$

The first insight is that $\bar{y} = \bar{X}\bar{\beta} + \bar{e}$ is as much observed as $y = X\beta + e$. The second insight is that $\operatorname{Var}\!\begin{bmatrix} e \\ \bar{e} \end{bmatrix}$ is an augmented variance, whose general form according to Section 6 is $\operatorname{Var}\!\begin{bmatrix} e \\ \bar{e} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$. Therefore, the general form of the observation of a complex linear model is:

$$\begin{bmatrix} y \\ \bar{y} \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}\begin{bmatrix} \beta \\ \bar{\beta} \end{bmatrix} + \begin{bmatrix} e \\ \bar{e} \end{bmatrix}, \qquad \operatorname{Var}\!\begin{bmatrix} e \\ \bar{e} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$$

Not only is $\bar{y}$ as observable as $y$, but also $\bar{\beta}$ is as estimable as $\beta$. Furthermore, although the augmented variance may default to $C = 0$, the complex linear statistical model does not require $\begin{bmatrix} e \\ \bar{e} \end{bmatrix}$ to be "proper complex," as defined in Section 7.
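A small numpy check (dimensions and seed illustrative) that premultiplying the double-real stack by $\Xi_t$ yields the augmented complex vector $[y;\, \bar{y}]$:

```python
import numpy as np

rng = np.random.default_rng(0)
t = 4
I = np.eye(t)

# Augmentation matrix of Section 7 (notation Xi_t follows the text)
Xi_t = np.block([[I,  1j * I],
                 [I, -1j * I]])

# A complex observation vector y and its real/imaginary stack
y = rng.standard_normal(t) + 1j * rng.standard_normal(t)
stacked = np.concatenate([y.real, y.imag])

# Premultiplication maps [y_r; y_i] to the augmented vector [y; conj(y)]
aug = Xi_t @ stacked
ok = np.allclose(aug[:t], y) and np.allclose(aug[t:], np.conj(y))
```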

Since X is of full column rank, so too must be $\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}$. And since Γ is Hermitian PD, both it and its conjugate $\bar{\Gamma}$ are invertible. But the general form of the observation requires $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$ to be Hermitian PD, hence invertible. A consequence is that both the "determinant" forms $\Gamma - C\bar{\Gamma}^{-1}\bar{C}$ and $\bar{\Gamma} - \bar{C}\Gamma^{-1}C$ are Hermitian PD and invertible. With this background it can be shown, and the reader should verify, that

$$\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^{-1} = \begin{bmatrix} H & K \\ \bar{K} & \bar{H} \end{bmatrix}, \quad \text{where } H = \left(\Gamma - C\bar{\Gamma}^{-1}\bar{C}\right)^{-1} \text{ and } K = -\Gamma^{-1}C\bar{H}.$$

The important point is that inversion preserves the augmented-variance form.
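The reader's verification can also be done numerically. In the following numpy sketch, the construction of Γ and C from an illustrative matrix B (our assumption, not the paper's) guarantees a valid, Hermitian PD augmented variance; the claimed blocks then reproduce the inverse:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2, 6

# Build a valid augmented variance from e = B w, w a real standard normal:
# Gamma = E[e e*] = B B*,  C = E[e e'] = B B'  (illustrative construction)
B = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
Gamma = B @ B.conj().T
C = B @ B.T

M = np.block([[Gamma,          C],
              [np.conj(C), np.conj(Gamma)]])

# Claimed inverse blocks: H = (Gamma - C conj(Gamma)^-1 conj(C))^-1,
# K = -Gamma^-1 C conj(H); the inverse keeps the augmented form
H = np.linalg.inv(Gamma - C @ np.linalg.inv(np.conj(Gamma)) @ np.conj(C))
K = -np.linalg.inv(Gamma) @ C @ np.conj(H)
M_inv = np.block([[H,          K],
                  [np.conj(K), np.conj(H)]])

ok = np.allclose(M_inv, np.linalg.inv(M))
```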

The solution of the complex linear model $\begin{bmatrix} y \\ \bar{y} \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}\begin{bmatrix} \beta \\ \bar{\beta} \end{bmatrix} + \begin{bmatrix} e \\ \bar{e} \end{bmatrix}$, where $\operatorname{Var}\!\begin{bmatrix} e \\ \bar{e} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$, is:

$$\begin{aligned}
\begin{bmatrix} \hat{\beta} \\ \hat{\bar{\beta}} \end{bmatrix}
&= \left(\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}^{*}\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^{-1}\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}\right)^{-1}\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}^{*}\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^{-1}\begin{bmatrix} y \\ \bar{y} \end{bmatrix} \\
&= \left(\begin{bmatrix} X^{*} & 0 \\ 0 & X' \end{bmatrix}\begin{bmatrix} H & K \\ \bar{K} & \bar{H} \end{bmatrix}\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}\right)^{-1}\begin{bmatrix} X^{*} & 0 \\ 0 & X' \end{bmatrix}\begin{bmatrix} H & K \\ \bar{K} & \bar{H} \end{bmatrix}\begin{bmatrix} y \\ \bar{y} \end{bmatrix} \\
&= \begin{bmatrix} X^{*}HX & X^{*}K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}^{-1}\begin{bmatrix} X^{*}Hy + X^{*}K\bar{y} \\ X'\bar{K}y + X'\bar{H}\bar{y} \end{bmatrix}
\end{aligned}$$

And $\operatorname{Var}\!\begin{bmatrix} \hat{\beta} \\ \hat{\bar{\beta}} \end{bmatrix} = \begin{bmatrix} X^{*}HX & X^{*}K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}^{-1}$, which must exist, since it is a quadratic form based on the Hermitian PD $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^{-1}$ and the full-column-rank $\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}$. Reformulate this as:

$$\begin{bmatrix} X^{*}HX & X^{*}K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}\begin{bmatrix} \hat{\beta} \\ \hat{\bar{\beta}} \end{bmatrix} = \begin{bmatrix} X^{*}Hy + X^{*}K\bar{y} \\ X'\bar{K}y + X'\bar{H}\bar{y} \end{bmatrix}$$

The conjugates of the two equations in $\hat{\beta}$ and $\hat{\bar{\beta}}$ are the same equations in $\overline{\hat{\bar{\beta}}}$ and $\overline{\hat{\beta}}$:

$$\begin{bmatrix} X^{*}HX & X^{*}K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}\begin{bmatrix} \overline{\hat{\bar{\beta}}} \\ \overline{\hat{\beta}} \end{bmatrix} = \begin{bmatrix} X^{*}Hy + X^{*}K\bar{y} \\ X'\bar{K}y + X'\bar{H}\bar{y} \end{bmatrix} = \begin{bmatrix} X^{*}HX & X^{*}K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}\begin{bmatrix} \hat{\beta} \\ \hat{\bar{\beta}} \end{bmatrix}$$

Therefore, $\begin{bmatrix} \overline{\hat{\bar{\beta}}} \\ \overline{\hat{\beta}} \end{bmatrix} = \begin{bmatrix} \hat{\beta} \\ \hat{\bar{\beta}} \end{bmatrix}$, and in particular $\hat{\bar{\beta}} = \overline{\hat{\beta}}$. It is well known that the estimator of a linear function of a random variable is the linear function of the estimator of the random variable. But conjugation is not a linear function. Nevertheless, we have just proven that the estimator of the conjugate is the conjugate of the estimator.
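A numerical sketch of the full solution (numpy; the data, seed, and the error-variance construction from an illustrative matrix B are our assumptions) confirms that the lower half of the augmented estimator is the conjugate of the upper half:

```python
import numpy as np

rng = np.random.default_rng(7)
t, k, m = 8, 2, 20

# Illustrative complex data and a valid augmented error variance
X = rng.standard_normal((t, k)) + 1j * rng.standard_normal((t, k))
y = rng.standard_normal(t) + 1j * rng.standard_normal(t)
B = rng.standard_normal((t, m)) + 1j * rng.standard_normal((t, m))
Gamma, C = B @ B.conj().T, B @ B.T

# Augmented design, observation, and variance
Xa = np.block([[X, np.zeros((t, k))],
               [np.zeros((t, k)), np.conj(X)]])
ya = np.concatenate([y, np.conj(y)])
Va = np.block([[Gamma, C], [np.conj(C), np.conj(Gamma)]])

# Generalized least squares on the augmented model
Vi = np.linalg.inv(Va)
A = Xa.conj().T @ Vi @ Xa
b_hat = np.linalg.solve(A, Xa.conj().T @ Vi @ ya)

# The last k entries estimate conj(beta): they equal the conjugate
# of the first k entries, as proven above
beta_hat, beta_bar_hat = b_hat[:k], b_hat[k:]
ok = np.allclose(beta_bar_hat, np.conj(beta_hat))
```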

13. ACTUARIAL APPLICATIONS OF COMPLEX RANDOM VARIABLES


How might casualty actuaries put complex random variables to work? Since the support of most

complex random variables is a plane, rather than a line, their obvious application is bivariate. An

example is a random variable whose real part is loss and whose imaginary part is LAE. Another

application might pertain to copulas. According to Venter [2002, p. 69], “copulas are joint

distributions of unit random variables.” One could translate these joint distributions into

distributions of complex variables whose support is the complex unit square, i.e., the square whose

vertices are the points z = 0, 1, 1 + i, i . However, for now it seems that real-valued bivariates

provide the necessary theory and technique for these purposes.

Actuaries who have applied log-linear models to triangles with paid increments have been frustrated

applying them to incurred triangles. The problem is that incurred increments are often negative, and

the logarithm of a negative number is not real-valued. This has led Glenn Meyers [2013] to seek

modified lognormal distributions whose support includes the negative real numbers. The persistent

intractability of the log-linear problem was a major reason for our attention to the lognormal

random vector $w_{n\times 1} = e^{z_{n\times 1}}$ in Section 8. But to model an incurred loss as the exponential function

of a complex number suffers from two drawbacks. First, to model a real-valued loss as $e^z = e^x \cdot e^{iy}$ requires y to be an integral multiple of π. The mixed random variable that equals $e^X$ with probability p and $-e^X$ with probability $1-p$ is not lognormal. No more suitable are such "denatured" random variables as $\operatorname{Re}(e^Z)$. Second, one still cannot model the eminently practical value of zero, because for all z, $e^z \ne 0$. 21 At present it does not appear that complex random variables will give birth to

useful distributions of real-valued random variables. Even the unit-circle and indicator random

variables of Sections 10 and 11, as interesting as they are in the theory of analytic functions, most

likely will engender no distributions valuable to actuarial work.
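A small numerical illustration of the constraint described above (numpy; the value −5 is illustrative): the principal complex logarithm of a negative number has imaginary part π — an odd multiple of π — and exponentiating recovers the negative value:

```python
import numpy as np

# The principal complex logarithm of a negative incurred increment:
# log(-5) = log(5) + i*pi, so the imaginary part is an odd multiple of pi
z = np.log(np.complex128(-5.0))
real_ok = np.isclose(z.real, np.log(5.0))
imag_ok = np.isclose(z.imag, np.pi)

# Exponentiating recovers the negative value, since e^z = e^x * e^{i*pi} = -e^x
back_ok = np.isclose(np.exp(z), -5.0)
```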

The complex version of the linear model in Section 12 showed us that conjugates of observations

are themselves observations and that conjugates of estimators are estimators of conjugates.

Moreover, there we found a use for augmented variance. Nonetheless, we are still fairly bound to our conclusion in Section 3: one who lacked either the confidence or the software to work with complex numbers could probably do a work-around with double-real matrices.

So how can actuarial science benefit from complex random variables? The great benefit will come

from new ways of thinking. The first step will be to overcome the habit of picturing a complex

number as half real and half imaginary. Historically, it was only after numbers had expanded from

rational to irrational that the whole set was called “real.” Numbers ultimately are sets; zero is just

the empty set. How real are sets? Regardless of their mathematical reality, they are not physically

real. Complex numbers were deemed “real” because mathematicians needed them for the solution

of polynomial equations. In the nineteenth century this spurred the development of abstract

algebra. At first new ways of thinking amount to differences in degree; at some point many develop

21 If $e^a = 0$ for some a, then for all z, $e^z = e^{z-a+a} = e^{z-a}e^a = e^{z-a}\cdot 0 = 0$. One who sees that $\lim_{x\to-\infty} e^x \cdot e^{iy} = 0\cdot e^{iy} = 0$ might propose to add the ordinate $\operatorname{Re}(z) = x = -\infty$ to the complex plane. But not only is this proposal artificial; it also militates against the standard theory of complex variables, according to which all points infinitely far from zero constitute one and the same point at infinity.


into differences in kind. One might argue, “Why study Euclidean geometry? It all derives from a

few axioms.” True, but great theorems (e.g., that the sum of the angles of a triangle is the sum of

two right angles) can be a long way from their axioms. A theorem means more than the course of

its proof; often there are many proofs of a theorem. Furthermore, mathematicians often work

backwards from accepted or desired truths to efficient and elegant sets of axioms. Perhaps the

most wonderful thing about mathematics is its “unreasonable effectiveness in the natural sciences,”

to quote physicist Eugene Wigner. The causality between pure and applied mathematics works in

both directions. Therefore, it is likely that complex random variables and vectors will find their way

into actuarial science. But it will take years, even decades, and technology and education will have to

prepare for it.

14. CONCLUSION
Just as physics divides into different areas, e.g., theoretical, experimental, and applied, so too

actuarial science, though perhaps more concentrated on business application, justifiably has and

needs a theoretical component. Theory and application cross-fertilize each other. In this paper we

have proposed to add complex numbers to the probability and statistics of actuarial theory. With

patience, the technically inclined actuary should be able to understand the theory of complex

random variables delineated herein. In fact, our multivariate approach may even more difficult to

understand than the complex-function theory; but both belong together. Although all complex

matrices and operations were formed from double-real counterparts, we believe that the “sum is

greater than the parts,” i.e., that the assimilation of this theory will lead to higher-order thinking and

creativity. In the sixteenth century the "fiction" of $i = \sqrt{-1}$ allowed mathematicians to solve more

equations. Although at first complex solutions were deemed “extraneous roots,” eventually their

practicality became recognized; so that today complex numbers are essential for science and


engineering. Applying complex numbers to probability has lagged; but even now it is part of signal

processing in electrical engineering. Knowing how rapidly science has developed with nuclear

physics, molecular biology, space exploration, and computers, who would dare to bet against the

usefulness of complex random variables to actuarial science by the mid-2030s, when many scientists

and futurists expect nuclear fusion to be harnessed and available for commercial purposes?


REFERENCES
[1.] Halliwell, Leigh J., "Conjoint Prediction of Paid and Incurred Losses," CAS Forum, Summer 1997, 241-380,
www.casact.org/pubs/forum/97sforum/97sf1241.pdf.

[2.] Halliwell, Leigh J., “Classifying the Tails of Loss Distributions,” CAS E-Forum, Spring 2013, Volume 2,
www.casact.org/pubs/forum/13spforumv2/Haliwell.pdf.

[3.] Havil, Julian, Gamma: Exploring Euler’s Constant, Princeton University Press, 2003.

[4.] Healy, M. J. R., Matrices for Statistics, Oxford, Clarendon Press, 1986.

[5.] Hogg, Robert V., and Stuart A. Klugman, Loss Distributions, New York, Wiley, 1984.

[6.] Johnson, Richard A., and Dean Wichern, Applied Multivariate Statistical Analysis (Third Edition), Englewood
Cliffs, NJ, Prentice Hall, 1992.

[7.] Judge, George G., Hill, R. C., et al., Introduction to the Theory and Practice of Econometrics (Second Edition), New
York, Wiley, 1988.

[8.] Klugman, Stuart A., et al., Loss Models: From Data to Decisions, New York, Wiley, 1998.

[9.] Meyers, Glenn, “The Skew Normal Distribution and Beyond,” Actuarial Review, May 2013, p. 15,
www.casact.org/newsletter/pdfUpload/ar/AR_May2013_1.pdf.

[10.] Million, Elizabeth, The Hadamard Product, 2007,
http://buzzard.ups.edu/courses/2007spring/projects/million-paper.pdf.

[11.] Schneider, Hans, and George Barker, Matrices and Linear Algebra, New York, Dover, 1973.

[12.] Veeravalli, V. V., Proper Complex Random Variables and Vectors, 2006,
http://courses.engr.illinois.edu/ece461/handouts/notes6.pdf.

[13.] Venter, G.G., "Credibility," Foundations of Casualty Actuarial Science (Third Edition), Casualty Actuarial Society,
1996.

[14.] Venter, G.G., "Tails of Copulas," PCAS LXXXIX, 2002, 68-113, www.casact.org/pubs/proceed/
proceed02/02068.pdf.

[15.] Weyl, Hermann, The Theory of Groups and Quantum Mechanics, New York, Dover, 1950 (ET by H. P. Robertson
from second German Edition 1930).

[16.] Wikipedia contributors, "Arcsine distribution," Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/Arcsine_distribution (accessed October 2014).

[17.] Wikipedia contributors, “Complex normal distribution,” Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/Complex_normal_distribution (accessed October 2014).

[18.] Wikipedia contributors, "Complex number," Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/Complex_number (accessed October 2014).


APPENDIX A

REAL-VALUED LOGNORMAL RANDOM VECTORS


Feeling that the treatment of lognormal random vectors in Section 8 would be too long, we have

decided to prepare for it in Appendices A and B. According to Section 7, the probability density

function of real-valued n×1 normal random vector x with mean µ and variance Σ is:


$$f_x(x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\; e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}$$

Therefore, $\int_{x\in\Re^n} f_x(x)\,dV = 1$. The single integral over $\Re^n$ represents an n-multiple integral over each $x_j$ from $-\infty$ to $+\infty$; $dV = dx_1\cdots dx_n$.

The moment generating function of x is $M_x(t) = E[e^{t'x}] = E\!\left[e^{\sum_{j=1}^n t_j x_j}\right]$, where t is a suitable n×1 vector. 22 Partial derivatives of the moment generating function evaluated at $t = 0_{n\times 1}$ equal moments of x, since:

$$\frac{\partial^{\,k_1+\cdots+k_n} M_x(t)}{\partial t_1^{k_1}\cdots \partial t_n^{k_n}}\bigg|_{t=0} = E\!\left[x_1^{k_1}\cdots x_n^{k_n}\, e^{t'x}\right]\Big|_{t=0} = E\!\left[x_1^{k_1}\cdots x_n^{k_n}\right]$$

But lognormal moments are values of the function itself. For example, if $t = e_j$, the jth unit vector, then $M_x(e_j) = E[e^{e_j'x}] = E[e^{X_j}]$. Likewise, $M_x(e_j + e_k) = E[e^{X_j}e^{X_k}]$. The moment generating function of x, if it exists, is the key to the moments of $e^x$.

22 All real-valued t vectors are suitable; Appendix B will extend the suitability to complex t.


The moment generating function of the real-valued multivariate normal x is:

$$\begin{aligned} M_x(t) &= E[e^{t'x}] \\ &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}\, e^{t'x}\,dV \\ &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}\{(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x\}}\,dV \end{aligned}$$

A multivariate "completion of the square" results in the identity:

$$(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x = (x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t$$

We leave it for the reader to verify. By substitution, we have:

$$\begin{aligned} M_x(t) &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}\{(x-\mu)'\Sigma^{-1}(x-\mu)-2t'x\}}\,dV \\ &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}\{(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t\}}\,dV \\ &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}\,dV \cdot e^{t'\mu + t'\Sigma t/2} \\ &= 1\cdot e^{t'\mu + t'\Sigma t/2} = e^{t'\mu + t'\Sigma t/2} \end{aligned}$$
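The completion-of-the-square identity left to the reader can be spot-checked numerically; the following numpy sketch (dimension, seed, and the construction of a PD Σ are illustrative) compares both sides at a random point:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3

# Random mean, PD variance, and points at which to test the identity
mu = rng.standard_normal(n)
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)          # symmetric PD by construction
Si = np.linalg.inv(Sigma)
t = rng.standard_normal(n)
x = rng.standard_normal(n)

# Left side: (x-mu)' Si (x-mu) - 2 t'x
lhs = (x - mu) @ Si @ (x - mu) - 2 * t @ x

# Right side after completing the square, with xi = mu + Sigma t
xi = mu + Sigma @ t
rhs = (x - xi) @ Si @ (x - xi) - 2 * t @ mu - t @ Sigma @ t
ok = np.isclose(lhs, rhs)
```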

The reduction of the integral to unity in the second-last line is due to the fact that $\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}$ is the probability density function of the real-valued n×1 normal random vector with mean $\mu+\Sigma t$ and variance Σ. This new mean is valid if it is real-valued, which will be so if t is real-valued. In fact, $\mu+\Sigma t$ is real-valued if and only if t is real-valued.


So the moment generating function of the real-valued normal multivariate $x \sim N(\mu, \Sigma)$ is $M_x(t) = e^{t'\mu + t'\Sigma t/2}$, which is valid at least for $t\in\Re^n$. As a check: 23

$$\frac{\partial M_x(t)}{\partial t} = (\mu + \Sigma t)\,e^{t'\mu + t'\Sigma t/2} \;\Rightarrow\; E[x] = \frac{\partial M_x(t)}{\partial t}\bigg|_{t=0} = \mu$$

And for the second derivative:

$$\frac{\partial^2 M_x(t)}{\partial t\,\partial t'} = \frac{\partial\,(\mu + \Sigma t)\,e^{t'\mu + t'\Sigma t/2}}{\partial t'} = \left[\Sigma + (\mu + \Sigma t)(\mu + \Sigma t)'\right] e^{t'\mu + t'\Sigma t/2}$$

$$\Rightarrow\; E[xx'] = \frac{\partial^2 M_x(t)}{\partial t\,\partial t'}\bigg|_{t=0} = \Sigma + \mu\mu' \;\Rightarrow\; \operatorname{Var}[x] = E[xx'] - \mu\mu' = \Sigma$$
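As a further check, a Monte Carlo estimate of $E[e^{t'x}]$ agrees with the closed form $e^{t'\mu + t'\Sigma t/2}$; in this numpy sketch the parameters, seed, and sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
draws = 200_000

mu = np.array([0.5, -0.3])
Sigma = np.array([[0.30, 0.10],
                  [0.10, 0.20]])
t = np.array([0.4, -0.6])

# Monte Carlo estimate of E[exp(t'x)] versus the closed form
x = rng.multivariate_normal(mu, Sigma, size=draws)
mc = np.exp(x @ t).mean()
exact = np.exp(t @ mu + t @ Sigma @ t / 2)
rel_err = abs(mc / exact - 1)
```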

The lognormal moments follow from the moment generating function:

$$E[e^{X_j}] = E[e^{e_j'x}] = M_x(e_j) = e^{e_j'\mu + e_j'\Sigma e_j/2} = e^{\mu_j + \Sigma_{jj}/2}$$

The second moments are conveniently expressed in terms of first moments:

$$\begin{aligned} E[e^{X_j}e^{X_k}] &= E\!\left[e^{(e_j+e_k)'x}\right] \\ &= e^{(e_j+e_k)'\mu + (e_j+e_k)'\Sigma(e_j+e_k)/2} \\ &= e^{\mu_j+\mu_k+(\Sigma_{jj}+\Sigma_{jk}+\Sigma_{kj}+\Sigma_{kk})/2} \\ &= e^{\mu_j+\Sigma_{jj}/2}\cdot e^{\mu_k+\Sigma_{kk}/2}\cdot e^{(\Sigma_{jk}+\Sigma_{kj})/2} \\ &= e^{\mu_j+\Sigma_{jj}/2}\cdot e^{\mu_k+\Sigma_{kk}/2}\cdot e^{(\Sigma_{jk}+\Sigma_{jk})/2} \\ &= E[e^{X_j}]\,E[e^{X_k}]\cdot e^{\Sigma_{jk}} \end{aligned}$$

23 The vector formulation of partial differentiation is explained in Appendix A.17 of Judge [1988].


So, $\operatorname{Cov}[e^{X_j}, e^{X_k}] = E[e^{X_j}e^{X_k}] - E[e^{X_j}]E[e^{X_k}] = E[e^{X_j}]E[e^{X_k}]\left(e^{\Sigma_{jk}}-1\right)$, which is the multivariate equivalent of the well-known scalar formula $CV^2[e^X] = \operatorname{Var}[e^X]\big/E[e^X]^2 = e^{\sigma^2}-1$. Letting $E[e^x]$ denote the n×1 vector whose jth element is $E[e^{X_j}]$, 24 and $\operatorname{diag}(E[e^x])$ its n×n diagonalization, we have $\operatorname{Var}[e^x] = \operatorname{diag}(E[e^x])\,\{e^{\Sigma}-1_{n\times n}\}\,\operatorname{diag}(E[e^x])$. Because $\operatorname{diag}(E[e^x])$ is diagonal in positive elements (hence, symmetric and PD), $\operatorname{Var}[e^x]$ is NND [or PD] if and only if $e^{\Sigma}-1_{n\times n}$ is NND [or PD]. 25 Symmetry is no issue here, because for real-valued matrices, Σ is symmetric if and only if $e^{\Sigma}-1_{n\times n}$ is symmetric.
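The sandwich formula for $\operatorname{Var}[e^x]$ can be verified by simulation. In this numpy sketch (parameters, seed, and sample size illustrative), the simulated variance of $w = e^x$ matches $\operatorname{diag}(E[e^x])\,(e^\Sigma - 1)\,\operatorname{diag}(E[e^x])$:

```python
import numpy as np

rng = np.random.default_rng(9)
draws = 400_000

mu = np.array([0.1, -0.2])
Sigma = np.array([[0.25, -0.10],
                  [-0.10, 0.20]])

# Closed form: Var[e^x] = diag(E[e^x]) (e^Sigma - 1) diag(E[e^x])
mean_w = np.exp(mu + np.diag(Sigma) / 2)
var_w = np.diag(mean_w) @ (np.exp(Sigma) - 1) @ np.diag(mean_w)

# Monte Carlo check on the lognormal vector w = e^x
w = np.exp(rng.multivariate_normal(mu, Sigma, size=draws))
var_mc = np.cov(w, rowvar=False)
max_rel_err = np.max(np.abs(var_mc - var_w) / np.abs(var_w))
```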

The relation between Σ and $\mathrm{T} = e^{\Sigma} - 1_{n\times n}$ merits a discussion whose result will be clear from a consideration of 2×2 matrices. Since $\Sigma_{2\times 2}$ is symmetric, it is defined in terms of three real numbers: $\Sigma = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$. Now Σ is NND if and only if 1) $a \ge 0$, 2) $d \ge 0$, and 3) $ad - b^2 \ge 0$. Σ is PD if and only if these three inequalities hold strictly. If a or d is zero, by the third condition b also must be zero. 26 Since we are not interested in degenerate random variables, which are effectively constants, we will require a and d to be positive. With this requirement, Σ is NND if and only if $b^2 \le ad$, and PD if and only if $b^2 < ad$. Since a and d are positive, so too is ad, as well as the geometric mean $\gamma = \sqrt{ad}$. So Σ is NND if and only if $-\gamma \le b \le \gamma$, and PD if and only if $-\gamma < b < \gamma$. It is well known that $\min(a,d) \le \gamma \le \frac{a+d}{2} \le \max(a,d)$, with equality if and only if $a = d$.

24 This would follow naturally from the "elementwise" interpretation of $e^A$, i.e., that the exponential function of matrix A is the matrix of the exponential functions of the elements of A. But if A is a square matrix, $e^A$ may also have the "matrix" interpretation $I_n + \sum_{j=1}^{\infty} A^j/j!$.

25 PD [positive-definite] and NND [non-negative-definite] are defined in Section 4.

26 Since NND matrices represent variances, $\Sigma = \operatorname{Var}\!\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} \operatorname{Cov}[X_1,X_1] & \operatorname{Cov}[X_1,X_2] \\ \operatorname{Cov}[X_2,X_1] & \operatorname{Cov}[X_2,X_2] \end{bmatrix} = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$. That $a = 0$ or $d = 0$ implies $b = 0$ means that a random variable cannot covary with another random variable unless it covaries with itself.

Now the same three conditions determine the definiteness of $\mathrm{T} = e^{\Sigma} - 1_{2\times 2} = \begin{bmatrix} e^a - 1 & e^b - 1 \\ e^b - 1 & e^d - 1 \end{bmatrix}$. Since we required a and d to be positive, both $e^a - 1$ and $e^d - 1$ are positive. This leaves the definiteness of Τ dependent on the relation between $(e^b-1)(e^b-1)$ and $(e^a-1)(e^d-1)$. We will next examine this relation according to the three cases $b = 0$, $b > 0$, and $b < 0$, all of which must be subject to $-\gamma \le b \le \gamma$.

First, if $b = 0$, then $-\gamma < b < \gamma$ and Σ is PD. Furthermore, $(e^b-1)(e^b-1) = 0 < (e^a-1)(e^d-1)$. Therefore, in this case, the lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is from PD to PD. And zero covariance in the normal pair produces zero covariance in the lognormal pair. In fact, since zero covariance between normal bivariates implies independence (cf. §2.5.7 of Judge [1988]), the lognormal bivariates also are independent.

In the second case, $b > 0$, or more fully, $0 < b \le \gamma$. Σ is PD if and only if $b < \gamma$. Define the function $\varphi(x) = \ln\!\left((e^x-1)/x\right)$ for positive real x (or $x\in\Re^+$). A graph will show that the function strictly increases, i.e., $\varphi(x_1) < \varphi(x_2)$ if and only if $x_1 < x_2$. Moreover, the function is concave upward. This means that the line segment between points $(x_1, \varphi(x_1))$ and $(x_2, \varphi(x_2))$ lies above the curve $\varphi(x)$ for intermediate values of x.


In particular, $\frac{\varphi(x_1)+\varphi(x_2)}{2} \ge \varphi\!\left(\frac{x_1+x_2}{2}\right)$. Equivalently, for all $x_1, x_2 \in \Re^+$, $2\varphi\!\left(\frac{x_1+x_2}{2}\right) \le \varphi(x_1)+\varphi(x_2)$, with equality if and only if $x_1 = x_2$. Therefore, since a and d are positive, $2\varphi\!\left(\frac{a+d}{2}\right) \le \varphi(a)+\varphi(d)$. And since $0 < \gamma = \sqrt{ad} \le \frac{a+d}{2}$, $2\varphi(\gamma) \le 2\varphi\!\left(\frac{a+d}{2}\right) \le \varphi(a)+\varphi(d)$. So $2\varphi(\gamma) \le \varphi(a)+\varphi(d)$, with equality if and only if $a = d$. Furthermore, since in this case $0 < b \le \gamma$, $2\varphi(b) \le 2\varphi(\gamma) \le \varphi(a)+\varphi(d)$. Hence, $2\varphi(b) \le \varphi(a)+\varphi(d)$. The equality prevails if and only if $b = \gamma = a = d$, or if and only if $a = b = d$.

If $a = b = d$ then Σ is not PD; otherwise Σ is PD. Hence:

$$2\ln\!\left(\frac{e^b-1}{b}\right) = 2\varphi(b) \le \varphi(a)+\varphi(d) = \ln\!\left(\frac{e^a-1}{a}\right)+\ln\!\left(\frac{e^d-1}{d}\right)$$

The inequality is preserved by exponentiation:

$$\left(\frac{e^b-1}{b}\right)^2 \le \left(\frac{e^a-1}{a}\right)\left(\frac{e^d-1}{d}\right)$$

This leads at last to the inequality:

$$\left(e^b-1\right)^2 = b^2\left(\frac{e^b-1}{b}\right)^2 \le \frac{b^2}{ad}\left(e^a-1\right)\left(e^d-1\right) = \left(\frac{b}{\gamma}\right)^2\left(e^a-1\right)\left(e^d-1\right) \le 1\cdot\left(e^a-1\right)\left(e^d-1\right)$$

Therefore, in this case $(e^b-1)^2 \le (e^a-1)(e^d-1)$, with equality if and only if $a = b = d$. This means that if $b > 0$, the lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is from NND to NND. But Τ is merely NND only if $(e^b-1)^2 = (e^a-1)(e^d-1)$, or only if $a = b = d$. Otherwise, Τ is PD. So, when $b > 0$, $\mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is PD except when all four elements of Σ have the same positive value. Even the NND matrix $\Sigma = \begin{bmatrix} a & \sqrt{ad} \\ \sqrt{ad} & d \end{bmatrix}$ log-transforms into a PD matrix, unless $a = d$. So all PD and most NND normal variances transform into PD lognormal variances. A NND lognormal variance indicates that at least one element of the normal random vector is duplicated.

In the third and final case, $b < 0$, or more fully, $-\gamma \le b < 0$. This is equivalent to $0 < -b \le \gamma$, or to the second case with $-b$. In that case, $(e^{-b}-1)^2 \le (e^a-1)(e^d-1)$, with equality if and only if $a = -b = d$. But from this, as well as from the fact that $0 < e^{2b} < e^{2\cdot 0} = 1$, it follows:

$$\left(e^b-1\right)^2 = e^{2b}\left(1-e^{-b}\right)^2 = e^{2b}\left(e^{-b}-1\right)^2 \le e^{2b}\left(e^a-1\right)\left(e^d-1\right) < 1\cdot\left(e^a-1\right)\left(e^d-1\right)$$

So in this case the inequality is strict: $(e^b-1)^2 < (e^a-1)(e^d-1)$, and Τ is PD. Therefore, if $b < 0$, the lognormal transform $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is PD, even if Σ is NND.

To summarize, the lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is PD if Σ is PD. Even when Σ is not PD, but merely NND, Τ is almost always PD. Only when Σ is so NND as to conceal a random-variable duplication is its lognormal transformation NND.
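A numerical illustration of this summary (numpy; the values $a = 1$, $d = 2$ are illustrative): a singular, merely NND normal variance with $a \ne d$ log-transforms into a strictly PD lognormal variance:

```python
import numpy as np

# An NND (singular) 2x2 normal variance with a != d
a, d = 1.0, 2.0
b = np.sqrt(a * d)                     # b = gamma, so ad - b^2 = 0
Sigma = np.array([[a, b], [b, d]])
T = np.exp(Sigma) - 1                  # elementwise lognormal transform

sigma_min_eig = np.min(np.linalg.eigvalsh(Sigma))   # ~0: merely NND
T_min_eig = np.min(np.linalg.eigvalsh(T))           # strictly positive: PD
ok = np.isclose(sigma_min_eig, 0) and T_min_eig > 0
```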

The Hadamard (elementwise) product and Schur's product theorem allow for an understanding of the general lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{n\times n}$. Denoting the elementwise jth power of Σ as $\Sigma^{\circ j} = \underbrace{\Sigma \circ \cdots \circ \Sigma}_{j\ \text{factors}}$, we can express elementwise exponentiation as $e^{\Sigma} = \sum_{j=0}^{\infty} \Sigma^{\circ j}/j!$. So $\mathrm{T} = e^{\Sigma} - 1_{n\times n} = \sum_{j=1}^{\infty} \Sigma^{\circ j}/j!$. According to Schur's theorem (§3 of Million [2007]), the Hadamard product of two NND matrices is NND. 27 Since Σ is NND, its powers $\Sigma^{\circ j}$ are NND, as well as the terms $\Sigma^{\circ j}/j!$. Being the sum of a countable number of NND matrices, Τ also must be NND. 28 But if just one of the terms of the sum is PD, the sum itself must be PD. Therefore, if Σ is PD, then Τ also is PD.
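Schur's product theorem can be illustrated numerically; in this numpy sketch (ranks and seed illustrative), the Hadamard product of two NND matrices has no negative eigenvalues beyond rounding error:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 4

def random_nnd(rng, n, r):
    """Random NND matrix of rank at most r via a Gram product."""
    A = rng.standard_normal((n, r))
    return A @ A.T

U = random_nnd(rng, n, 2)      # rank-deficient, merely NND
V = random_nnd(rng, n, 3)

hadamard = U * V               # elementwise (Hadamard) product
min_eig = np.min(np.linalg.eigvalsh(hadamard))
ok = min_eig > -1e-10          # NND up to rounding
```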

Now the kernel of m×n matrix A is the set of all $x\in\Re^n$ such that $Ax = 0$, or $\ker(A) = \{x : Ax = 0\}$. The kernel is a linear subspace of $\Re^n$, and its dimensionality is $n - \operatorname{rank}(A)$. By the Cholesky decomposition the NND matrix U can be factored as $U_{n\times n} = W'W_{n\times n}$. The quadratic form in U is $x'Ux = x'W'Wx = (Wx)'(Wx)$. If $x'Ux = 0$, then $Wx = 0_{n\times 1}$, and $Ux = W'Wx = W'0_{n\times 1} = 0_{n\times 1}$. Conversely, if $Ux = 0_{n\times 1}$, then $x'Ux = 0$. So the kernel of NND matrix U is precisely the solution set of $x'Ux = 0$, i.e., $x'Ux = 0$ if and only if $x\in\ker(U)$. Therefore, the kernel of $\mathrm{T} = \sum_{j=1}^{\infty} \Sigma^{\circ j}/j!$ is the intersection of the kernels of the $\Sigma^{\circ j}$, or $\ker(\mathrm{T}) = \bigcap_{j=1}^{\infty} \ker(\Sigma^{\circ j})$. It is possible for this intersection to be of dimension zero, i.e., for it to equal $\{0_{n\times 1}\}$, even though the kernel of no $\Sigma^{\circ j}$ is. Because of the accumulation of intersections in $\sum_{j=1}^{\infty} \Sigma^{\circ j}/j!$, the lognormal transformation of NND matrix Σ tends to be "more PD" than Σ itself.

27 Appendix C provides a proof of this theorem.

28 For the quadratic form of a sum equals the sum of the quadratic forms: $x'\left(\sum_k U_k\right)x = \sum_k x'U_k x$. In fact, if the coefficients $a_j$ are non-negative, then $\sum_{j=1}^{\infty} a_j \Sigma^{\circ j}$ is NND, provided that the sum converges, as $e^{\Sigma}$ does.


We showed above that if Σ is PD, then $\mathrm{T} = e^{\Sigma} - 1_{n\times n}$ is PD. But even if Σ is NND, Τ will be PD, unless Σ contains a 2×2 subvariance $\begin{bmatrix} \Sigma_{jj} & \Sigma_{jk} \\ \Sigma_{kj} & \Sigma_{kk} \end{bmatrix}$ whose four elements are all equal.


APPENDIX B

THE NORMAL MOMENT GENERATING FUNCTION AND COMPLEX ARGUMENTS
In Appendix A we saw that the moment generating function of the real-valued normal multivariate $x \sim N(\mu, \Sigma)$, viz., $M_x(t) = e^{t'\mu + t'\Sigma t/2}$, is valid at least for $t\in\Re^n$. The validity rests on the identity $\int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}\,dV = 1$ for real-valued $\xi = \mu+\Sigma t$. But in Section 8 we must know the value of $\varphi(\xi) = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV$ when ξ is a complex n×1 vector. So in this appendix, we will prove that for all $\xi\in C^n$, $\varphi(\xi) = 1$.

The proof begins with diagonalization. Since Σ is symmetric and PD, $\Sigma^{-1}$ exists and is symmetric and PD. According to the Cholesky decomposition (Healy [1986, §7.2]), there exists a real-valued n×n matrix W such that $W'W = \Sigma^{-1}$. Due to theorems on matrix rank, W must be non-singular, or invertible. So the transformation $y = Wx$ is one-to-one. And letting $\zeta = W\xi$, we have $y - \zeta = W(x-\xi)$. Moreover, the volume element in the y coordinates is:

$$dV_y = |W|\,dV_x = \sqrt{|W|^2}\,dV_x = \sqrt{|W'|\,|W|}\,dV_x = \sqrt{|W'W|}\,dV_x = \sqrt{|\Sigma^{-1}|}\,dV_x = \frac{dV_x}{\sqrt{|\Sigma|}}$$

Hence:


$$\begin{aligned}
\varphi(\xi) &= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV_x \\
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-\xi)'W'W(x-\xi)}\,dV_x \\
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n}}\,e^{-\frac{1}{2}[W(x-\xi)]'[W(x-\xi)]}\,\frac{dV_x}{\sqrt{|\Sigma|}} \\
&= \int_{y\in\Re^n}\frac{1}{\sqrt{(2\pi)^n}}\,e^{-\frac{1}{2}(y-\zeta)'(y-\zeta)}\,dV_y \\
&= \int_{y\in\Re^n}\frac{1}{\sqrt{(2\pi)^n}}\,e^{-\frac{1}{2}\sum_{j=1}^n (y_j-\zeta_j)^2}\,dV \\
&= \int_{y\in\Re^n}\prod_{j=1}^n \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(y_j-\zeta_j)^2}\,dV \\
&= \prod_{j=1}^n \int_{y_j=-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(y_j-\zeta_j)^2}\,dy_j \\
&= \prod_{j=1}^n \psi(\zeta_j)
\end{aligned}$$

In the last line, $\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{x=a}^{b}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-\zeta)^2}\,dx$. Obviously, if ζ is real-valued, $\psi(\zeta) = 1$. So the issue of the value of a moment generating function of a complex variable resolves into the issue of the "total probability" of a unit-variance normal random variable with a complex mean. 29

To evaluate $\psi(\zeta)$ requires some complex analysis with contour integrals. First, consider the standard-normal density function with a complex argument: $f(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$. By function-composition rules, since both $z^2$ and the exponential function are "entire" functions (i.e., analytic over the whole complex plane), so too is $f(z)$. Therefore, $\oint_C f(z)\,dz = 0$ for any closed contour C (cf. Appendix D.7 of Havil [2003]). Let C be the parallelogram traced from vertex $z = b$ to vertex $z = a$ to vertex $z = a-\zeta$ to vertex $z = b-\zeta$ and finally back to vertex $z = b$. Therefore:

$$0 = \oint_C f(z)\,dz = \int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz$$

29 We deliberately put 'total probability' in quotes because the probability density function with complex ζ is not proper; it may produce negative and even complex values for probability densities.

The line segments along which the second and fourth integrals proceed are finite; their common length is $L = |(a-\zeta)-a| = |-\zeta| = |\zeta| = |b-(b-\zeta)|$, where $|\zeta| = \sqrt{\bar{\zeta}\zeta} \ge 0$. By the triangle inequality, $\left|\int_a^{a-\zeta} f(z)\,dz\right| \le \int_a^{a-\zeta} |f(z)|\,|dz|$. But $|f(z)|$ is a continuous real-valued function, so over a closed segment it must be upper-bounded by some positive real number M. Hence,

$$\left|\int_a^{a-\zeta} f(z)\,dz\right| \le \int_a^{a-\zeta} |f(z)|\,|dz| \le \operatorname{Sup}\big(|f(z\in[a,\,a-\zeta])|\big)\int_a^{a-\zeta} |dz| = M(a)\cdot L$$

Likewise, $\left|\int_{b-\zeta}^{b} f(z)\,dz\right| \le \operatorname{Sup}\big(|f(z\in[b,\,b-\zeta])|\big)\int_{b-\zeta}^{b} |dz| = M(b)\cdot L$.

Now, in general:

$$|f(z)| = \sqrt{f(z)\,\overline{f(z)}} = \sqrt{\frac{1}{\sqrt{2\pi}}\,e^{-\frac{z^2}{2}}\cdot\frac{1}{\sqrt{2\pi}}\,e^{-\frac{\bar{z}^2}{2}}} = \sqrt{\frac{e^{-\frac{z^2}{2}}\,e^{-\frac{\bar{z}^2}{2}}}{2\pi}} \propto \sqrt{e^{-\frac{z^2+\bar{z}^2}{2}}} = e^{-\frac{z^2+\bar{z}^2}{4}} = e^{-\operatorname{Re}(z^2)/2}$$

Therefore, since $a\in\Re$:

$$\lim_{a\to-\infty} M(a)\cdot L \propto \lim_{a\to-\infty}\operatorname{Sup}\!\left(e^{-\operatorname{Re}(z^2)/2};\; z\in[a,\,a-\zeta]\right)\cdot L = \lim_{a^2\to+\infty}\operatorname{Sup}\!\left(e^{-a^2\operatorname{Re}\left((z/a)^2\right)/2};\; \frac{z}{a}\in\left[1,\,1-\frac{\zeta}{a}\right]\right)\cdot L = 0$$

since as $a\to-\infty$, $z/a\to 1$ and the exponent tends to $-\infty\cdot\operatorname{Re}(1) = -\infty$.

Similarly, $\lim_{b\to+\infty} M(b)\cdot L = 0$. So in the limit as $a\to-\infty$ and $b\to+\infty$ on the real axis, the second and fourth integrals approach zero. Accordingly:

$$0 = \lim_{\substack{a\to-\infty \\ b\to+\infty}}\left\{\int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz\right\} = \lim_{\substack{a\to-\infty \\ b\to+\infty}}\left\{\int_b^a f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz\right\}$$

And so, $\lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{a-\zeta}^{b-\zeta} f(z)\,dz = -\lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_b^a f(z)\,dz = \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_a^b f(z)\,dz = \int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz = 1$.


So, at length, substituting $x = z + \zeta$:

$$\begin{aligned}
\psi(\zeta) &= \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{x=a}^{b}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-\zeta)^2}\,dx \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{x=a}^{b}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}((z+\zeta)-\zeta)^2}\,d(z+\zeta) \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{z=a-\zeta}^{b-\zeta}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,d(z+\zeta) \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{z=a-\zeta}^{b-\zeta}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{z=a-\zeta}^{b-\zeta} f(z)\,dz = 1
\end{aligned}$$

So, working backwards, what we proved for one dimension, viz., $\psi(\zeta) = \int_{x=-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-\zeta)^2}\,dx = 1$, applies n-dimensionally: for all $\xi\in C^n$, $\varphi(\xi) = \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV = 1$. Therefore, even for complex t, $M_x(t) = e^{t'\mu + t'\Sigma t/2}$. Complex values are allowable as arguments in the moment generating function of a real-valued normal vector. This result is critical to Section 8.
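The conclusion $\psi(\zeta) = 1$ for complex ζ can also be checked by direct numerical integration; in this numpy Riemann-sum sketch, the grid limits, spacing, and the choice $\zeta = 1 + 2i$ are illustrative:

```python
import numpy as np

# Riemann-sum check that the "total probability" equals 1 even for a
# complex mean zeta (the integrand is complex-valued along the real axis)
zeta = 1.0 + 2.0j
x = np.linspace(-30.0, 30.0, 600_001)
dx = x[1] - x[0]
integrand = np.exp(-0.5 * (x - zeta) ** 2) / np.sqrt(2 * np.pi)
psi = (integrand * dx).sum()
ok = abs(psi - 1) < 1e-6
```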

Though we believe the contour-integral proof above to be worthwhile for its instructional value, a

simple proof comes from the powerful theorem of analytic continuation (cf. Appendix D.12 of

Havil [2003]). This theorem concerns two functions that are analytic within a common domain. If

the functions are equal over any smooth curve within the domain, no matter how short, 30 then they

30 The length of the curve must be positive; equality at single points, or punctuated equality, does not qualify.


are equal over all the domain. Now $\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}}\int_{x=a}^{b}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-\zeta)^2}\,dx$ is analytic over all the complex plane. And for all real-valued ζ, $\psi(\zeta) = 1$. So $\psi(\zeta)$ and $f(\zeta) = 1$ are two functions analytic over the complex plane and identical on the real axis. Therefore, by analytic continuation $\psi(\zeta)$ must equal one for all complex ζ. Analytic continuation is analogous with the theorem in real analysis that two smooth functions equal over any interval are equal everywhere. Analytic continuation derives from the fact that a complex derivative is the same in all directions. It is mistaken to regard the real and imaginary parts of the derivative as partial derivatives, as if they applied respectively to the real and imaginary axes of the independent variable. Rather, the whole derivative applies in every direction.


APPENDIX C

EIGEN-DECOMPOSITION AND SCHUR’S PRODUCT THEOREM


Appendix A quoted Schur’s Product Theorem, viz., that the Hadamard product of non-negative-

definite (NND) matrices is NND. Million [2007] proves it as Theorem 3.4; however, we believe our

proof in this appendix to be simpler; moreover, it affords a review of eigen-decomposition. Those

familiar with eigen-decomposition may skip to the last paragraph.

Let Γ be an n×n Hermitian NND matrix. As explained in Section 4, ‘Hermitian’ means that
Γ = Γ*; ‘NND’ means that for every complex n×1 vector z (or z ∈ C^n), z*Γz ≥ 0. Positive
definiteness [PD] is a stricter condition, in which z*Γz = 0 if and only if z = 0_{n×1}.

Complex scalar λ and non-zero vector ν form an “eigenvalue-eigenvector” pair of Γ if Γν = λν.
Since ν = 0_{n×1} is excluded as a trivial solution, vector ν can be scaled to unity, or ν*ν = 1. But
Γν = λν if and only if (Γ − λI_n)ν = 0_{n×1}. If Γ − λI_n is non-singular, or invertible, then:

ν = I_n ν = (Γ − λI_n)^{−1}(Γ − λI_n)ν = (Γ − λI_n)^{−1} 0_{n×1} = 0_{n×1}

Hence, allowable eigenvectors require Γ − λI_n to be singular, i.e., its determinant |Γ − λI_n| to
be zero. Since the determinant is an nth-degree polynomial in λ (with complex coefficients based on the
elements of Γ), the equation |Γ − λI_n| = 0 has n root values of λ, not necessarily distinct. So the
determinant can be factored as f(λ) = |Γ − λI_n| = (λ_1 − λ)⋯(λ_n − λ). Since f(λ_j) = |Γ − λ_j I_n| = 0,
there exist non-zero solutions to (Γ − λ_j I_n)ν = 0_{n×1}. So for every eigenvalue there is a non-zero
eigenvector, even a non-zero eigenvector of unit magnitude.
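As a numerical aside (ours, not the paper's), the agreement between the characteristic-equation roots and a standard eigensolver can be checked with numpy; the construction Γ = AA* below is a hypothetical example of a Hermitian NND matrix.

```python
import numpy as np

# Sketch (our own, with a hypothetical Gamma = A A*): the eigenvalues of a
# Hermitian NND matrix are the n roots, counted with multiplicity, of the
# characteristic equation det(Gamma - lambda*I) = 0, and they are non-negative.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Gamma = A @ A.conj().T                     # Hermitian NND by construction

coeffs = np.poly(Gamma)                    # characteristic-polynomial coefficients
roots = np.sort(np.roots(coeffs).real)     # roots are real for Hermitian Gamma
eigs = np.sort(np.linalg.eigvalsh(Gamma))  # eigenvalues from a standard solver
print(np.allclose(roots, eigs, atol=1e-6))
print((eigs >= -1e-10).all())
```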


The first important result is that the eigenvalues of Γ are real-valued and non-negative. Consider the

jth eigenvalue-eigenvector pair, which satisfies the equation Γν_j = λ_j ν_j. Therefore,
ν_j* Γν_j = λ_j ν_j* ν_j. Since Γ is NND, ν_j* Γν_j is real-valued and non-negative. Also, ν_j* ν_j is
real-valued and positive. Therefore, their quotient λ_j is a real-valued and non-negative scalar.
Furthermore, if Γ is PD, ν_j* Γν_j is positive, as is λ_j.

The second important result is that eigenvectors paired with unequal eigenvalues are orthogonal.

Let the two unequal eigenvalues be λ_j ≠ λ_k. Because the eigenvalues are real-valued, λ̄_j = λ_j. The
eigenvector equations are Γν_j = λ_j ν_j and Γν_k = λ_k ν_k. The following string of equations relies on
Γ’s being Hermitian (so Γ = Γ*):

(λ_j − λ_k) ν_j* ν_k = λ_j ν_j* ν_k − λ_k ν_j* ν_k
                     = λ̄_j ν_j* ν_k − λ_k ν_j* ν_k
                     = (λ_j ν_k* ν_j)* − λ_k ν_j* ν_k
                     = (ν_k* Γν_j)* − ν_j* Γν_k
                     = ν_j* Γ* ν_k − ν_j* Γν_k
                     = ν_j* Γν_k − ν_j* Γν_k
                     = 0

Because λ_j − λ_k ≠ 0, the eigenvectors must be orthogonal, or ν_j* ν_k = 0. If all the eigenvalues are
distinct, the eigenvectors form an orthonormal basis of C^n. But even if not, the kernel of each
eigenvalue, or ker(Γ − λ_j I_n) = {z ∈ C^n : Γz = λ_j z}, is a linear subspace of C^n whose rank or
dimension equals the multiplicity of the root λ_j in the characteristic equation f(λ) = |Γ − λI_n| = 0.

This means that the number of mutually orthogonal eigenvectors paired with an eigenvalue equals


how many times that eigenvalue is a root of the characteristic equation. Consequently, there exist n
eigenvalue-eigenvector pairs (λ_j, ν_j) such that Γν_j = λ_j ν_j and ν_j* ν_k = δ_jk.31
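The existence of n orthonormal eigenvectors is easy to observe numerically. The sketch below is ours, not the paper's; Γ = AA* is a hypothetical Hermitian NND matrix, and numpy's `eigh` returns the eigenvectors as columns.

```python
import numpy as np

# Sketch (ours): a Hermitian matrix admits n orthonormal eigenvectors,
# nu_j* nu_k = delta_jk; numpy's eigh returns them as the columns of W.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Gamma = A @ A.conj().T                  # hypothetical Hermitian NND matrix
lam, W = np.linalg.eigh(Gamma)          # eigenvalues and eigenvector columns
gram = W.conj().T @ W                   # (j, k) entry equals nu_j* nu_k
print(np.allclose(gram, np.eye(4)))     # delta_jk pattern: W*W = I_n
print(np.allclose(Gamma @ W, W @ np.diag(lam)))  # Gamma nu_j = lam_j nu_j
```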

Now, define W as the partitioned matrix W_{n×n} = [ν_1 ⋯ ν_n]. The jkth element of W*W equals
ν_j* ν_k = δ_jk; hence, W*W = I_n. A matrix whose transjugate is its inverse is called “unitary,” as is
W.32 Furthermore, define Λ as the n×n diagonal matrix whose jjth element is λ_j. Then:

ΓW = Γ[ν_1 ⋯ ν_n] = [Γν_1 ⋯ Γν_n] = [λ_1 ν_1 ⋯ λ_n ν_n] = [ν_1 ⋯ ν_n] diag(λ_1, …, λ_n) = WΛ

And so, Γ = ΓI_n = ΓWW* = WΛW*, and Γ is said to be “diagonalized.” Thus have we shown,
assuming the theory of equations,33 the third important result, viz., that every NND Hermitian
matrix can be diagonalized. Other matrices too can be diagonalized; the NND [or PD] property consists in
the fact that all the eigenvalues of this diagonalization are non-negative [or positive].
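The diagonalization Γ = WΛW* can likewise be confirmed numerically. This is our own sketch with a hypothetical Γ = AA*, not an example from the paper.

```python
import numpy as np

# Sketch (ours): diagonalization Gamma = W Lambda W* with unitary W and
# non-negative diagonal Lambda, for a hypothetical NND matrix Gamma = A A*.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Gamma = A @ A.conj().T
lam, W = np.linalg.eigh(Gamma)
Lambda = np.diag(lam)
print(np.allclose(Gamma, W @ Lambda @ W.conj().T))  # Gamma = W Lambda W*
print((lam >= -1e-10).all())                        # NND: eigenvalues >= 0
```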

The fourth and final “eigen” result relies on the identity W*ν_j = e_j, which just extracts the jth
column of each side of W*W = I_n. As in Appendix A, e_j is the jth unit vector. Therefore:

31 The Kronecker delta, δ_jk, is the function IF(j = k, 1, 0).


32 To be precise, at this point W* is only the left-inverse of W. But by matrix-rank theorems, the rank of W equals n, so
W has a unique full inverse W^{−1}. Then W* = W*I_n = W*(WW^{−1}) = (W*W)W^{−1} = I_n W^{−1} = W^{−1}.
33 The theory of equations guarantees the existence of the roots of the nth-degree equation f(λ) = |Γ − λI_n| = 0.
Appendix D.9 of Havel [2003] contains a quick and lucid proof of this, the fundamental theorem of algebra.


Γ = WΛW*
  = W (∑_{j=1}^n λ_j e_j e_j*) W*
  = W (∑_{j=1}^n λ_j (W*ν_j)(W*ν_j)*) W*
  = W (∑_{j=1}^n λ_j W* ν_j ν_j* W) W*
  = WW* (∑_{j=1}^n λ_j ν_j ν_j*) WW*
  = I_n (∑_{j=1}^n λ_j ν_j ν_j*) I_n
  = ∑_{j=1}^n λ_j ν_j ν_j*
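The derivation above ends in a sum of rank-one matrices, which can be rebuilt numerically. This sketch is ours, with a hypothetical Γ = AA*; it is not from the paper.

```python
import numpy as np

# Sketch (ours): rebuild a hypothetical NND matrix Gamma from its spectral
# decomposition, Gamma = sum_j lam_j nu_j nu_j*.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Gamma = A @ A.conj().T
lam, W = np.linalg.eigh(Gamma)
recon = sum(lam[j] * np.outer(W[:, j], W[:, j].conj()) for j in range(3))
print(np.allclose(Gamma, recon))   # the rank-one sum reproduces Gamma
```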

The form ∑_{j=1}^n λ_j ν_j ν_j* is called the “spectral decomposition” of Γ (§7.4 of Healy [1986]), which plays
the leading role in the following succinct proof of Schur’s product theorem.

If Σ and Τ are two n×n Hermitian NND matrices, we may spectrally decompose them as
Σ = ∑_{j=1}^n λ_j ν_j ν_j* and Τ = ∑_{j=1}^n κ_j η_j η_j*, where all the λ and κ scalars are non-negative. Accordingly:


(Σ ∘ Τ)_{jk} = (Σ)_{jk} (Τ)_{jk}
             = (∑_{r=1}^n λ_r ν_r ν_r*)_{jk} (∑_{s=1}^n κ_s η_s η_s*)_{jk}
             = (∑_{r=1}^n λ_r (ν_r)_j (ν̄_r)_k) (∑_{s=1}^n κ_s (η_s)_j (η̄_s)_k)
             = ∑_{r=1}^n ∑_{s=1}^n λ_r κ_s (ν_r)_j (η_s)_j (ν̄_r)_k (η̄_s)_k
             = ∑_{r=1}^n ∑_{s=1}^n λ_r κ_s (ν_r ∘ η_s)_j (ν̄_r ∘ η̄_s)_k
             = ∑_{r=1}^n ∑_{s=1}^n λ_r κ_s ((ν_r ∘ η_s)(ν_r ∘ η_s)*)_{jk}
             = (∑_{r=1}^n ∑_{s=1}^n λ_r κ_s (ν_r ∘ η_s)(ν_r ∘ η_s)*)_{jk}

Hence, Σ ∘ Τ = ∑_{r=1}^n ∑_{s=1}^n λ_r κ_s (ν_r ∘ η_s)(ν_r ∘ η_s)*. Since each matrix (ν_r ∘ η_s)(ν_r ∘ η_s)* is Hermitian
NND, and each scalar λ_r κ_s is non-negative, Σ ∘ Τ must be Hermitian NND. Therefore, the
Hadamard product of NND matrices is NND.
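Schur's product theorem itself can be spot-checked numerically. The sketch below is our own illustration with hypothetical matrices Σ and Τ built as AA*, not part of the paper's proof.

```python
import numpy as np

# Sketch (ours): Schur's product theorem checked numerically -- the Hadamard
# (elementwise) product of two hypothetical Hermitian NND matrices is again
# Hermitian NND, i.e., all of its eigenvalues are non-negative.
rng = np.random.default_rng(4)

def random_nnd(n):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T              # A A* is Hermitian NND

Sigma, Tau = random_nnd(5), random_nnd(5)
H = Sigma * Tau                        # '*' on ndarrays is elementwise
print(np.allclose(H, H.conj().T))      # Hermitian
print(np.linalg.eigvalsh(H).min() >= -1e-10)   # non-negative spectrum
```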
