Complex Random Variables
Leigh J. Halliwell
Abstract: Rarely have casualty actuaries needed, much less wanted, to work with complex numbers. One readily
could wisecrack about imaginary dollars and creative accounting. However, complex numbers are well
established in mathematics; they even provided the impetus for abstract algebra. Moreover, they are essential in
several scientific fields, most notably in electromagnetism and quantum mechanics, the two fields to which most
of the sparse material about complex random variables is tied. This paper will introduce complex random
variables to an actuarial audience, arguing that complex random variables will eventually prove useful in the field
of actuarial science. First, it will describe the two ways in which statistical work with complex numbers differs
from that with real numbers, viz., in transjugation versus transposition and in rank versus dimension. Next, it
will introduce the mean and the variance of the complex random vector, and derive the distribution function of
the standard complex normal random vector. Then it will derive the general distribution of the complex normal
multivariate and discuss the behavior and moments of complex lognormal variables, a limiting case of which is
the unit-circle random variable W = e^{iΘ} for real Θ uniformly distributed. Finally, it will suggest several
foreseeable actuarial applications of the preceding theory, especially its application to linear statistical modeling.
Though the paper will be algebraically intense, it will require little knowledge of complex-function theory. But
some of that theory, viz., Cauchy’s theorem and analytic continuation, will arise in an appendix on the complex
moment generating function of a normal random multivariate.
Keywords: Complex numbers, matrices, and random vectors; augmented variance; lognormal and unit-circle
distributions; determinism; Cauchy-Riemann; analytic continuation
______________________________________________________________________________
1. INTRODUCTION
Even though their education has touched on algebra and calculus with complex numbers, most
casualty actuaries would be hard-pressed to cite an actuarial use for numbers of the form x + iy .
Their use in the discrete Fourier transformation (Klugman [1998], §4.7.1) is notable; however, many
would view this as a trick or convenience, rather than as indicating any further usefulness. In this
paper we will develop a probability theory for complex random variables and vectors, arguing that
such a theory will eventually find actuarial uses. The development, lengthy and sometimes arduous,
will take the following steps. Sections 2-4 will base complex matrices in certain real-valued matrices
called “double-real.” This serves the aim of our presentation, namely, to analogize from real-valued
random variables and vectors to complex ones. Transposition and dimension in the real-valued
realm become transjugation and rank in the complex. These differences figure into the standard
quadratic form of Section 5, where also the distribution of the standard complex normal random
vector is derived. Section 6 will elaborate on the variance of a complex random vector, as well as
introduce “augmented variance,” i.e., the variance of a dyad whose second part is the complex
conjugate of the first. Section 7 derives the formula for the distribution of the general complex
normal multivariate. Of special interest to many casualty actuaries should be the treatment of the
complex lognormal random vector in Section 8, an intuition into whose behavior Section 9 provides
on a univariate or scalar level. Even further simplification in the next two sections leads to the unit-
circle random variable, which is the only random variable with widespread deterministic effects. In
Section 12 we adapt the linear statistical model to complex multivariates. Finally, Section 13 lists
foreseeable applications of complex random variables. However, we believe their greatest benefit
resides not in their concrete applications, but rather in their fostering abstractions of thought and
imagination. Three appendices delve into mathematical issues too complicated for the body of the
paper. Those who work on an advanced level with lognormal random variables should read them.
course, X and Y also must be m×n. Since only square matrices have inverses, our purpose here
where $I_n$ is the n×n identity matrix. Because such an inverse must be unique, we may say that
First, define the conjugate of Z as $\bar Z = X - iY$. Since the conjugate of a product equals the product of the conjugates, $\bar Z\,\overline{Z^{-1}} = \overline{ZZ^{-1}} = \bar I_n = I_n$. Therefore, $\bar Z$ too is non-singular, and $\bar Z^{-1} = \overline{Z^{-1}}$. Moreover, if $Z = X + iY$ is non-singular, so too are $\pm Z$, $\pm iZ$, $\pm\bar Z$, and $\pm i\bar Z$, e.g., $-\bar Z = -X + iY$ and $-i\bar Z = -Y - iX$; invertibility is true for all eight or true for none. Invertibility is no respecter of the real
$$I_n = (X + iY)(A + iB) = XA + iYA + iXB + i^2 YB = XA + iYA + iXB - YB = (XA - YB) + i(YA + XB)$$
Matching real and imaginary parts, $XA - YB = I_n$ and $YA + XB = 0$; in double-real form:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} I_n \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} -B \\ A \end{bmatrix} = \begin{bmatrix} 0 \\ I_n \end{bmatrix}$$

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix} = I_{2n}$$

Therefore, $ZW = I_n$ if and only if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$. By a similar expansion of the last equality above, $WZ = I_n$ if and only if $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$. Hence, we conclude that the n×n complex matrix $Z = X + iY$ is non-singular, or has an inverse, if and only if the 2n×2n real-valued matrix $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is non-singular. Moreover, if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}^{-1}$ exists, it will have the form $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}$, and $Z^{-1}$ will equal $A + iB$.
The correspondence between an n×n complex matrix and its 2n×2n real-valued matrix suggests that with complex numbers one somehow gets “two for the price of one.” It even hints of a relation between the general m×n complex matrix $Z = X + iY$ and the 2m×2n real-valued matrix $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$. If X and Y are m×n real matrices, we will call $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ a
double-real matrix. The matrix is double in two senses; first, in that it involves two (same-sized and
real-valued) matrices X and Y, and second, in that its right half is redundant, or reproducible from
its left.
Complex addition is analogous with double-real addition:

$$X_1 + iY_1 + X_2 + iY_2 \;\Leftrightarrow\; \begin{bmatrix} X_1 & -Y_1 \\ Y_1 & X_1 \end{bmatrix} + \begin{bmatrix} X_2 & -Y_2 \\ Y_2 & X_2 \end{bmatrix}$$
And if $Z_1$ is m×n and $Z_2$ is n×p, so that the matrices are conformable to multiplication, then complex multiplication is analogous with double-real multiplication:

$$\begin{bmatrix} X_1 & -Y_1 \\ Y_1 & X_1 \end{bmatrix}\begin{bmatrix} X_2 & -Y_2 \\ Y_2 & X_2 \end{bmatrix} = \begin{bmatrix} X_1X_2 - Y_1Y_2 & -(X_1Y_2 + Y_1X_2) \\ X_1Y_2 + Y_1X_2 & X_1X_2 - Y_1Y_2 \end{bmatrix}$$
Rather trivial is the analogy between the m×n complex zero matrix and the 2m×2n double-real zero matrix $\begin{bmatrix} 0 & -0 \\ 0 & 0 \end{bmatrix}$, as well as that between the n×n complex identity matrix and the 2n×2n double-real identity matrix $\begin{bmatrix} I_n & -0 \\ 0 & I_n \end{bmatrix}$.
The general 2m×2n double-real matrix may itself be decomposed into quasi-real and quasi-imaginary parts: $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix}$. And in the case of square matrices ($m = n$) this extends to the form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} Y & 0 \\ 0 & Y \end{bmatrix}$, wherein the double-real matrix $\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}$ is analogous with the imaginary unit, inasmuch as:

$$\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}^2 = \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix} = \begin{bmatrix} -I_n & 0 \\ 0 & -I_n \end{bmatrix} = (-1)\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$$
Finally, one of the most important theorems of linear algebra is that every m×n complex matrix can be reduced by non-singular matrices to a rank canonical form. In symbols, for every Z there exist non-singular matrices U and V such that:

$$U_{m\times m}\, Z_{m\times n}\, V_{n\times n} = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m\times n}$$
The m×n real matrix on the right side of the equation consists entirely of zeroes except for r
instances of one along its main diagonal. Since invertible matrix operations can reposition the ones,
it is further stipulated that the ones appear as a block in the upper-left corner. Although many
reductions of Z to canonical form exist, the canonical forms themselves must all contain the same
number of ones, r, which is defined as the rank of Z. Providing the matrices with real and imaginary
parts, we have:
$$\begin{bmatrix} P & -Q \\ Q & P \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} R & -S \\ S & R \end{bmatrix} = \begin{bmatrix} A & -B \\ B & A \end{bmatrix}, \quad\text{where } A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m\times n} \text{ and } B = 0_{m\times n}$$
As shown in the previous section, $\begin{bmatrix} P & -Q \\ Q & P \end{bmatrix}$ is non-singular, or invertible, if and only if $U = P + iQ$ is non-singular; the same is true for $\begin{bmatrix} R & -S \\ S & R \end{bmatrix}$ and $V = R + iS$. Therefore, the rank of the double-real
analogue of a complex matrix is twice the rank of the complex matrix. Moreover, the 2r instances of
one correspond to r quasi-real and r quasi-imaginary instances. It is not possible for the
contribution to the rank of a matrix to be real without its being imaginary, and vice versa.
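The rank-doubling claim can be spot-checked numerically; a small numpy sketch (the matrix sizes and rank are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
# A 5-by-4 complex matrix of rank 2, built as a product of thin factors
U = rng.normal(size=(5, 2)) + 1j * rng.normal(size=(5, 2))
V = rng.normal(size=(2, 4)) + 1j * rng.normal(size=(2, 4))
Z = U @ V
D = np.block([[Z.real, -Z.imag], [Z.imag, Z.real]])  # double-real analogue

assert np.linalg.matrix_rank(Z) == 2
assert np.linalg.matrix_rank(D) == 4   # twice the complex rank
```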
To conclude this section, there are extensive analogies between complex and double-real matrices, analogies so extensive that one who lacked either the confidence or the software to work with complex matrices could work instead with their double-real analogues.

Let Σ represent the variance of the real-valued random vector x. Its jkth element represents the covariance of the jth element of x with the kth element. Since the covariance of two real-valued random variables is
symmetric, Σ must be a symmetric matrix. But a realistic Σ must have one other property, viz., non-
negative definiteness (NND). This means that for every real-valued n×1 vector ξ, ξ ′Σξ ≥ 0 . 3 This
must be true, because ξ ′Σξ is the variance of the real-valued random variable ξ ′x :
$$\mathrm{Var}[\xi'x] = \mathrm E\left[(\xi'x - \xi'\mu)(\xi'x - \xi'\mu)'\right] = \mathrm E\left[\xi'(x-\mu)(x-\mu)'\xi\right] = \xi'\,\mathrm E\left[(x-\mu)(x-\mu)'\right]\xi = \xi'\Sigma\xi$$
Although variances of real-valued random variables may be zero, they must not be negative. Now if
ξ ′Σξ > 0 for all ξ ≠ 0 n×1 , the variance Σ is said to be positive-definite (PD). Every invertible NND
matrix must be PD. Moreover, every NND matrix may be expressed as the product of some real
matrix and its transpose, the most common method for doing this being the Cholesky
² The representation of the complex scalar $z = x + iy$ as the real 2×2 matrix $\begin{bmatrix} x & -y \\ y & x \end{bmatrix}$ is a common theme in modern algebra (e.g., section 7.2 of the Wikipedia article “Complex number”). We have merely extended the representation to complex matrices. Our representation is even more meaningful when expressed in the Kronecker-product form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\otimes X + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\otimes Y$. Due to certain properties of the Kronecker product (cf. Judge [1988], Appendix A.15), all the analogies of this section would hold even in the commuted form $X\otimes\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + Y\otimes\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. In practical terms this means that it matters not whether the form is 2×2 of m×n or m×n of 2×2.
³ More accurately, $\xi'\Sigma\xi \ge [0]_{1\times 1}$, since the quadratic form $\xi'\Sigma\xi$ is a 1×1 matrix. The relevant point is that 1×1 real-valued matrices are as orderable as their real-valued elements are. Appendices A.13 and A.14 of Judge [1988] provide introductions to quadratic forms and definiteness that are sufficient to prove the theorems used herein.
decomposition (Healy [1986], §7.2). Accordingly, if Σ is NND, then A′ΣA is NND for any conformable real-valued matrix A. Finally, if Σ is PD and real-valued n×r matrix A is of full column rank, i.e.,
In the remainder of this section we will show how the analogy between $X + iY$ and $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ leads to a proper definition of the variance of a complex random vector. We start by considering $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ as a real-valued variance matrix. In order to be so, first it must be symmetric:

$$\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}' = \begin{bmatrix} X' & Y' \\ -Y' & X' \end{bmatrix} = \begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$$

Hence, $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is symmetric if and only if $X' = X$ and $Y' = -Y$. In words, X is symmetric and
Y is skew-symmetric. Clearly, the main diagonal of a skew-symmetric matrix must be zero. But of
$$a'Yb = (a'Yb)_{1\times 1} = (a'Yb)' = b'Y'a = b'(-Y)a = -b'Ya$$

Consequently, if $b = a$: $a'Ya = -a'Ya$, so $a'Ya = 0$ (and likewise $b'Yb = 0$).
Next, considering the specifications on X, Y, a, and b, we evaluate the 2×2 quadratic form:

$$\begin{aligned}
\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}
&= \begin{bmatrix} a' & b' \\ -b' & a' \end{bmatrix}\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \\
&= \begin{bmatrix} a'X + b'Y & b'X - a'Y \\ -b'X + a'Y & a'X + b'Y \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & -a'Xb - b'Yb + b'Xa - a'Ya \\ -b'Xa + a'Ya + a'Xb + b'Yb & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & -a'Xb + b'Xa \\ -b'Xa + a'Xb & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} a'Xa + b'Ya - a'Yb + b'Xb & 0 \\ 0 & a'Xa + b'Ya - a'Yb + b'Xb \end{bmatrix} \\
&= \begin{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} & 0 \\ 0 & \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} \end{bmatrix}
\end{aligned}$$

In the fourth line $a'Ya = b'Yb = 0$ by the skew symmetry of Y; in the fifth, $-a'Xb + b'Xa = 0$ by the symmetry of X. Therefore, $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is PD [or NND] if and only if $\begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix}$ is PD [or NND].
Now the double-real 2n×2 matrix $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is analogous with the n×1 complex vector $a + ib$. Its transpose $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}' = \begin{bmatrix} a' & b' \\ -b' & a' \end{bmatrix}$ is analogous with the 1×n complex vector $a' - ib'$. Moreover, $a' - ib' = (a - ib)' = \left(\overline{a + ib}\right)' = (a + ib)^*$, where ‘*’ is the combined operation of transposition and conjugation (order irrelevant).⁴ And $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is analogous with the n×n complex matrix $X + iY$. Accordingly, the complex analogue of the double-real quadratic form $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is $(a + ib)^*(X + iY)(a + ib)$. Moreover, $\Gamma = X + iY$ is the variance matrix of some complex random vector $z_{n\times 1} = x + iy$ if and only if Γ is Hermitian and $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is non-negative-definite.⁵
Because $(a + ib)^*(X + iY)(a + ib) = \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} + i\cdot 0 = \begin{bmatrix} a \\ b \end{bmatrix}'\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix}$, the definiteness of $\Gamma = X + iY$ is the same as the definiteness of $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$. Therefore, a matrix qualifies as the variance matrix of some complex random vector if and only if it is Hermitian and NND. Just as the variance matrix of a real-valued random vector factors as $\Sigma = A'A_{n\times n}$ for some real-valued A, so too the variance matrix of a complex random vector factors as $\Gamma = A^*A_{n\times n}$ for some complex A.
Likewise, every invertible Hermitian NND matrix must be PD. Due to the skew symmetry of their
⁴ The transposed conjugate is sometimes called the “transjugate,” which in linear algebra is commonly symbolized with the asterisk. Physicists prefer the “dagger” notation A†, though physicist Hermann Weyl [1950, p. 17] called it the “Hermitian conjugate” and symbolized it as Ã.
⁵ It is superfluous to add ‘symmetric’ here. For $\Gamma = X + iY$ is Hermitian if and only if $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ is symmetric.
imaginary parts, the main diagonals of Hermitian matrices must be real-valued. If the matrices are
NND [or PD], all the elements of their main diagonals must be non-negative [or positive].
Let Γ represent the variance of the complex random vector z. Its jkth element represents the covariance of the jth element of z with the kth element; since Γ is Hermitian, the kjth element is its conjugate. In symbols:

$$\Gamma = \mathrm{Var}[z] = \mathrm E\left[(z-\mu)\overline{(z-\mu)}'\right] = \mathrm E\left[(z-\mu)(z-\mu)^*\right]$$

The complex formula is like the real formula except that the second factor in the expectation is transjugated rather than merely transposed. This renders Γ Hermitian:

$$\Gamma^* = \mathrm E\left[(z-\mu)(z-\mu)^*\right]^* = \mathrm E\left[\left\{(z-\mu)(z-\mu)^*\right\}^*\right] = \mathrm E\left[(z-\mu)(z-\mu)^*\right] = \Gamma$$

It also renders Γ NND. For since $(z-\mu)(z-\mu)^*$ is NND, its expectation over the probability measure is NND as well. Of special interest is the standard quadratic form, viz., $(z-\mu)^*\Gamma^{-1}(z-\mu)$, where $\Gamma = \mathrm{Var}[z]$. The expectation of this quadratic form equals n, the
rank of the variance. The following proof uses the trace function. The trace of a matrix is the sum of its main-diagonal elements, and if A and B are conformable, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Moreover, the trace operator commutes with the expectation:

$$\begin{aligned}
\mathrm E\left[(z-\mu)^*\Gamma^{-1}(z-\mu)\right] &= \mathrm E\left[\mathrm{tr}\left((z-\mu)^*\Gamma^{-1}(z-\mu)\right)\right] \\
&= \mathrm E\left[\mathrm{tr}\left(\Gamma^{-1}(z-\mu)(z-\mu)^*\right)\right] \\
&= \mathrm{tr}\left(\mathrm E\left[\Gamma^{-1}(z-\mu)(z-\mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}\,\mathrm E\left[(z-\mu)(z-\mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}\Gamma\right) = \mathrm{tr}(I_n) = n
\end{aligned}$$
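The expectation can be verified by simulation. A minimal numpy sketch, assuming a zero-mean proper complex normal vector built from a standard one (the matrix A and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 200_000
# z = A w, where w is standard complex normal (real and imaginary
# parts independent with variance one half), so Var[z] = A A* = Gamma
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Gamma = A @ A.conj().T
w = (rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))) / np.sqrt(2)
z = A @ w

# quadratic form z* Gamma^{-1} z for each of the m draws
Ginv = np.linalg.inv(Gamma)
q = np.sum(z.conj() * (Ginv @ z), axis=0).real

assert abs(q.mean() - n) < 0.05   # expectation is n, not 2n
```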
The analogies above between complex and double-real matrices might suggest the result to be 2n. However, for real-valued random variables $\mathrm E\left[(x-\mu)'\Sigma^{-1}(x-\mu)\right] = n$, and the complex case is a superset of the real. So by extension, the complex case must be the same.
But an insight is available into why the value is n, rather than 2n. Let x and y be n×1 real-valued random vectors. Assume their means to be zero, and their variances to be identity matrices (so zero covariance):

$$\begin{bmatrix} x \\ y \end{bmatrix} \sim \left(\mu = 0_{2n\times 1},\; \Sigma = \begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}\right)$$

Then:

$$\begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x' & y' \end{bmatrix}\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}^{-1}\begin{bmatrix} x \\ y \end{bmatrix} = x'x + y'y = \sum_{j=1}^n \left(\frac{x_j^2}{1} + \frac{y_j^2}{1}\right)$$

$$\mathrm E\left[\begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix}\right] = \sum_{j=1}^n \left(\frac{\mathrm E\big[x_j^2\big]}{1} + \frac{\mathrm E\big[y_j^2\big]}{1}\right) = \sum_{j=1}^n \left(\frac{1}{1} + \frac{1}{1}\right) = 2n$$
Now let z be the n×1 complex random vector $x + iy$. Since $\mathrm E[x + iy] = 0_{n\times 1}$, the variance of z is:

$$\begin{aligned}
\Gamma = \mathrm{Var}[z] &= \mathrm{Var}[x + iy] \\
&= \mathrm{Var}\!\left(\begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}\right) \\
&= \mathrm E\!\left[\begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}\left(\begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}\right)^{\!*}\right] \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\mathrm{Var}\begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= \begin{bmatrix} I_n & iI_n \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} = I_n - i^2 I_n = 2I_n
\end{aligned}$$
Accordingly:

$$z^*\Gamma^{-1}z = z^*(2I_n)^{-1}z = \sum_{j=1}^n \frac{\bar z_j z_j}{2} = \sum_{j=1}^n \frac{(x_j - iy_j)(x_j + iy_j)}{2} = \sum_{j=1}^n \frac{x_j^2 + y_j^2}{1+1} = \frac{1}{2}\begin{bmatrix} x' & y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix}$$

The complex form is half the real-valued form; hence, its expectation equals n. The condensation of the 2n real dimensions into n complex ones inverts the order of operations:

$$\sum_{j=1}^n \left(\frac{x_j^2}{1} + \frac{y_j^2}{1}\right) \;\Rightarrow\; \sum_{j=1}^n \frac{x_j^2 + y_j^2}{1+1}$$
Within the sigma operator, the sum of two quotients becomes the quotient of two sums. A proof
At this point we can derive the standard complex normal distribution. The normal distribution is $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. The standard complex normal random variable is formed from two independent real normal variables whose means equal zero and whose variances equal one half:

$$f_Z(z) = \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(x-0)^2}{2(1/2)}}\cdot\frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(y-0)^2}{2(1/2)}} = \frac{1}{\sqrt{\pi}}\,e^{-x^2}\cdot\frac{1}{\sqrt{\pi}}\,e^{-y^2} = \frac{1}{\pi}\,e^{-(x^2+y^2)} = \frac{1}{\pi}\,e^{-\bar z z}$$
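The factorization is easy to confirm at a point; a small numpy sketch (`f_real` and `f_std_complex` are our own helper names):

```python
import numpy as np

def f_real(u, var):
    """Real normal density with mean zero and the given variance."""
    return np.exp(-u**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def f_std_complex(z):
    """Standard complex normal density: (1/pi) * exp(-conj(z) z)."""
    return np.exp(-(z.conjugate() * z).real) / np.pi

z = 0.3 - 0.7j
# product of the two half-variance real densities equals the complex density
assert np.isclose(f_real(z.real, 0.5) * f_real(z.imag, 0.5), f_std_complex(z))
```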
Likewise, the density of the standard complex normal random vector of dimension n is $\frac{1}{\pi^n}\,e^{-z^*z}$.

$$\Gamma = \mathrm{Var}[z] = \mathrm E\left[(z-\mu)\overline{(z-\mu)}'\right] = \mathrm E\left[(z-\mu)(z-\mu)^*\right]$$
The naïve formula differs from this by one critical symbol (prime versus asterisk):

$$C = \mathrm E\left[(z-\mu)(z-\mu)'\right]$$
⁶ Cf. Appendix C for eigen-decomposition and diagonalization. We believe the insight about commuting sums and quotients to be valuable as an abstraction. But of course, a vector of n independent complex random variables of unit variance translates into a vector of 2n independent real random variables of half-unit variance, and $\sum_{j=1}^{2n}\frac{1}{2} = n$. Because of the half-unit real variance, the formula in the next paragraph for the standard complex normal distribution, lacking any factors of two, is simpler than the formula for the standard real normal distribution.
This naïveté leads many to conclude that $\mathrm{Var}[iz] = i^2\mathrm{Var}[z] = -\mathrm{Var}[z]$, whereas it is actually:⁷

$$\mathrm{Var}[iz] = \mathrm E\left[i(z-\mu)\{i(z-\mu)\}^*\right] = \mathrm E\left[i(z-\mu)(z-\mu)^*\bar i\,\right] = i\bar i\;\mathrm E\left[(z-\mu)(z-\mu)^*\right] = i(-i)\mathrm{Var}[z] = \mathrm{Var}[z]$$
Nevertheless, there is a role for the naïve formula, which reduces to:

$$C = \mathrm E\left[(z-\mu)(z-\mu)'\right] = \mathrm E\left[(z-\mu)\overline{(\bar z-\bar\mu)}'\right] = \mathrm E\left[(z-\mu)(\bar z-\bar\mu)^*\right] = \mathrm{Cov}[z, \bar z]$$
Veeravalli [2006], whose notation we follow, calls C the “relation matrix.” The Wikipedia article “Complex normal distribution” calls it the “pseudocovariance matrix.” Because of the naïveté that leads many to a false conclusion, we prefer the ‘pseudo’ terminology (better, “pseudovariance”) to something as bland as “relation matrix.” However, a useful and non-pejorative concept is what we call the “augmented variance.” The augmented variance is the variance of the complex random vector z augmented with its conjugate $\bar z$, i.e., of the 2n×1 vector $\begin{bmatrix} z \\ \bar z \end{bmatrix}$. Its expectation is $\mathrm E\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} \mathrm E[z] \\ \mathrm E[\bar z] \end{bmatrix} = \begin{bmatrix} \mu \\ \bar\mu \end{bmatrix}$. And its variance is
$$\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} \mathrm{Cov}[z,z] & \mathrm{Cov}[z,\bar z] \\ \mathrm{Cov}[\bar z,z] & \mathrm{Cov}[\bar z,\bar z] \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$$

In two ways this matrix is redundant: $\mathrm{Cov}[\bar z, \bar z] = \overline{\mathrm{Cov}[z, z]} = \bar\Gamma$, and $\mathrm{Cov}[\bar z, z] = \overline{\mathrm{Cov}[z, \bar z]} = \bar C$; the bottom row is reproducible from the top.
As with any valid variance matrix, the augmented variance must be Hermitian. Hence,

$$\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^* = \begin{bmatrix} \bar\Gamma & \bar C \\ C & \Gamma \end{bmatrix}' = \begin{bmatrix} \Gamma^* & C' \\ \bar C' & \bar\Gamma^* \end{bmatrix}$$

from which follow $\Gamma^* = \Gamma$ and $C' = C$.
Moreover, it must be at least NND, if not PD. It is important to note from this that the pseudovariance is an essential part of the augmented z; it is possible for two random variables to have the same variance and to covary differently with their conjugates. How a complex random vector covaries with its conjugate is useful information; it is even a parameter of the general complex normal distribution, which we now derive from the real-valued normal form:

$$\begin{bmatrix} x \\ y \end{bmatrix}_{2n\times 1} \sim N\!\left(\mu_{2n\times 1} = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \Sigma_{2n\times 2n} = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\right)$$
According to this variance structure, the real and imaginary parts of z may covary, as long as the covariance is symmetric: $\Sigma_{yx} = \Sigma_{xy}'$. The grand Σ matrix must be symmetric and PD. The density function of the real-valued normal multivariate is:⁸

$$f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x, y) = \frac{1}{\sqrt{(2\pi)^{2n}|\Sigma|}}\; e^{-\frac12\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} = \frac{1}{\pi^n\sqrt{2^{2n}|\Sigma|}}\; e^{-\frac12\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}}$$
⁸ As derived briefly by Judge [1988, pp. 49f]. Chapter 4 of Johnson [1992] is thorough. To be precise, the determinant $|\Sigma|$ under the radical should be its absolute value; however, the determinant of a PD matrix must be positive (cf. Judge [1988, A.14(1)]).
Since $z = x + iy$, the augmented vector is $\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} x+iy \\ x-iy \end{bmatrix} = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \Xi_n\begin{bmatrix} x \\ y \end{bmatrix}$. We will call $\Xi_n = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}$ the augmentation matrix; this linear function of the real-valued vectors produces the augmented complex vector. Note that

$$\Xi_n\Xi_n^* = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} = \begin{bmatrix} 2I_n & 0 \\ 0 & 2I_n \end{bmatrix} = 2I_{2n}$$
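The identities involving the augmentation matrix are one-liners to verify; a numpy sketch (n = 3 is an arbitrary choice):

```python
import numpy as np

n = 3
I = np.eye(n)
Xi = np.block([[I, 1j * I], [I, -1j * I]])    # augmentation matrix

assert np.allclose(Xi @ Xi.conj().T, 2 * np.eye(2 * n))
assert np.allclose(Xi.conj().T @ Xi, 2 * np.eye(2 * n))
# hence the inverse is half the transjugate: Xi^{-1} = Xi* / 2
assert np.allclose(np.linalg.inv(Xi), Xi.conj().T / 2)
```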
The augmented mean is $\mathrm E\begin{bmatrix} z \\ \bar z \end{bmatrix} = \Xi_n\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} = \begin{bmatrix} \mu_x + i\mu_y \\ \mu_x - i\mu_y \end{bmatrix} = \begin{bmatrix} \mu \\ \bar\mu \end{bmatrix}$. The augmented variance is:
$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix} &= \Xi_n\Sigma\,\Xi_n^* \\
&= \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} \\
&= \begin{bmatrix} \Sigma_{xx} + i\Sigma_{yx} & \Sigma_{xy} + i\Sigma_{yy} \\ \Sigma_{xx} - i\Sigma_{yx} & \Sigma_{xy} - i\Sigma_{yy} \end{bmatrix}\begin{bmatrix} I_n & I_n \\ -iI_n & iI_n \end{bmatrix} \\
&= \begin{bmatrix} \Sigma_{xx} + \Sigma_{yy} - i(\Sigma_{xy} - \Sigma_{yx}) & \Sigma_{xx} - \Sigma_{yy} + i(\Sigma_{xy} + \Sigma_{yx}) \\ \Sigma_{xx} - \Sigma_{yy} - i(\Sigma_{xy} + \Sigma_{yx}) & \Sigma_{xx} + \Sigma_{yy} + i(\Sigma_{xy} - \Sigma_{yx}) \end{bmatrix} \\
&= \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}
\end{aligned}$$
And so:
−1
z Γ C
−1
Var = (
= Ξ n ΣΞ *n )
−1
( )
= Ξ *n
−1
Σ −1 (Ξ n )
−1
z C Γ
−1
z −1 Γ C −1
This can be reformulated as Ξ Var Ξ n = Ξ *n
*
Ξn = Σ .
Γ
n
z C
We now work these augmented forms into the probability density function:

$$\begin{aligned}
f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x, y) &= \frac{1}{\pi^n\sqrt{2^{2n}|\Sigma|}}\; e^{-\frac12\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Sigma^{-1}\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|2I_{2n}|\,|\Sigma|}}\; e^{-\frac12\begin{bmatrix} x'-\mu_x' & y'-\mu_y' \end{bmatrix}\Xi_n^*\,\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}^{-1}\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|\Xi_n\Xi_n^*|\,|\Sigma|}}\; e^{-\frac12\left(\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}\right)^{\!*}\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}^{-1}\left(\Xi_n\begin{bmatrix} x-\mu_x \\ y-\mu_y \end{bmatrix}\right)} \\
&= \frac{1}{\pi^n\sqrt{\left|\Xi_n\Sigma\,\Xi_n^*\right|}}\; e^{-\frac12\begin{bmatrix} \bar z'-\bar\mu' & z'-\mu' \end{bmatrix}\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar z-\bar\mu \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}\right|}}\; e^{-\frac12\begin{bmatrix} \bar z'-\bar\mu' & z'-\mu' \end{bmatrix}\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar z-\bar\mu \end{bmatrix}}
\end{aligned}$$
However, this is not quite the density function of z, since the differential volume has not been considered. The correct formula is $f_z(z)\,dV_z = f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x, y)\,dV_{xy}$. The differential volume in the xy coordinates is $dV_{xy} = \prod_{j=1}^n dx_j\,dy_j$. A change of $dx_j$ entails an equal change in the real part of $dz_j$, even as a change of $dy_j$ entails an equal change in the imaginary part of $dz_j$. Accordingly, $dV_z = \prod_{j=1}^n dx_j\,(i\cdot dy_j) = i^n\prod_{j=1}^n dx_j\,dy_j$. It so happens that the relevant magnitude is $|i^n| = 1$, so $dV_z = 1\cdot\prod_{j=1}^n dx_j\,dy_j = dV_{xy}$.
So finally, the probability density function of the complex random vector z is:⁹

⁹ This will be abstruse to some actuaries. However, the integration theory is implicit in the change-of-variables technique outlined in Hogg [1984, pp 42-46]. That the n×n determinant represents “the volume function of an n-dimensional parallelepiped” is beautifully explained in Chapter 4 of Schneider [1973].

$$\begin{aligned}
f_z(z) = f_z(z)\cdot 1 = f_z(z)\,\frac{dV_z}{dV_{xy}} &= f_{\begin{bmatrix} x \\ y \end{bmatrix}}(x, y) \\
&= \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}\right|}}\; e^{-\frac12\begin{bmatrix} \bar z'-\bar\mu' & z'-\mu' \end{bmatrix}\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar z-\bar\mu \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{\left|\begin{matrix} \Gamma & C \\ \bar C & \bar\Gamma \end{matrix}\right|}}\; e^{-\frac12\begin{bmatrix} \bar z'-\bar\mu' & z'-\mu' \end{bmatrix}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar z-\bar\mu \end{bmatrix}}
\end{aligned}$$
This formula is equivalent to the one found in the Wikipedia article “Complex normal distribution.” Although $|\Gamma|\left|\bar\Gamma - \bar C\Gamma^{-1}C\right|$ appears within the radical of that article’s formula, it can be shown that

$$\left|\begin{matrix} \Gamma & C \\ \bar C & \bar\Gamma \end{matrix}\right| = |\Gamma|\left|\bar\Gamma - \bar C\Gamma^{-1}C\right|$$

As far as allowable parameters are concerned, µ may be any complex vector. $\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$ is allowed if and only if $\Xi_n^*\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}\Xi_n = 4\Sigma$ is real-valued and PD.
Veeravalli [2006] defines a “proper” complex variable as one whose pseudo[co]variance matrix is $0_{n\times n}$. Inserting zero for C into the formula, we derive the probability density function of a proper complex normal random vector:

$$\begin{aligned}
f_z(z) &= \frac{1}{\pi^n\sqrt{\left|\begin{matrix} \Gamma & 0 \\ 0 & \bar\Gamma \end{matrix}\right|}}\; e^{-\frac12\begin{bmatrix} \bar z'-\bar\mu' & z'-\mu' \end{bmatrix}\begin{bmatrix} \Gamma & 0 \\ 0 & \bar\Gamma \end{bmatrix}^{-1}\begin{bmatrix} z-\mu \\ \bar z-\bar\mu \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|\Gamma|\,|\bar\Gamma|}}\; e^{-\frac12\left\{(\bar z'-\bar\mu')\Gamma^{-1}(z-\mu) + (z'-\mu')\bar\Gamma^{-1}(\bar z-\bar\mu)\right\}} \\
&= \frac{1}{\pi^n\sqrt{|\Gamma|\,|\bar\Gamma|}}\; e^{-\frac12\left\{(\bar z'-\bar\mu')\Gamma^{-1}(z-\mu) + \overline{(\bar z'-\bar\mu')\Gamma^{-1}(z-\mu)}\right\}} \\
&= \frac{1}{\pi^n\sqrt{|\Gamma|\,|\bar\Gamma|}}\; e^{-\frac12\left\{(\bar z'-\bar\mu')\Gamma^{-1}(z-\mu) + (\bar z'-\bar\mu')\Gamma^{-1}(z-\mu)\right\}} \\
&= \frac{1}{\pi^n|\Gamma|}\; e^{-(\bar z'-\bar\mu')\Gamma^{-1}(z-\mu)} \\
&= \frac{1}{\pi^n|\Gamma|}\; e^{-(z-\mu)^*\Gamma^{-1}(z-\mu)}
\end{aligned}$$

The transformations in the last several lines rely on the fact that Γ is Hermitian and PD: the quadratic form $(z-\mu)^*\Gamma^{-1}(z-\mu)$ equals its own conjugate, hence is real-valued, and $|\bar\Gamma| = |\Gamma| > 0$. Now the standard complex random vector is a proper complex random vector with mean zero and variance $I_n$. Therefore, in confirmation of Section 5, its density function is $\frac{1}{\pi^n}\,e^{-z^*z}$.
random vector: $w_{n\times 1} = e^{z_{n\times 1}}$. Its conjugate also is lognormal, since $\bar w = \overline{e^z} = e^{\bar z}$. Deriving the density of w is complicated by the fact that the complex exponential function is not one-to-one. Specifically, $e^z = w = e^{z + i(2\pi k)}$ for any integer k. So unlike the real-valued lognormal random variable, whose density function can be found in Klugman [1998, §A.4.11], an analytic form for the complex lognormal density is not available. However, even for the real-valued lognormal the density function is of little value; its moments are commonly derived from the moment generating function of the normal variable on which it is based. So too, the moment generating function of the complex normal random vector is available for deriving the lognormal moments.
We hereby define the moment generating function of the complex n×1 random vector z as $M_z(s_{n\times 1}, t_{n\times 1}) = \mathrm E\left[e^{s'z + t'\bar z}\right]$. Since this definition may differ from other definitions in the sparse literature, we should justify it. First, because we will take derivatives of this function with respect to s and t, the function must be differentiable. This demands simple transposition in the linear combination, i.e., $s'z + t'\bar z$ rather than the transjugation $s^*z + t^*\bar z$. For transjugation would involve derivatives of the form $\frac{d\bar s}{ds}$, which do not exist, as they violate the Cauchy-Riemann condition.¹⁰ Second, even though moments of $\bar z$ are conjugates of moments of z, we will need second-order moments involving both z and $\bar z$. For this reason both terms must be in the exponent of the moment generating function.
¹⁰ Cf. Appendix D.1.3 of Havil [2003]. Express $f(z = x + iy)$ in terms of real-valued functions, i.e., as $u(x,y) + i\cdot v(x,y)$. The derivative is based on the matrix of real-valued partial derivatives $\begin{bmatrix} \partial u/\partial x & \partial v/\partial x \\ \partial u/\partial y & \partial v/\partial y \end{bmatrix}$. For the derivative to be the same in both directions, the Cauchy-Riemann condition must hold, viz., that $\partial u/\partial x = \partial v/\partial y$ and $\partial v/\partial x = -\partial u/\partial y$. But for $f(z) = \bar z = x - iy$, the partial-derivative matrix is $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$; hence $\partial u/\partial x \ne \partial v/\partial y$. The Cauchy-Riemann condition becomes intuitive when one regards a valid complex derivative as a double-real 2×2 matrix (Section 3). Compare this with $f(z) = z = x + iy$, whose matrix is $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, which represents the complex number 1.
We start with terminology from Section 7, viz., that $\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} x+iy \\ x-iy \end{bmatrix} = \begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \Xi_n\begin{bmatrix} x \\ y \end{bmatrix}$ and that $\begin{bmatrix} x \\ y \end{bmatrix} \sim N\!\left(\mu_{2n\times 1} = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \Sigma_{2n\times 2n} = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\right)$. According to Appendix A, the moment generating function of the real-valued normal random vector $\begin{bmatrix} x \\ y \end{bmatrix}$ is:

$$M_{\begin{bmatrix} x \\ y \end{bmatrix}}\!\begin{bmatrix} a \\ b \end{bmatrix} = \mathrm E\!\left[e^{\begin{bmatrix} a' & b' \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}\right] = e^{\begin{bmatrix} a' & b' \end{bmatrix}\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} + \frac12\begin{bmatrix} a' & b' \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix}}$$
Consequently:

$$\begin{aligned}
M_z(s, t) = \mathrm E\!\left[e^{s'z + t'\bar z}\right]
&= \mathrm E\!\left[e^{\begin{bmatrix} s' & t' \end{bmatrix}\begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}\right] \\
&= \mathrm E\!\left[e^{\begin{bmatrix} s'+t' & i(s'-t') \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}\right] \\
&= M_{\begin{bmatrix} x \\ y \end{bmatrix}}\!\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix} \\
&= e^{\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} + \frac12\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix}}
\end{aligned}$$
It is so that we could invoke it here that Appendix B went to the trouble of proving that complex arguments are legitimate in the real-valued moment generating function. Two simplifications are now possible. First, as to the mean term:

$$\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} = (s+t)'\mu_x + i(s-t)'\mu_y = s'(\mu_x + i\mu_y) + t'(\mu_x - i\mu_y) = s'\mu_z + t'\bar\mu_z$$
And second, again from Section 7, $\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix} = \Xi_n\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\Xi_n^*$, or equivalently, $\dfrac{\Xi_n^*}{2}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}\dfrac{\Xi_n}{2} = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}$. Hence:

$$\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix} = \begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\frac{\Xi_n^*}{2}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}\frac{\Xi_n}{2}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix}$$

On the right side, $\dfrac{\Xi_n}{2}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix} = \dfrac12\begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix} = \dfrac12\begin{bmatrix} (s+t) - (s-t) \\ (s+t) + (s-t) \end{bmatrix} = \begin{bmatrix} t \\ s \end{bmatrix}$. And on the left side, $\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\dfrac{\Xi_n^*}{2} = \begin{bmatrix} s' & t' \end{bmatrix}$. So:

$$\begin{bmatrix} (s+t)' & i(s-t)' \end{bmatrix}\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} s+t \\ i(s-t) \end{bmatrix} = \begin{bmatrix} s' & t' \end{bmatrix}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}\begin{bmatrix} t \\ s \end{bmatrix}$$
And the simplified expression, based on the mean and the augmented variance of z, is:

$$M_z(s, t) = \mathrm E\!\left[e^{s'z + t'\bar z}\right] = e^{s'\mu_z + t'\bar\mu_z + \frac12\begin{bmatrix} s' & t' \end{bmatrix}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}\begin{bmatrix} t \\ s \end{bmatrix}} = e^{s'\mu + t'\bar\mu + \frac12\left(s'\Gamma t + s'Cs + t'\bar Ct + t'\bar\Gamma s\right)}$$
As in Appendix A, let $e_j$ denote the jth unit vector of $\Re^n$, or even better, of $\mathbb C^n$. Then:

$$\mathrm E\!\left[e^{z_j}\right] = M_z(e_j, 0) = e^{\mu_j + \frac12 C_{jj}}$$

$$\mathrm E\!\left[e^{\bar z_j}\right] = \mathrm E\!\left[\overline{e^{z_j}}\right] = M_z(0, e_j) = e^{\bar\mu_j + \frac12 \bar C_{jj}} = \overline{\mathrm E\!\left[e^{z_j}\right]}$$
Moreover:

$$\mathrm E\!\left[e^{z_j}e^{z_k}\right] = \mathrm E\!\left[e^{z_j + z_k}\right] = M_z(e_j + e_k, 0) = e^{\mu_j + \mu_k + \frac12\left(C_{jj} + C_{jk} + C_{kj} + C_{kk}\right)}$$

$$= e^{\mu_j + \frac12 C_{jj}}\; e^{\mu_k + \frac12 C_{kk}}\; e^{\frac12 C_{jk} + \frac12 C_{kj}} = \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{z_k}\right]\cdot e^{C_{jk}}$$

since $C_{kj} = C_{jk}$ by the symmetry of C.
Hence, mindful of the transjugation (*) in the definition of complex covariance, we have:

$$\mathrm{Cov}\!\left[e^{z_j}, e^{\bar z_k}\right] = \mathrm E\!\left[e^{z_j}\overline{e^{\bar z_k}}\right] - \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[\overline{e^{\bar z_k}}\right] = \mathrm E\!\left[e^{z_j}e^{z_k}\right] - \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{z_k}\right] = \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{z_k}\right]\left(e^{C_{jk}} - 1\right)$$

This translates as $\mathrm{Cov}[w, \bar w] = \mathrm E[w]\,\mathrm E[w]' \circ \left(e^{C} - 1_{n\times n}\right)$, where ‘∘’ and the matrix exponential are elementwise.¹¹ By conjugation, $\mathrm{Cov}\!\left[e^{\bar z_j}, e^{z_k}\right] = \overline{\mathrm{Cov}\!\left[e^{z_j}, e^{\bar z_k}\right]} = \overline{\mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{z_k}\right]}\left(e^{\bar C_{jk}} - 1\right)$, which translates as $\mathrm{Cov}[\bar w, w] = \overline{\mathrm E[w]\,\mathrm E[w]'} \circ \left(e^{\bar C} - 1_{n\times n}\right)$.
Similarly:

$$\mathrm E\!\left[e^{z_j}e^{\bar z_k}\right] = M_z(e_j, e_k) = e^{\mu_j + \bar\mu_k + \frac12\left(\Gamma_{jk} + C_{jj} + \bar C_{kk} + \bar\Gamma_{kj}\right)} = e^{\mu_j + \frac12 C_{jj}}\; e^{\bar\mu_k + \frac12 \bar C_{kk}}\; e^{\frac12\Gamma_{jk} + \frac12\bar\Gamma_{kj}} = \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{\bar z_k}\right]\cdot e^{\Gamma_{jk}}$$

since $\bar\Gamma_{kj} = \Gamma_{jk}$ by the Hermitian symmetry of Γ. Therefore, $\mathrm{Cov}\!\left[e^{z_j}, e^{z_k}\right] = \mathrm E\!\left[e^{z_j}e^{\bar z_k}\right] - \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{\bar z_k}\right] = \mathrm E\!\left[e^{z_j}\right]\mathrm E\!\left[e^{\bar z_k}\right]\left(e^{\Gamma_{jk}} - 1\right)$, which translates as

$$\mathrm{Cov}[w, w] = \mathrm E[w]\,\mathrm E[w]^* \circ \left(e^{\Gamma} - 1_{n\times n}\right)$$

By conjugation, $\mathrm{Cov}\!\left[e^{\bar z_j}, e^{\bar z_k}\right] = \overline{\mathrm E\!\left[e^{z_j}\right]}\,\mathrm E\!\left[e^{z_k}\right]\left(e^{\bar\Gamma_{jk}} - 1\right)$, which translates as $\mathrm{Cov}[\bar w, \bar w] = \mathrm E[\bar w]\,\mathrm E[\bar w]^* \circ \left(e^{\bar\Gamma} - 1_{n\times n}\right)$.
11Elementwise multiplication is formally known as the Hadamard, or Hadamard-Schur, product, of which we will make
use in Appendices A and C. Cf. Million [2007].
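The scalar case of these lognormal mean and covariance formulas can be checked by simulation. A hedged numpy sketch (the parameter values σ, τ, ρ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
m = 1_000_000
sigma, tau, rho = 0.4, 0.3, 0.5

cov = np.array([[sigma**2, rho * sigma * tau],
                [rho * sigma * tau, tau**2]])
xy = rng.multivariate_normal([0.0, 0.0], cov, size=m)
z = xy[:, 0] + 1j * xy[:, 1]
w = np.exp(z)                                     # complex lognormal draws

Gamma = sigma**2 + tau**2                         # Var[Z]
C = sigma**2 - tau**2 + 2j * rho * sigma * tau    # pseudovariance
Ew = np.exp(C / 2)                                # E[W], since mu = 0

assert abs(w.mean() - Ew) < 0.01
# variance:        |E[W]|^2 (e^Gamma - 1)
var_w = np.mean(np.abs(w - w.mean())**2)
assert abs(var_w - abs(Ew)**2 * (np.exp(Gamma) - 1)) < 0.02
# pseudovariance:  E[W]^2 (e^C - 1)
pseudo_w = np.mean((w - w.mean())**2)
assert abs(pseudo_w - Ew**2 * (np.exp(C) - 1)) < 0.02
```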
We conclude this section by expressing it all in terms of $w_{n\times 1} = e^{z_{n\times 1}}$. Let z be complex normal with mean µ and augmented variance $\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$. And let D be the n×1 vector consisting of the main diagonal of C. Then $\bar D$ is the vectorization of the diagonal of $\bar C$. So the augmented mean of w is $\mathrm E\begin{bmatrix} w \\ \bar w \end{bmatrix} = \begin{bmatrix} e^{\mu + D/2} \\ e^{\bar\mu + \bar D/2} \end{bmatrix}$. And the augmented variance of w is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} w \\ \bar w \end{bmatrix} &= \begin{bmatrix} \mathrm{Cov}[w,w] & \mathrm{Cov}[w,\bar w] \\ \mathrm{Cov}[\bar w,w] & \mathrm{Cov}[\bar w,\bar w] \end{bmatrix} \\
&= \begin{bmatrix} \mathrm E[w]\mathrm E[w]^* \circ \left(e^{\Gamma} - 1_{n\times n}\right) & \mathrm E[w]\mathrm E[w]' \circ \left(e^{C} - 1_{n\times n}\right) \\ \overline{\mathrm E[w]\mathrm E[w]'} \circ \left(e^{\bar C} - 1_{n\times n}\right) & \mathrm E[\bar w]\mathrm E[\bar w]^* \circ \left(e^{\bar\Gamma} - 1_{n\times n}\right) \end{bmatrix} \\
&= \mathrm E\begin{bmatrix} w \\ \bar w \end{bmatrix}\mathrm E\begin{bmatrix} w \\ \bar w \end{bmatrix}^* \circ \left(e^{\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}} - 1_{2n\times 2n}\right)
\end{aligned}$$
Scaling all the lognormal means to unity (or setting $\mu = -D/2$), we can say that the coefficient-of-lognormal-augmented-variation matrix equals $e^{\mathrm{Var}\begin{bmatrix} z \\ \bar z \end{bmatrix}} - 1_{2n\times 2n}$, which is analogous with the well-known coefficient of variation of the real-valued lognormal, $\sqrt{e^{\sigma^2} - 1}$.

Since the multivariate treatment of the complex lognormal is abstract, this section provides some intuition into it. The complex lognormal random variable, or scalar, derives from the real-valued normal bivariate $\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\!\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2 & \rho\sigma\tau \\ \rho\sigma\tau & \tau^2 \end{bmatrix}\right)$. Zero is not much of a restriction; since $e^{CN(\mu, V)} = e^{\mu + CN(0, V)} = e^{\mu}e^{CN(0, V)}$, the normal mean affects only the scale of the lognormal. The variance is written in correlation form, where $-1 \le \rho \le 1$. As usual, $0 < \sigma, \tau < \infty$.
Define $\begin{bmatrix} Z \\ \bar Z \end{bmatrix} = \Xi_1\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} 1 & i \\ 1 & -i \end{bmatrix}\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} X + iY \\ X - iY \end{bmatrix}$. Its mean is zero, and according to Section 7 its variance is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} Z \\ \bar Z \end{bmatrix} &= \begin{bmatrix} 1 & i \\ 1 & -i \end{bmatrix}\begin{bmatrix} \sigma^2 & \rho\sigma\tau \\ \rho\sigma\tau & \tau^2 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -i & i \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2 + \tau^2 - i(\rho\sigma\tau - \rho\sigma\tau) & \sigma^2 - \tau^2 + i(\rho\sigma\tau + \rho\sigma\tau) \\ \sigma^2 - \tau^2 - i(\rho\sigma\tau + \rho\sigma\tau) & \sigma^2 + \tau^2 + i(\rho\sigma\tau - \rho\sigma\tau) \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2 + \tau^2 & \sigma^2 - \tau^2 + 2i\rho\sigma\tau \\ \sigma^2 - \tau^2 - 2i\rho\sigma\tau & \sigma^2 + \tau^2 \end{bmatrix}
\end{aligned}$$
We will say little about non-zero correlation ($\rho \ne 0$); but at this point a digression on complex correlation is apt. The coefficient of correlation between Z and its conjugate is:

$$\rho_{Z\bar Z} = \frac{\sigma^2 - \tau^2 + 2i\rho\sigma\tau}{\sigma^2 + \tau^2}, \qquad \rho_{\bar Z Z} = \frac{\sigma^2 - \tau^2 - 2i\rho\sigma\tau}{\sigma^2 + \tau^2} = \overline{\rho_{Z\bar Z}}$$
$$0 \le \rho_{Z\bar Z}\,\overline{\rho_{Z\bar Z}} = \rho_{Z\bar Z}\,\rho_{\bar Z Z} = \frac{\left(\sigma^2 - \tau^2\right)^2 + 4\rho^2\sigma^2\tau^2}{\left(\sigma^2 + \tau^2\right)^2} \le \frac{\left(\sigma^2 - \tau^2\right)^2 + 4(1)\sigma^2\tau^2}{\left(\sigma^2 + \tau^2\right)^2} = \frac{\left(\sigma^2 + \tau^2\right)^2}{\left(\sigma^2 + \tau^2\right)^2} = 1$$
So, the magnitude of complex correlation is not greater than unity. The imaginary part of the correlation is zero unless some correlation exists between the real and imaginary parts of the underlying normal bivariate. Two limiting cases are of interest: $\tau^2 \to 0^+$ and $\sigma^2 \to 0^+$. In the first case, $\bar Z \to Z$ in a statistical sense, and the correlation approaches one. In the second case, $\bar Z \to -Z$, and the correlation approaches negative one.
Applying the lognormal formulas of Section 8, and noting that $\mathrm E[W] = e^{C/2} = e^{(\sigma^2-\tau^2)/2}\,e^{i\rho\sigma\tau}$:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} W \\ \bar W \end{bmatrix} &= \mathrm E\begin{bmatrix} W \\ \bar W \end{bmatrix}\mathrm E\begin{bmatrix} W \\ \bar W \end{bmatrix}^* \circ \left(e^{\mathrm{Var}\begin{bmatrix} Z \\ \bar Z \end{bmatrix}} - 1_{2\times 2}\right) \\
&= e^{\sigma^2-\tau^2}\begin{bmatrix} 1 & e^{2i\rho\sigma\tau} \\ e^{-2i\rho\sigma\tau} & 1 \end{bmatrix} \circ \begin{bmatrix} e^{\sigma^2+\tau^2} - 1 & e^{\sigma^2-\tau^2+2i\rho\sigma\tau} - 1 \\ e^{\sigma^2-\tau^2-2i\rho\sigma\tau} - 1 & e^{\sigma^2+\tau^2} - 1 \end{bmatrix} \\
&= \begin{bmatrix} e^{2\sigma^2} - e^{\sigma^2-\tau^2} & e^{2\sigma^2-2\tau^2+4i\rho\sigma\tau} - e^{\sigma^2-\tau^2}e^{2i\rho\sigma\tau} \\ e^{2\sigma^2-2\tau^2-4i\rho\sigma\tau} - e^{\sigma^2-\tau^2}e^{-2i\rho\sigma\tau} & e^{2\sigma^2} - e^{\sigma^2-\tau^2} \end{bmatrix}
\end{aligned}$$
In the first case above, as $\tau^2 \to 0^+$, $\mathrm E\begin{bmatrix} W \\ \bar W \end{bmatrix} \to e^{\sigma^2/2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\mathrm{Var}\begin{bmatrix} W \\ \bar W \end{bmatrix} \to e^{\sigma^2}\left(e^{\sigma^2} - 1\right)\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. Since the imaginary part Y becomes probability-limited to its mean of zero, the complex lognormal degenerates to the real-valued $W = e^X$. The limiting result is oblivious to the underlying correlation ρ, since $\bar W \to W$.
In the second case, as $\sigma^2 \to 0^+$, $E\begin{bmatrix} W \\ \bar W \end{bmatrix} \to e^{-\tau^2/2}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\operatorname{Var}\begin{bmatrix} W \\ \bar W \end{bmatrix} \to e^{-\tau^2}\begin{bmatrix} e^{\tau^2} - 1 & e^{-\tau^2} - 1 \\ e^{-\tau^2} - 1 & e^{\tau^2} - 1 \end{bmatrix}$. As in the first case, both $E[W] = E[\bar W]$ and the underlying correlation ρ has disappeared. Nevertheless, the variance shows W and its conjugate to differ; in fact, their correlation is the real-valued
\[
\rho_{W\bar W} = \frac{e^{-\tau^2} - 1}{e^{\tau^2} - 1} = -e^{-\tau^2} = -E[W]^2.
\]
Since $\tau^2 > 0$, $-1 < \rho_{W\bar W} < 0$ and $0 < E[W] = E[\bar W] < 1$.
Both these cases are understandable from the “geometry” of $W = e^Z = e^{X+iY} = e^X e^{iY}$. The complex exponential function is the basis of polar coordinates; $e^X$ is the magnitude of W, and Y is the angle of W in radians counterclockwise from the real axis of the complex plane. Imagine a cannon whose angle and range can be set. In the first case, the angle is fixed at zero, but the range is variable. This makes for a lognormal distribution along the positive real axis. In the second case, the cannon’s angle varies, but its range is fixed at $e^0 = 1$. This makes all the shots land on the complex unit circle; hence, their mean lies within the circle, i.e., $|E[W]| < 1$. Moreover, the symmetry of Y as $N(0, \tau^2)$-distributed guarantees $E[W]$ to fall on the real axis, or $-1 < E[W] < 1$. Furthermore, since the normal density function strictly decreases in both directions from the mean, more shots land to the right of the imaginary axis than to the left, so $0 < E[W] = e^{-\tau^2/2} < 1$. A “right-handed” cannon, or a cannon whose angle is measured clockwise from the real axis, fires $\bar W = e^X e^{-iY}$ shots.
A shot from an unrestricted cannon will “almost surely” not land on the real axis. 12 If we desire negative values from the complex lognormal random variable, as a practical matter we must extract them from its real or imaginary parts, e.g., $U = \operatorname{Re}(W)$. One can see in the second case that as $\tau^2$ grows larger, so too grows the probability that $U < 0$. As $\tau^2 \to \infty$, the probability approaches one half. In the limit, the shots are uniformly distributed around the complex unit circle, and U follows the arcsine density
\[
f_U(u) = \frac{1}{\pi\sqrt{1 - u^2}}, \quad \text{for } -1 \le u \le 1. \text{ 13}
\]
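The arcsine limit is easy to check by simulation. For Θ uniform on $[0, 2\pi)$, $U = \cos\Theta$ has distribution function $F_U(u) = 1 - \arccos(u)/\pi$, whose derivative is the arcsine density above; the sketch below (sample size is an arbitrary choice) compares the empirical distribution with this closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2*np.pi, size=1_000_000)
U = np.cos(theta)                      # real part of a uniform unit-circle shot

# About half of the shots land left of the imaginary axis
assert abs(np.mean(U < 0) - 0.5) < 0.005

# Empirical CDF vs the arcsine CDF F(u) = 1 - arccos(u)/pi
for u in (-0.9, -0.5, 0.0, 0.5, 0.9):
    F = 1 - np.arccos(u)/np.pi
    assert abs(np.mean(U <= u) - F) < 0.005
```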
This suggests a third case, in which $\tau^2 \to \infty$ while $\sigma^2$ remains at some positive amount. An intriguing feature of complex variables is that infinite variance in Y leads to a uniform distribution of $e^{iY}$ over the complex unit circle. 14 In this case:
\[
E\begin{bmatrix} W \\ \bar W \end{bmatrix}
= \lim_{\tau^2\to\infty}\begin{bmatrix} e^{(\sigma^2 - \tau^2 + 2i\rho\sigma\tau)/2} \\ e^{(\sigma^2 - \tau^2 - 2i\rho\sigma\tau)/2} \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
\[
\operatorname{Var}\begin{bmatrix} W \\ \bar W \end{bmatrix}
= \lim_{\tau^2\to\infty}\begin{bmatrix}
e^{2\sigma^2} - e^{\sigma^2 - \tau^2} & e^{2\sigma^2 - 2\tau^2 + 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2}e^{2i\rho\sigma\tau} \\
e^{2\sigma^2 - 2\tau^2 - 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2}e^{-2i\rho\sigma\tau} & e^{2\sigma^2} - e^{\sigma^2 - \tau^2}
\end{bmatrix}
= \begin{bmatrix} e^{2\sigma^2} & 0 \\ 0 & e^{2\sigma^2} \end{bmatrix}
\]
Again, ρ has disappeared from the limiting distribution; but in this case $\rho_{W\bar W} = 0$.
12 For an event almost surely to happen means that its probability is unity; for an event almost surely not to happen
means that its probability is zero. The latter case means not that the event will not happen, but rather that the event has
zero probability mass. For example, if X ~ Uniform[0, 1], Prob[X=½] = 0. So X almost surely does not equal ½, even
though ½ is as possible as any other number in the interval.
13 For more on this bimodal Arcsine(-1, 1) distribution see Wikipedia, “Arcsine distribution.”
14 The next section expands on this important subject. “Infinite variance in Y” means “as the variance of Y approaches infinity.” It does not mean that $e^{iY}$ is uniform for a variable Y whose variance is infinite, e.g., for a Pareto random variable whose shape parameter is less than or equal to two.
15 Cf. Halliwell [2013] for a discussion on the right tails of the lognormal and other loss distributions.
In practical work with $U = \operatorname{Re}(W)$, 16 the angular part $e^{iY}$ will be more important than the lognormal range $e^X$. For example, one who wanted the tendency for the larger magnitudes of $U = \operatorname{Re}(W)$ to be positive might set the mean of Y at $-\pi/2$ and the correlation ρ to some positive value. Thus, greater than average values of Y, angling off into quadrants 4 and 1 of the complex plane, would correlate with larger than average values of X and hence of $e^X$. Of course, negative values of U would still occur, but they could be made tolerably rare. Equivalently, one could set the mean of Y at $\pi/2$ and the correlation ρ to some negative value. As a second example, one who wanted negative values of U to be less frequent than positive might set both the mean of Y and ρ to zero, and set the variance of Y so that $\operatorname{Prob}[|Y| > \pi/2]$ is desirably small. Some distributions of U for $\tau^2 \gg \sigma^2$ are bimodal, as in the specialized case $\sigma^2 \to 0^+$ and $\tau^2 \to \infty$. But less extreme parameters would result in unimodal distributions.
In the previous section we claimed that as the variance τ 2 of the normal random variable Y
approaches infinity, e iY approaches a uniform distribution over the complex unit circle. The
explanation and justification of this claim in this section prepare for an important implication in the
next.
[ ]
Let real-valued random variable Y be distributed as N μ, σ 2 , and let W = e iY . According to the
[ ]
moment-generating-formula of Section 8, M Y (it ) = E e itY = e itμ + (it ) σ
2 2
2
= e itμ −t
2
σ2 2
. Although the
16 In the absence of an analytic distribution, practical work with the complex lognormal would seem to require
simulating its values from the underlying normal distribution.
formula applies to complex values of t, here we’ll restrict it to real values. With $t \in \Re$, $M_Y(it)$ is the characteristic function of Y, whose limit as $\sigma^2 \to \infty$ is:
\[
\lim_{\sigma^2\to\infty} M_Y(it)
= \lim_{\sigma^2\to\infty} e^{it\mu - t^2\sigma^2/2}
= e^{it\mu}\lim_{\sigma^2\to\infty} e^{-t^2\sigma^2/2}
= \delta_{t0}
= \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{if } t \neq 0 \end{cases}
\]
It is noteworthy, and indicative of a uniformity of some sort, that µ drops out of the result.
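The collapse of the characteristic function for $t \neq 0$ is visible in a direct computation (the values of µ and t are arbitrary illustrative choices):

```python
import numpy as np

def cf_normal(t, mu, sigma2):
    """Characteristic function M_Y(it) = exp(i*t*mu - t^2*sigma2/2) of N(mu, sigma2)."""
    return np.exp(1j*t*mu - t**2*sigma2/2)

mu = 2.0
mags = [abs(cf_normal(0.5, mu, s2)) for s2 in (1.0, 10.0, 100.0)]

assert cf_normal(0.0, mu, 100.0) == 1.0   # at t = 0 the value is always unity
assert mags[0] > mags[1] > mags[2]        # strictly shrinking in sigma^2
assert mags[2] < 1e-5                     # e^{-12.5}, effectively zero
```

Note that the magnitude $e^{-t^2\sigma^2/2}$ is independent of µ, which only rotates the phase.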
Next, let real-valued random variable Θ be uniformly distributed over $[a, a + 2\pi n]$, where n is a positive integer. Then:
\[
M_\Theta(it) = E[e^{it\Theta}]
= \int_{\theta=a}^{a+2\pi n} e^{it\theta}\,\frac{d\theta}{2\pi n}
= \left.\frac{e^{it\theta}}{2\pi i t n}\right|_{a}^{a+2\pi n}
= e^{ita}\,\frac{e^{2\pi i t n} - 1}{2\pi i t n}
= \begin{cases}
1 & \text{if } t = 0 \\
0 & \text{if } tn = \pm 1, \pm 2, \ldots \\
\neq 0 & \text{if } tn \text{ not integral}
\end{cases}
\]
\[
\lim_{n\to\infty} M_\Theta(it)
= e^{ita}\lim_{n\to\infty}\frac{e^{2\pi i t n} - 1}{2\pi i t n}
= \delta_{t0}
= \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{if } t \neq 0 \end{cases}
\]
Hence, $\lim_{n\to\infty} M_\Theta(it) = \lim_{\sigma^2\to\infty} M_Y(it) = \delta_{t0}$. The equality of the limits of the characteristic functions of the random variables implies the identity of the limits of their distributions; hence, the diffuse distributions are “the same.” 17 Indeed, for the limit to be $\delta_{t0}$ it is not required that n be an integer. But for $\Theta \sim U[a, a + 2\pi n]$, the jth integral moment of $W = e^{i\Theta}$ is:
\[
E[W^j] = E[e^{ij\Theta}]
= \begin{cases}
1 & \text{if } j = 0 \\
0 & \text{if } jn = \pm 1, \pm 2, \ldots \\
\neq 0 & \text{if } jn \text{ not integral}
\end{cases}
\]
17 Quotes are around ‘the same’ because the limiting distributions are not proper distributions. The notion of diffuse distributions comes from Venter [1996, pp. 406-410], who shows there how different diffuse distributions result in
So if n is an integer, jn will be an integer, and all the integral moments of W will be zero, except for the zeroth. Therefore, the integral moments of $W = e^{i\Theta}$ are invariant to n, as long as the n in $2\pi n$, the width of the interval of Θ, is a whole number. Hence, although we hereby define the unit-circle random variable as $W = e^{i\Theta}$ for $\Theta \sim U[0, 2\pi]$, this is out of convenience rather than out of necessity. The probability for $e^{i\Theta}$ to be in an arc of this circle of length l equals $l/2\pi$.
Moreover, $E[\bar W^j] = \overline{E[W^j]} = \overline{\delta_{j0}} = \delta_{j0} = E[W^j]$. Alternatively, $E[\bar W^j] = E[e^{-ij\Theta}] = \delta_{(-j)0} = \delta_{j0}$. And
\[
\operatorname{Var}\begin{bmatrix} W \\ \bar W \end{bmatrix}
= E\left[\begin{bmatrix} W \\ \bar W \end{bmatrix}\begin{bmatrix} \bar W & W \end{bmatrix}\right]
= \begin{bmatrix} E[W\bar W] & E[WW] \\ E[\bar W\bar W] & E[\bar W W] \end{bmatrix}
= \begin{bmatrix} \delta_{11} & \delta_{20} \\ \delta_{02} & \delta_{11} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2
\]
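A simulation sketch (the sample size is our own choice) confirms that the mixed moments $E[W^j \bar W^k]$ of the unit-circle variable behave as the Kronecker delta $\delta_{jk}$, and hence that the augmented variance is $I_2$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(0.0, 2*np.pi, size=1_000_000)
W = np.exp(1j*theta)

# Mixed moments E[W^j conj(W)^k] = E[e^{i(j-k)Theta}] = delta_{jk}
for j in range(3):
    for k in range(3):
        moment = np.mean(W**j * np.conj(W)**k)
        target = 1.0 if j == k else 0.0
        assert abs(moment - target) < 0.005

# Zero mean and (nearly) identity augmented variance
aug = np.array([[np.mean(W*np.conj(W)),          np.mean(W*W)],
                [np.mean(np.conj(W)*np.conj(W)), np.mean(np.conj(W)*W)]])
assert np.allclose(aug, np.eye(2), atol=0.005)
```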
Hence, $W = e^{i\Theta}$ for $\Theta \sim U[0, 2\pi]$ is not just a unit-circle random variable; having zero mean and an identity augmented variance, it is a standardized complex random variable.

17 (continued) different Bayesian estimates. But here every continuous random variable Y diffuses through the periodicity of $e^{iY}$ into the same limiting distribution, viz., the Kronecker $\delta_{t0}$ (note 31).
Multiplying W by a complex constant $\alpha \neq 0$ affects the radius of the random variable, whose jkth mixed moment is:
\[
E\left[(\alpha W)^j\left(\overline{\alpha W}\right)^k\right]
= \alpha^j\bar\alpha^k E[W^j\bar W^k]
= \alpha^j\bar\alpha^k\delta_{jk}
= \begin{cases} (\alpha\bar\alpha)^j & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}
\]
αW
The augmented variance is Var = α α Var [W ] = α α I 2 . One may consider α as an instance of a
αW
complex random variable Α. Due to the independence of Α from W, the jkth mixed moment of
ΑW is [
E ( ΑW ) ΑW
j
( ) ] = E [Α
k j k
Α E W jW][ k
] = E [Α j k
]
Α δ jk = E Α Α [( ) ]δj
jk . Its augmented
AW
variance is Var [ ] [ ]
= E ΑΑ Var [W ] = {Var [A] + E [Α]E Α }Var [W ]. Unlike the one-dimensional
AW
W, ΑW can cover the whole complex plane. However, like W, it too possesses the desirable
[
property that E ( ΑW ) = δ j 0 . 18
j
]
Deterministic estimates of mean values would satisfy many users of actuarial analyses. Stochastic advances in actuarial science over the last few decades notwithstanding, much actuarial work remains deterministic, in the sense that it assumes the expectation of a function of a random variable equals the function of the expectation of the random variable; in symbols, $E[f(X)] = f(E[X])$.
Advances in computing hardware and software, as well as increased technical sophistication, have
made determinism more avoidable and less acceptable. However, the complex unit-circular random
variable provides a habitat for the survival of determinism. To see this, let f be analytic over the
domain of complex random variable Z. From Cauchy’s Integral Formula (Havil [2003, Appendix
D.8 and D.9]) it follows that within the domain of Z, f can be expressed as a convergent series
$f(z) = a_0 + a_1 z + \cdots = a_0 + \sum_{j=1}^{\infty} a_j z^j$. Taking the expectation, we have:
\[
E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E[Z^j]
\]
But if for every positive integer j, $E[Z^j] = E[Z]^j$, then:
\[
E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E[Z^j] = a_0 + \sum_{j=1}^{\infty} a_j E[Z]^j = f(E[Z])
\]
Therefore, determinism conveniently works for analytic functions of random variables whose integral moments are the corresponding powers of their means. Now a real-valued random variable whose moments are powers of its mean would have the characteristic function:
\[
M_X(it) = E[e^{itX}]
= 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E[X^j]
= 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E[X]^j
= e^{itE[X]} = M_{E[X]}(it)
\]
This is the characteristic function of the “deterministic” random variable, i.e., the random variable
whose probability is massed at one point, its mean. So determinism with real-valued random
variables requires “deterministic” random variables. But some complex random variables, such as
the unit-circle, have the property $E[Z^j] = E[Z]^j$ without being deterministic.
In fact, when $E[Z^j] = E[Z]^j$, not only is $E[f(Z)] = f(E[Z])$. For positive integer k, $f^k(z)$ is as analytic as f itself; hence, $E[f^k(Z)] = f^k(E[Z])$. So the determinism with these complex random variables extends to the integral moments of $f(Z)$.
In Section 10 we saw that for the unit-circle random variable $W = e^{i\Theta}$ and for $j = \pm 1, \pm 2, \ldots$, $E[W^{-j}] = E[W^j] = 0 = E[W]^j$. Can determinism extend to non-analytic functions which involve the negative moments? For example, let $g(z) = 1/(\eta - z)$, for some complex $\eta \neq 0$. The function is singular at $z = \eta$; but within the disc $\{z : |z/\eta| < 1\} = \{z : |z| < |\eta|\}$ the function equals the convergent series:
\[
g(z) = \frac{1}{\eta - z}
= \frac{1}{\eta}\cdot\frac{1}{1 - \dfrac{z}{\eta}}
= \frac{1}{\eta}\left(1 + \frac{z}{\eta} + \frac{z^2}{\eta^2} + \cdots\right)
= \frac{1}{\eta} + \frac{z}{\eta^2} + \frac{z^2}{\eta^3} + \cdots
\]
Outside the disc, or for $\{z : |z/\eta| > 1\} = \{z : |z| > |\eta|\}$, another convergent series represents the function:
\[
g(z) = \frac{1}{\eta - z}
= -\frac{1}{z}\cdot\frac{1}{1 - \dfrac{\eta}{z}}
= -\frac{1}{z}\left(1 + \frac{\eta}{z} + \frac{\eta^2}{z^2} + \cdots\right)
= -\frac{1}{z} - \frac{\eta}{z^2} - \frac{\eta^2}{z^3} - \cdots
\]
According to the first series:
\[
E[g(W)] = E\left[\frac{1}{\eta} + \frac{W}{\eta^2} + \frac{W^2}{\eta^3} + \cdots\right]
= \frac{1}{\eta} + \frac{E[W]}{\eta^2} + \frac{E[W^2]}{\eta^3} + \cdots
= \frac{1}{\eta} + \frac{0}{\eta^2} + \frac{0}{\eta^3} + \cdots
= \frac{1}{\eta}
\]
But according to the second series:
\[
E[g(W)] = E\left[-\frac{1}{W} - \frac{\eta}{W^2} - \frac{\eta^2}{W^3} - \cdots\right]
= -E[W^{-1}] - \eta E[W^{-2}] - \eta^2 E[W^{-3}] - \cdots
= -0 - \eta\cdot 0 - \eta^2\cdot 0 - \cdots
= 0
\]
Both answers are correct, each within its series’ region of convergence; however, only the first satisfies the deterministic equation $E[g(W)] = g(E[W]) = g(0) = 1/\eta$.
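A simulation makes the two answers concrete: for η outside the unit circle the sample mean of $1/(\eta - W)$ settles at $1/\eta = g(E[W])$, while for η inside it settles at zero. The values of η below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
W = np.exp(1j * rng.uniform(0.0, 2*np.pi, size=1_000_000))

def E_g(eta):
    """Monte Carlo estimate of E[1/(eta - W)] for the unit-circle W."""
    return np.mean(1.0/(eta - W))

eta_out = 2.0 + 1.0j          # |eta| > 1: deterministic equation holds
assert abs(E_g(eta_out) - 1.0/eta_out) < 1e-3

eta_in = 0.3 - 0.2j           # |eta| < 1: the expectation is zero
assert abs(E_g(eta_in)) < 5e-3
```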
To understand why the answer depends on whether η is inside or outside the complex unit circle, let us evaluate E[g(W)] directly:
\[
E[g(W)] = E\left[\frac{1}{\eta - W}\right] = E\left[\frac{1}{\eta - e^{i\Theta}}\right]
= \int_{\theta=0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi}
\]
The next step is to transform from θ into $z = e^{i\theta}$. So $dz = ie^{i\theta}d\theta = iz\,d\theta$, and the line integral runs counterclockwise around the complex unit circle C:
\[
\begin{aligned}
E[g(W)] &= \int_{\theta=0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi}
= \oint_C \frac{1}{\eta - z}\,\frac{dz}{iz\cdot 2\pi}
= \frac{1}{2\pi i}\oint_C \frac{dz}{z(\eta - z)} \\
&= \frac{1}{2\pi i}\oint_C\left(\frac{1}{z} + \frac{1}{\eta - z}\right)\frac{dz}{\eta}
= \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C\frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta}\right)
\end{aligned}
\]
Now the value of each of these integrals is one if its singularity is within the unit circle C, and zero if it is not. 19 Of course, the singularity of the first integral at $z = 0$ lies within C; hence, its value is one. The second integral’s singularity at $z = \eta$ lies within C if and only if $|\eta| < 1$. Therefore:
\[
E[g(W)] = \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C\frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta}\right)
= g(E[W])\cdot\begin{cases} 1 & \text{if } |\eta| > 1 \\ 0 & \text{if } |\eta| < 1 \end{cases}
\]
So the deterministic equation will hold for one of the Laurent series according to which the domain of the non-analytic function is divided into regions of convergence. Fascinating enough is how the function
\[
\varphi(\eta) = \frac{1}{2\pi i}\oint_C\frac{dz}{z - \eta}
= \begin{cases} 1 & \text{if } |\eta| < 1 \\ 0 & \text{if } |\eta| > 1 \end{cases}
\]
serves as the indicator of a state, viz., the state of being inside the unit circle.
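The indicator can be computed numerically: parameterizing C by $z = e^{i\theta}$, with $dz = iz\,d\theta$, reduces $\frac{1}{2\pi i}\oint_C \frac{dz}{z-\eta}$ to the average of $z/(z - \eta)$ over a uniform grid of angles, which the periodic rectangle rule evaluates almost exactly (grid size is an arbitrary choice):

```python
import numpy as np

N = 4096
theta = (np.arange(N) + 0.5) * 2*np.pi / N    # midpoint grid on [0, 2*pi)
z = np.exp(1j*theta)                          # points on the unit circle C

def phi(eta):
    # (1/(2*pi*i)) * contour integral of dz/(z - eta); with dz = i*z*dtheta
    # the integral reduces to the average of z/(z - eta) over the angle grid.
    return np.mean(z/(z - eta))

assert abs(phi(0.5 + 0.2j) - 1.0) < 1e-9      # eta inside C: indicator is one
assert abs(phi(1.8 - 0.6j)) < 1e-9            # eta outside C: indicator is zero
```

For a smooth periodic integrand the rectangle rule converges spectrally, which is why so coarse a grid already gives essentially exact zeros and ones.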
19 Technically, the integral has no value if the singularity lies on C; but there are some practical advantages for “splitting the difference” in that case.
complex numbers. A general real-valued form of such models is presented and derived in Halliwell [1997]:
\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\beta + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix},
\qquad
\operatorname{Var}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}
\]
The subscript ‘1’ denotes observations, ‘2’ denotes predictions. Vector y1 is observed; the whole
design matrix X is hypothesized, as well as the fourfold ‘Σ’ variance structure. Although the
variance structure may be non-negative-definite (NND), the variance of the observations Σ11 must
be positive-definite (PD). Also, the observation design X 1 must be of full column rank. The last
two requirements ensure the existence of the inverses $\Sigma_{11}^{-1}$ and $\left(X_1'\Sigma_{11}^{-1}X_1\right)^{-1}$. The best linear unbiased predictor of $y_2$ is $\hat y_2 = X_2\hat\beta + \Sigma_{21}\Sigma_{11}^{-1}\left(y_1 - X_1\hat\beta\right)$. The variance of prediction error $y_2 - \hat y_2$ is
\[
\operatorname{Var}[y_2 - \hat y_2]
= \left(X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1\right)\operatorname{Var}[\hat\beta]\left(X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1\right)'
+ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.
\]
Embedded in these formulas is the estimator
\[
\hat\beta = \left(X_1'\Sigma_{11}^{-1}X_1\right)^{-1}X_1'\Sigma_{11}^{-1}y_1
= \operatorname{Var}[\hat\beta]\cdot X_1'\Sigma_{11}^{-1}y_1.
\]
For the purpose of introducing complex numbers into the linear statistical model we will concern ourselves here only with the estimation of the parameter β. So we drop the subscripts ‘1’ and ‘2’ and simplify the observation as $y = X\beta + e$, where $\operatorname{Var}[e] = \Gamma$. Again, X must be of full column rank and Γ must be Hermitian PD. According to Section 4, transjugation is to complex matrices what transposition is to real-valued matrices. Therefore, the short answer for a complex model is:
\[
\hat\beta = \left(X^*\Gamma^{-1}X\right)^{-1}X^*\Gamma^{-1}y
= \operatorname{Var}[\hat\beta]\cdot X^*\Gamma^{-1}y.
\]
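The short answer is an ordinary generalized-least-squares formula with transjugation in place of transposition, and it can be sketched directly in a numerical library. All data below are simulated stand-ins, and spherical errors ($\Gamma = I$) are assumed for simplicity:

```python
import numpy as np

rng = np.random.default_rng(4)
t, k = 200, 3
X = rng.normal(size=(t, k)) + 1j*rng.normal(size=(t, k))
beta = np.array([1.0 + 2.0j, -0.5j, 3.0])
e = (rng.normal(size=t) + 1j*rng.normal(size=t))/np.sqrt(2)
y = X @ beta + e

Gamma = np.eye(t)                  # assume spherical errors for this sketch
Gi = np.linalg.inv(Gamma)

# beta-hat = (X* Gamma^-1 X)^-1 X* Gamma^-1 y
beta_hat = np.linalg.solve(X.conj().T @ Gi @ X, X.conj().T @ Gi @ y)

# With Gamma = I this reduces to ordinary complex least squares
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_ls)
assert np.linalg.norm(beta_hat - beta) < 0.5   # recovers beta up to noise
```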
However, deriving the solution from the double-real representation in Section 3 will deepen the understanding:
\[
\begin{bmatrix} y_r & -y_i \\ y_i & y_r \end{bmatrix}
= \begin{bmatrix} X_r & -X_i \\ X_i & X_r \end{bmatrix}
\begin{bmatrix} \beta_r & -\beta_i \\ \beta_i & \beta_r \end{bmatrix}
+ \begin{bmatrix} e_r & -e_i \\ e_i & e_r \end{bmatrix}
\]
All the vectors and matrices in this form are real-valued. The subscripts ‘r’ and ‘i’ denote the real and imaginary parts of y, X, β, and e. Due to the redundancy of the double-real representation, we may retain just the left columns:
\[
\begin{bmatrix} y_r \\ y_i \end{bmatrix}
= \begin{bmatrix} X_r & -X_i \\ X_i & X_r \end{bmatrix}
\begin{bmatrix} \beta_r \\ \beta_i \end{bmatrix}
+ \begin{bmatrix} e_r \\ e_i \end{bmatrix}
\]
Note that if X is real-valued, then $X_i = 0$, and $y_r$ and $y_i$ become two “data panels,” each with its own parameter vector.
Now let $\Xi_t = \begin{bmatrix} I_t & iI_t \\ I_t & -iI_t \end{bmatrix}$, the augmentation matrix of Section 7, where t is the number of observations. Since the augmentation matrix is non-singular, premultiplying the left-column form by it loses no information:
\[
\begin{bmatrix} I_t & iI_t \\ I_t & -iI_t \end{bmatrix}\begin{bmatrix} y_r \\ y_i \end{bmatrix}
= \begin{bmatrix} I_t & iI_t \\ I_t & -iI_t \end{bmatrix}\begin{bmatrix} X_r & -X_i \\ X_i & X_r \end{bmatrix}\begin{bmatrix} \beta_r \\ \beta_i \end{bmatrix}
+ \begin{bmatrix} I_t & iI_t \\ I_t & -iI_t \end{bmatrix}\begin{bmatrix} e_r \\ e_i \end{bmatrix}
\]
\[
\begin{bmatrix} y_r + iy_i \\ y_r - iy_i \end{bmatrix}
= \begin{bmatrix} X_r + iX_i & -X_i + iX_r \\ X_r - iX_i & -X_i - iX_r \end{bmatrix}\begin{bmatrix} \beta_r \\ \beta_i \end{bmatrix}
+ \begin{bmatrix} e_r + ie_i \\ e_r - ie_i \end{bmatrix}
\]
\[
\begin{bmatrix} y \\ \bar y \end{bmatrix}
= \begin{bmatrix} X & iX \\ \bar X & -i\bar X \end{bmatrix}\begin{bmatrix} \beta_r \\ \beta_i \end{bmatrix} + \begin{bmatrix} e \\ \bar e \end{bmatrix}
= \begin{bmatrix} X\beta_r + iX\beta_i \\ \bar X\beta_r - i\bar X\beta_i \end{bmatrix} + \begin{bmatrix} e \\ \bar e \end{bmatrix}
= \begin{bmatrix} X\beta \\ \bar X\bar\beta \end{bmatrix} + \begin{bmatrix} e \\ \bar e \end{bmatrix}
= \begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}\begin{bmatrix} \beta \\ \bar\beta \end{bmatrix} + \begin{bmatrix} e \\ \bar e \end{bmatrix}
\]
e
insight is that Var is an augmented variance, whose general form according to Section 6 is
e
e Γ C
Var = . Therefore, the general form of the observation of a complex linear model is
e C Γ
y X 0 β e e Γ C
y = 0 X β + e , where Var e = C Γ . Not only is y as observable as y, but also β
e
complex linear statistical model does not require to be “proper complex,” as defined in Section
e
7.
Since X is of full column rank, so too must be $\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}$. And since Γ is Hermitian PD, both it and its conjugate $\bar\Gamma$ are invertible. But the general form of the observation requires $\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$ to be Hermitian PD, hence invertible. A consequence is that both the “determinant” forms $\Gamma - C\bar\Gamma^{-1}\bar C$ and $\bar\Gamma - \bar C\Gamma^{-1}C$ are Hermitian PD and invertible. With this background it can be shown, and the reader should verify, that
\[
\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^{-1}
= \begin{bmatrix} \mathrm{H} & \mathrm{K} \\ \bar{\mathrm{K}} & \bar{\mathrm{H}} \end{bmatrix},
\qquad \text{where } \mathrm{H} = \left(\Gamma - C\bar\Gamma^{-1}\bar C\right)^{-1} \text{ and } \mathrm{K} = -\Gamma^{-1}C\bar{\mathrm{H}}.
\]
The solution of the complex linear model $\begin{bmatrix} y \\ \bar y \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}\begin{bmatrix} \beta \\ \bar\beta \end{bmatrix} + \begin{bmatrix} e \\ \bar e \end{bmatrix}$, where $\operatorname{Var}\begin{bmatrix} e \\ \bar e \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$, is:
\[
\begin{aligned}
\begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix}
&= \left(\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}^{*}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^{-1}\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}\right)^{-1}
\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}^{*}\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^{-1}\begin{bmatrix} y \\ \bar y \end{bmatrix} \\
&= \left(\begin{bmatrix} X^{*} & 0 \\ 0 & X' \end{bmatrix}\begin{bmatrix} \mathrm{H} & \mathrm{K} \\ \bar{\mathrm{K}} & \bar{\mathrm{H}} \end{bmatrix}\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}\right)^{-1}
\begin{bmatrix} X^{*} & 0 \\ 0 & X' \end{bmatrix}\begin{bmatrix} \mathrm{H} & \mathrm{K} \\ \bar{\mathrm{K}} & \bar{\mathrm{H}} \end{bmatrix}\begin{bmatrix} y \\ \bar y \end{bmatrix} \\
&= \begin{bmatrix} X^{*}\mathrm{H}X & X^{*}\mathrm{K}\bar X \\ X'\bar{\mathrm{K}}X & X'\bar{\mathrm{H}}\bar X \end{bmatrix}^{-1}
\begin{bmatrix} X^{*}\mathrm{H}y + X^{*}\mathrm{K}\bar y \\ X'\bar{\mathrm{K}}y + X'\bar{\mathrm{H}}\bar y \end{bmatrix}
\end{aligned}
\]
And $\operatorname{Var}\begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix} = \begin{bmatrix} X^{*}\mathrm{H}X & X^{*}\mathrm{K}\bar X \\ X'\bar{\mathrm{K}}X & X'\bar{\mathrm{H}}\bar X \end{bmatrix}^{-1}$, which must exist since it is a quadratic form based on the Hermitian PD $\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}^{-1}$ and the full-column-rank $\begin{bmatrix} X & 0 \\ 0 & \bar X \end{bmatrix}$. Reformulate this as the pair of normal equations:
\[
\begin{bmatrix} X^{*}\mathrm{H}X & X^{*}\mathrm{K}\bar X \\ X'\bar{\mathrm{K}}X & X'\bar{\mathrm{H}}\bar X \end{bmatrix}
\begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix}
= \begin{bmatrix} X^{*}\mathrm{H}y + X^{*}\mathrm{K}\bar y \\ X'\bar{\mathrm{K}}y + X'\bar{\mathrm{H}}\bar y \end{bmatrix}
\]
The conjugates of the two equations in $\hat\beta$ and $\hat{\bar\beta}$ are the same equations in $\overline{\hat{\bar\beta}}$ and $\overline{\hat\beta}$. Therefore, $\hat{\bar\beta} = \overline{\hat\beta}$. It is well known that the estimator of a linear function of a random variable is the linear function of the estimator of the random variable. But conjugation is not a linear function. Nevertheless, we have just proven that the estimator of the conjugate is the conjugate of the estimator.
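The conjugate-estimator property can itself be verified numerically. Below we manufacture an augmented variance with the required $\begin{bmatrix} \Gamma & C \\ \bar C & \bar\Gamma \end{bmatrix}$ pattern by augmenting an arbitrary real symmetric PD matrix, solve the augmented normal equations, and check that the second half of the solution is the conjugate of the first. All inputs are simulated stand-ins:

```python
import numpy as np

rng = np.random.default_rng(5)
t, k = 60, 2
It = np.eye(t)
Xi = np.block([[It, 1j*It], [It, -1j*It]])      # augmentation matrix of Section 7

# Augmenting a real symmetric PD matrix yields the [[Gamma, C], [C-bar, Gamma-bar]] pattern
A = rng.normal(size=(2*t, 2*t))
M = Xi @ (A @ A.T + np.eye(2*t)) @ Xi.conj().T

X = rng.normal(size=(t, k)) + 1j*rng.normal(size=(t, k))
y = rng.normal(size=t) + 1j*rng.normal(size=t)

D = np.block([[X, np.zeros((t, k))],
              [np.zeros((t, k)), np.conj(X)]])   # block-diagonal design
ya = np.concatenate([y, np.conj(y)])             # augmented observation

Mi = np.linalg.inv(M)
b = np.linalg.solve(D.conj().T @ Mi @ D, D.conj().T @ Mi @ ya)

beta_hat, beta_bar_hat = b[:k], b[k:]
assert np.allclose(beta_bar_hat, np.conj(beta_hat))
```

The check succeeds for any such y and M of the conjugate-patterned form, since conjugating the normal equations merely swaps the two halves of the solution.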
Because the domain of complex random variables is a plane, rather than a line, their obvious application is bivariate. An example is a random variable whose real part is loss and whose imaginary part is LAE. Another application might pertain to copulas. According to Venter [2002, p. 69], “copulas are joint distributions of unit random variables.” One could translate these joint distributions into distributions of complex variables whose support is the complex unit square, i.e., the square whose vertices are the points $z = 0, 1, 1+i, i$. However, for now it seems that real-valued bivariates suffice.
Actuaries who have applied log-linear models to triangles with paid increments have been frustrated
applying them to incurred triangles. The problem is that incurred increments are often negative, and
the logarithm of a negative number is not real-valued. This has led Glenn Meyers [2013] to seek
modified lognormal distributions whose support includes the negative real numbers. The persistent
intractability of the log-linear problem was a major reason for our attention to the lognormal
random vector $w_{n\times 1} = e^{z_{n\times 1}}$ in Section 8. But to model an incurred loss as the exponential function of a complex number suffers from two drawbacks. First, to model a real-valued loss as $e^z = e^x\cdot e^{iy}$ requires y to be an integral multiple of π. The mixed random variable $e^X$ with probability p and $-e^X$ with probability $1-p$ is not lognormal. No more suitable are such “denatured” random variables as $\operatorname{Re}(e^Z)$. Second, one still cannot model the eminently practical value of zero, because for all z, $e^z \neq 0$. 21 At present it does not appear that complex random variables will give birth to useful distributions of real-valued random variables. Even the unit-circle and indicator random variables of Sections 10 and 11, as interesting as they are in the theory of analytic functions, most likely will remain theoretical.
The complex version of the linear model in Section 12 showed us that conjugates of observations
are themselves observations and that conjugates of estimators are estimators of conjugates.
Moreover, there we found a use for augmented variance. Nonetheless we are still fairly bound to
our conclusion to Section 3, that one who lacked either the confidence or the software to work with complex numbers could work instead with double-real representations.
So how can actuarial science benefit from complex random variables? The great benefit will come
from new ways of thinking. The first step will be to overcome the habit of picturing a complex
number as half real and half imaginary. Historically, it was only after numbers had expanded from
rational to irrational that the whole set was called “real.” Numbers ultimately are sets; zero is just
the empty set. How real are sets? Regardless of their mathematical reality, they are not physically
real. Complex numbers were deemed “real” because mathematicians needed them for the solution
of polynomial equations. In the nineteenth century this spurred the development of abstract
algebra. At first new ways of thinking amount to differences in degree; at some point many develop
21 If $e^a = 0$ for some a, then for all z, $e^z = e^{z-a+a} = e^{z-a}e^a = e^{z-a}\cdot 0 = 0$. One who sees that $\lim_{x\to-\infty} e^x\cdot e^{iy} = 0\cdot e^{iy} = 0$ might propose to add the ordinate $\operatorname{Re}(z) = x = -\infty$ to the complex plane. But not only is this proposal artificial; it also militates against the standard theory of complex variables, according to which all points infinitely far from zero constitute one and the same point at infinity.
into differences in kind. One might argue, “Why study Euclidean geometry? It all derives from a
few axioms.” True, but great theorems (e.g., that the sum of the angles of a triangle is the sum of
two right angles) can be a long way from their axioms. A theorem means more than the course of
its proof; often there are many proofs of a theorem. Furthermore, mathematicians often work
backwards from accepted or desired truths to efficient and elegant sets of axioms. Perhaps the
most wonderful thing about mathematics is its “unreasonable effectiveness in the natural sciences,”
to quote physicist Eugene Wigner. The causality between pure and applied mathematics works in
both directions. Therefore, it is likely that complex random variables and vectors will find their way
into actuarial science. But it will take years, even decades, and technology and education will have to lead the way.
14. CONCLUSION
Just as physics divides into different areas, e.g., theoretical, experimental, and applied, so too
actuarial science, though perhaps more concentrated on business application, justifiably has and
needs a theoretical component. Theory and application cross-fertilize each other. In this paper we
have proposed to add complex numbers to the probability and statistics of actuarial theory. With
patience, the technically inclined actuary should be able to understand the theory of complex
random variables delineated herein. In fact, our multivariate approach may be even more difficult to
understand than the complex-function theory; but both belong together. Although all complex
matrices and operations were formed from double-real counterparts, we believe that the “sum is
greater than the parts,” i.e., that the assimilation of this theory will lead to higher-order thinking and
creativity. In the sixteenth century the “fiction” of $i = \sqrt{-1}$ allowed mathematicians to solve more
equations. Although at first complex solutions were deemed “extraneous roots,” eventually their
practicality became recognized; so that today complex numbers are essential for science and
engineering. Applying complex numbers to probability has lagged; but even now it is part of signal
processing in electrical engineering. Knowing how rapidly science has developed with nuclear
physics, molecular biology, space exploration, and computers, who would dare to bet against the
usefulness of complex random variables to actuarial science by the mid-2030s, when many scientists
and futurists expect nuclear fusion to be harnessed and available for commercial purposes?
REFERENCES
[1.] Halliwell, Leigh J., Conjoint Prediction of Paid and Incurred Losses, CAS Forum, Summer 1997, 241-380,
www.casact.org/pubs/forum/97sforum/97sf1241.pdf.
[2.] Halliwell, Leigh J., “Classifying the Tails of Loss Distributions,” CAS E-Forum, Spring 2013, Volume 2,
www.casact.org/pubs/forum/13spforumv2/Haliwell.pdf.
[3.] Havil, Julian, Gamma: Exploring Euler’s Constant, Princeton University Press, 2003.
[4.] Healy, M. J. R., Matrices for Statistics, Oxford, Clarendon Press, 1986.
[5.] Hogg, Robert V., and Stuart A. Klugman, Loss Distributions, New York, Wiley, 1984.
[6.] Johnson, Richard A., and Dean Wichern, Applied Multivariate Statistical Analysis (Third Edition), Englewood
Cliffs, NJ, Prentice Hall, 1992.
[7.] Judge, George G., Hill, R. C., et al., Introduction to the Theory and Practice of Econometrics (Second Edition), New
York, Wiley, 1988.
[8.] Klugman, Stuart A., et al., Loss Models: From Data to Decisions, New York, Wiley, 1998.
[9.] Meyers, Glenn, “The Skew Normal Distribution and Beyond,” Actuarial Review, May 2013, p. 15,
www.casact.org/newsletter/pdfUpload/ar/AR_May2013_1.pdf.
[10.] Million, Elizabeth, “The Hadamard Product,” course notes, 2007.
[11.] Schneider, Hans, and George Barker, Matrices and Linear Algebra, New York, Dover, 1973.
[12.] Veeravalli, V. V., Proper Complex Random Variables and Vectors, 2006, http://courses.engr.
illinois.edu/ece461/handouts/notes6.pdf.
[13.] Venter, G.G., "Credibility," Foundations of Casualty Actuarial Science (Third Edition), Casualty Actuarial Society,
1996.
[14.] Venter, G.G., "Tails of Copulas," PCAS LXXXIX, 2002, 68-113, www.casact.org/pubs/proceed/
proceed02/02068.pdf.
[15.] Weyl, Hermann, The Theory of Groups and Quantum Mechanics, New York, Dover, 1950 (ET by H. P. Robertson
from second German Edition 1930).
[17.] Wikipedia contributors, “Complex normal distribution,” Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/wiki/Complex_normal_distribution (accessed October 2014).
APPENDIX A
Section 8 relies on the moment generating function of the normal multivariate, so we have decided to prepare for it in Appendices A and B. According to Section 7, the probability density function of real-valued n×1 normal random vector x with mean µ and variance Σ is:
\[
f_x(x) = \frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}
\]
Therefore, $\int_{x\in\Re^n} f_x(x)\,dV = 1$. The single integral over $\Re^n$ represents an n-multiple integral over each $x_j$ from $-\infty$ to $+\infty$; $dV = dx_1\cdots dx_n$.
The moment generating function of x is $M_x(t) = E[e^{t'x}] = E\left[e^{\sum_j t_j x_j}\right]$ for a suitable n×1 vector t. 22 Partial derivatives of the moment generating function evaluated at $t = 0_{n\times 1}$ equal moments of x, since:
\[
\left.\frac{\partial^{k_1+\cdots+k_n} M_x(t)}{\partial t_1^{k_1}\cdots\partial t_n^{k_n}}\right|_{t=0}
= \left. E\left[x_1^{k_1}\cdots x_n^{k_n}e^{t'x}\right]\right|_{t=0}
= E\left[x_1^{k_1}\cdots x_n^{k_n}\right]
\]
But lognormal moments are values of the function itself. For example, if $t = e_j$, the jth unit vector, then $M_x(e_j) = E[e^{e_j'x}] = E[e^{X_j}]$. Likewise, $M_x(e_j + e_k) = E[e^{X_j}e^{X_k}]$. The moment generating function of x, if it exists, is the key to the moments of $e^x$.
22 All real-valued t vectors are suitable; Appendix B will extend the suitability to complex t.
\[
\begin{aligned}
M_x(t) = E[e^{t'x}]
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}\,e^{t'x}\,dV \\
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}\left\{(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x\right\}}\,dV
\end{aligned}
\]
The exponent can be rearranged by completing the square:
\[
(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x = (x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t
\]
We leave it for the reader to verify. By substitution, we have:
\[
\begin{aligned}
M_x(t)
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}\left\{(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x\right\}}\,dV \\
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}\left\{(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t\right\}}\,dV \\
&= \int_{x\in\Re^n}\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\,e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}\,dV\cdot e^{t'\mu + t'\Sigma t/2} \\
&= 1\cdot e^{t'\mu + t'\Sigma t/2} = e^{t'\mu + t'\Sigma t/2}
\end{aligned}
\]
The reduction of the integral to unity in the second last line is due to the fact that $\frac{1}{\sqrt{(2\pi)^n|\Sigma|}}e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}$ is the probability density function of the real-valued n×1 normal random vector with mean $\mu + \Sigma t$ and variance Σ. This new mean is valid if it is real-valued, which is true for $t \in \Re^n$. The first two derivatives confirm the mean and the variance of x:
\[
\frac{\partial M_x(t)}{\partial t} = (\mu + \Sigma t)\,e^{t'\mu + t'\Sigma t/2}
\;\Rightarrow\;
E[x] = \left.\frac{\partial M_x(t)}{\partial t}\right|_{t=0} = \mu
\]
\[
\frac{\partial^2 M_x(t)}{\partial t\,\partial t'}
= \frac{\partial(\mu + \Sigma t)\,e^{t'\mu + t'\Sigma t/2}}{\partial t'}
= \left[\Sigma + (\mu + \Sigma t)(\mu + \Sigma t)'\right]e^{t'\mu + t'\Sigma t/2}
\]
\[
\Rightarrow\; E[xx'] = \left.\frac{\partial^2 M_x(t)}{\partial t\,\partial t'}\right|_{t=0} = \Sigma + \mu\mu'
\;\Rightarrow\; \operatorname{Var}[x] = E[xx'] - \mu\mu' = \Sigma
\]
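The moment generating function and its first two moments admit a quick Monte Carlo check (parameters and sample size are arbitrary illustrative choices):

```python
import numpy as np

mu = np.array([0.1, -0.2])
Sigma = np.array([[0.5, 0.2],
                  [0.2, 0.4]])

rng = np.random.default_rng(6)
x = rng.multivariate_normal(mu, Sigma, size=1_000_000)

t = np.array([0.3, -0.5])
mgf_mc = np.mean(np.exp(x @ t))                  # Monte Carlo E[e^{t'x}]
mgf_cf = np.exp(t @ mu + t @ Sigma @ t/2)        # closed form e^{t'mu + t'Sigma t/2}
assert abs(mgf_mc - mgf_cf) < 0.01

assert np.allclose(x.mean(axis=0), mu, atol=0.01)   # E[x] = mu
assert np.allclose(np.cov(x.T), Sigma, atol=0.01)   # Var[x] = Sigma
```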
\[
E[e^{X_j}] = E[e^{e_j'x}] = M_x(e_j) = e^{e_j'\mu + e_j'\Sigma e_j/2} = e^{\mu_j + \Sigma_{jj}/2}
\]
\[
\begin{aligned}
E[e^{X_j}e^{X_k}] = M_x(e_j + e_k)
&= e^{(e_j+e_k)'\mu + (e_j+e_k)'\Sigma(e_j+e_k)/2} \\
&= e^{\mu_j + \mu_k + (\Sigma_{jj} + \Sigma_{jk} + \Sigma_{kj} + \Sigma_{kk})/2} \\
&= e^{\mu_j + \Sigma_{jj}/2}\cdot e^{\mu_k + \Sigma_{kk}/2}\cdot e^{(\Sigma_{jk} + \Sigma_{kj})/2} \\
&= e^{\mu_j + \Sigma_{jj}/2}\cdot e^{\mu_k + \Sigma_{kk}/2}\cdot e^{(\Sigma_{jk} + \Sigma_{jk})/2} \\
&= E[e^{X_j}]\,E[e^{X_k}]\cdot e^{\Sigma_{jk}}
\end{aligned}
\]
The fourth line follows from the symmetry of Σ, i.e., $\Sigma_{kj} = \Sigma_{jk}$.
23 The vector formulation of partial differentiation is explained in Appendix A.17 of Judge [1988].
So, $\operatorname{Cov}[e^{X_j}, e^{X_k}] = E[e^{X_j}e^{X_k}] - E[e^{X_j}]E[e^{X_k}] = E[e^{X_j}]E[e^{X_k}]\left(e^{\Sigma_{jk}} - 1\right)$, which is the multivariate generalization of the familiar lognormal covariance. In matrix form
\[
\operatorname{Var}[e^x] = \operatorname{diag}\left(E[e^x]\right)\left\{e^{\Sigma} - 1_{n\times n}\right\}\operatorname{diag}\left(E[e^x]\right).
\]
Because $\operatorname{diag}(E[e^x])$ is diagonal in positive elements (hence, symmetric and PD), $\operatorname{Var}[e^x]$ is NND [or PD] if and only if $e^{\Sigma} - 1_{n\times n}$ is NND [or PD]. 25 Symmetry is no issue here, because for real-valued matrices, Σ is symmetric if and only if $e^{\Sigma} - 1_{n\times n}$ is symmetric.
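These closed-form lognormal moments are also easy to corroborate numerically; the sketch below checks both the covariance formula and the matrix form of $\operatorname{Var}[e^x]$ (parameters are arbitrary choices):

```python
import numpy as np

mu = np.array([0.2, -0.1])
Sigma = np.array([[0.30, 0.12],
                  [0.12, 0.50]])

m = np.exp(mu + np.diag(Sigma)/2)                  # E[e^{X_j}] = e^{mu_j + Sigma_jj/2}
V = np.diag(m) @ (np.exp(Sigma) - 1) @ np.diag(m)  # Var[e^x]; np.exp is elementwise

# The (j,k) entry matches the covariance formula E E (e^{Sigma_jk} - 1)
assert np.isclose(V[0, 1], m[0]*m[1]*(np.exp(Sigma[0, 1]) - 1))

# Monte Carlo corroboration
rng = np.random.default_rng(7)
w = np.exp(rng.multivariate_normal(mu, Sigma, size=1_000_000))
assert np.allclose(w.mean(axis=0), m, atol=0.01)
assert np.allclose(np.cov(w.T), V, atol=0.02)
```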
The relation between Σ and $\mathrm{T} = e^{\Sigma} - 1_{n\times n}$ merits a discussion whose result will be clear from a consideration of 2×2 matrices. Since $\Sigma_{2\times 2}$ is symmetric, it is defined in terms of three real numbers: $\Sigma = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$. Now Σ is NND if and only if 1) $a \ge 0$, 2) $d \ge 0$, and 3) $ad - b^2 \ge 0$. Σ is PD if and only if these three conditions are strictly greater than zero. If a or d is zero, by the third condition b also must be zero. 26 Since we are not interested in degenerate random variables, which are effectively constants, we will require a and d to be positive. With this requirement, Σ is NND if and only if $b^2 \le ad$, and PD if and only if $b^2 < ad$. Since a and d are positive, so too is ad, as well as its square root $\gamma = \sqrt{ad}$; hence Σ is NND if and only if $-\gamma \le b \le \gamma$, and PD if and only if $-\gamma < b < \gamma$. It is well-known that $\min(a, d) \le \gamma \le \frac{a+d}{2} \le \max(a, d)$ with equality if and only if $a = d$.
Now the same three conditions determine the definiteness of $\mathrm{T} = e^{\Sigma} - 1_{2\times 2} = \begin{bmatrix} e^a - 1 & e^b - 1 \\ e^b - 1 & e^d - 1 \end{bmatrix}$. Since we required a and d to be positive, both $e^a - 1$ and $e^d - 1$ are positive. This leaves the definiteness of T dependent on the relation between $(e^b - 1)(e^b - 1)$ and $(e^a - 1)(e^d - 1)$. We will next examine this relation according to the three cases $b = 0$, $b > 0$, and $b < 0$, all of which must be subject to $-\gamma \le b \le \gamma$.
First, if $b = 0$, then $-\gamma < b < \gamma$ and Σ is PD. Furthermore, $(e^b - 1)(e^b - 1) = 0 < (e^a - 1)(e^d - 1)$. Therefore, in this case, the lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{2\times 2}$ is from PD to PD. And zero covariance in the normal pair produces zero covariance in the lognormal pair. In fact, since zero covariance between normal bivariates implies independence (cf. §2.5.7 of Judge [1988]), the lognormal pair is likewise independent.
In the second case, $b > 0$, or more fully, $0 < b \le \gamma$. Σ is PD if and only if $b < \gamma$. Define the function $\varphi(x) = \ln\left((e^x - 1)/x\right)$ for positive real x (or $x \in \Re^+$). A graph will show that the function strictly increases, i.e., $\varphi(x_1) < \varphi(x_2)$ if and only if $x_1 < x_2$. Moreover, the function is concave upward. This means that the line segment between points $(x_1, \varphi(x_1))$ and $(x_2, \varphi(x_2))$ lies above the curve $\varphi(x)$ for intermediate values of x. In particular, $\frac{\varphi(x_1) + \varphi(x_2)}{2} \ge \varphi\!\left(\frac{x_1 + x_2}{2}\right)$. Equivalently, for all $x_1, x_2 \in \Re^+$, $2\varphi\!\left(\frac{x_1 + x_2}{2}\right) \le \varphi(x_1) + \varphi(x_2)$ with equality if and only if $x_1 = x_2$. Therefore, since a and d are positive, $2\varphi\!\left(\frac{a+d}{2}\right) \le \varphi(a) + \varphi(d)$. And since $0 < \gamma = \sqrt{ad} \le \frac{a+d}{2}$, $2\varphi(\gamma) \le 2\varphi\!\left(\frac{a+d}{2}\right) \le \varphi(a) + \varphi(d)$. So $2\varphi(\gamma) \le \varphi(a) + \varphi(d)$ with equality if and only if $a = d$.
Because φ strictly increases and $0 < b \le \gamma$:
\[
2\ln\frac{e^b - 1}{b} = 2\varphi(b) \le \varphi(a) + \varphi(d) = \ln\frac{e^a - 1}{a} + \ln\frac{e^d - 1}{d}
\]
\[
\left(\frac{e^b - 1}{b}\right)^2 \le \frac{e^a - 1}{a}\cdot\frac{e^d - 1}{d}
\]
\[
(e^b - 1)^2 = b^2\left(\frac{e^b - 1}{b}\right)^2
\le \frac{b^2}{ad}(e^a - 1)(e^d - 1)
= \left(\frac{b}{\gamma}\right)^2(e^a - 1)(e^d - 1)
\le 1\cdot(e^a - 1)(e^d - 1)
\]
Therefore, in this case e b − 1 ≤ e a − 1 e d − 1 with equality if and only if a = b = d . This means
that if b > 0 , the lognormal transformation Σ → Τ = e Σ − 12×2 is from NND to NND. But Τ is
( ) (
2
)( )
NND only if e b − 1 = e a − 1 e d − 1 , or only if a = b = d . Otherwise, Τ is PD. So, when
b > 0 , Τ = e Σ − 12×2 is PD except when all four elements of Σ have the same positive value. Even
a ad
the NND matrix Σ = log-transforms into a PD matrix, unless a = d . So all PD and
ad d
most NND normal variances transform into PD lognormal variances. A NND lognormal variance
indicates that at least one element of the normal random vector is duplicated.
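A numeric eigenvalue check illustrates these cases (the entries below are arbitrary): when all four elements of Σ are the same positive number, T keeps a zero eigenvalue and is merely NND; a singular NND Σ with $a \neq d$ nevertheless log-transforms into a PD matrix.

```python
import numpy as np

def T(Sig):
    return np.exp(Sig) - 1            # elementwise (Hadamard) exponential minus ones

# All four elements equal: Sigma is NND, and T retains a zero eigenvalue (merely NND)
c = 0.7
Sig_equal = np.full((2, 2), c)
assert abs(min(np.linalg.eigvalsh(T(Sig_equal)))) < 1e-12

# Singular (NND) Sigma with a != d: T becomes strictly PD
a, d = 0.5, 0.8
b = np.sqrt(a*d)
Sig_sing = np.array([[a, b], [b, d]])
assert abs(np.linalg.det(Sig_sing)) < 1e-12    # Sigma itself is merely NND
assert min(np.linalg.eigvalsh(T(Sig_sing))) > 0
```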
In the third and final case, $b < 0$, or more fully, $-\gamma \le b < 0$. This is equivalent to $0 < -b \le \gamma$, or to the second case with $-b$. In that case, $(e^{-b} - 1)^2 \le (e^a - 1)(e^d - 1)$ with equality if and only if $a = -b = d$. But from this, as well as from the fact that $0 < e^{2b} < e^{2\cdot 0} = 1$, it follows:
\[
(e^b - 1)^2 = e^{2b}(1 - e^{-b})^2 = e^{2b}(e^{-b} - 1)^2
\le e^{2b}(e^a - 1)(e^d - 1)
< 1\cdot(e^a - 1)(e^d - 1)
\]
So in this case the inequality is strict: $(e^b - 1)^2 < (e^a - 1)(e^d - 1)$, and T is PD. Therefore, if $b < 0$, the lognormal transformation is from NND to PD. Hence, even when Σ is not PD, but merely NND, T is almost always PD. Only when Σ is so NND as to conceal a duplicated element will T fail to be PD.
The Hadamard (elementwise) product and Schur’s product theorem allow for an understanding of the general lognormal transformation $\Sigma \to \mathrm{T} = e^{\Sigma} - 1_{n\times n}$. Denoting the elementwise jth power of Σ as $\Sigma^{\circ j} = \underbrace{\Sigma\circ\cdots\circ\Sigma}_{j\text{ factors}}$, we can express elementwise exponentiation as $e^{\Sigma} = \sum_{j=0}^{\infty}\Sigma^{\circ j}/j!$. So $\mathrm{T} = e^{\Sigma} - 1_{n\times n} = \sum_{j=1}^{\infty}\Sigma^{\circ j}/j!$. According to Schur’s theorem (§3 of Million [2007]), the Hadamard product of two NND matrices is NND. 27 Since Σ is NND, its powers $\Sigma^{\circ j}$ are NND, as well as the terms $\Sigma^{\circ j}/j!$. Being the sum of a countable number of NND matrices, T also must be NND. 28 But if just one of the terms of the sum is PD, the sum itself must be PD. Therefore, if Σ is PD, so too is T, since the first term $\Sigma^{\circ 1}/1! = \Sigma$ is PD.
Now the kernel of an m×n matrix A is the set of all x ∈ ℜⁿ such that Ax = 0ₘₓ₁, or ker(A) = {x ∈ ℜⁿ : Ax = 0ₘₓ₁}.
By the Cholesky decomposition the NND matrix U can be factored as Uₙₓₙ = W′W. The quadratic form in U is x′Ux = x′W′Wx = (Wx)′(Wx). If x′Ux = 0, then Wx = 0ₙₓ₁, and Ux = W′Wx = W′0ₙₓ₁ = 0ₙₓ₁. Conversely, if Ux = 0ₙₓ₁, then x′Ux = 0. So the kernel of NND matrix U is precisely the solution set of x′Ux = 0, i.e., x′Ux = 0 if and only if x ∈ ker(U).
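This equivalence between the kernel and the zero set of the quadratic form is easy to illustrate numerically (a sketch of ours; the particular rank-deficient U below is an arbitrary assumption for the example):

```python
# Illustration (ours): for NND U = W'W, x'Ux = 0 exactly when Ux = 0,
# so the kernel of U is the zero set of its quadratic form.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 4))          # 2x4, rank 2
U = W.T @ W                          # 4x4 NND with a 2-dimensional kernel

# A kernel vector: the last right-singular vector of W lies in ker(W).
x_ker = np.linalg.svd(W)[2][-1]
print(np.allclose(U @ x_ker, 0), np.isclose(x_ker @ U @ x_ker, 0))

# A generic vector is not in the kernel and has a positive quadratic form.
x = rng.normal(size=4)
print(x @ U @ x > 0)
```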
Therefore, the kernel of Τ = Σ_{j=1}^∞ Σ^{∘j}/j! is the intersection of the kernels of the Σ^{∘j}, or
$$\ker(\mathrm{T}) = \bigcap_{j=1}^{\infty} \ker\!\left(\Sigma^{\circ j}\right).$$
It is possible for this intersection to be of dimension zero, i.e., for it to equal {0ₙₓ₁}, even though the kernel of no Σ^{∘j} is. Because the intersections accumulate as the partial sums Σ_{j=1}^{k} Σ^{∘j}/j! grow with k, the lognormal transformation of NND matrix Σ tends to be “more PD” than Σ itself.
We showed above that if Σ is PD, then Τ = e^Σ − 1ₙₓₙ is PD. But even if Σ is NND, Τ will be PD, unless Σ contains a 2×2 subvariance
$$\begin{bmatrix} \Sigma_{jj} & \Sigma_{jk} \\ \Sigma_{kj} & \Sigma_{kk} \end{bmatrix}$$
whose four elements are all equal.
APPENDIX B
The moment generating function of x ~ N(μ, Σ), viz., M_x(t) = e^{t′μ + t′Σt/2}, is valid at least for t ∈ ℜⁿ. The validity rests on the identity
$$\int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}\left(x - [\mu + \Sigma t]\right)'\, \Sigma^{-1} \left(x - [\mu + \Sigma t]\right)}\, dV = 1$$
for real-valued ξ = μ + Σt. But in Section 8 we must know the value of
$$\varphi(\xi) = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}(x-\xi)'\,\Sigma^{-1}(x-\xi)}\, dV$$
when ξ is a complex n×1 vector. So in this appendix we will prove that φ(ξ) = 1 even then.
The proof begins with diagonalization. Since Σ is symmetric and PD, Σ⁻¹ exists and is symmetric and PD. According to the Cholesky decomposition (Healy [1986, §7.2]), there exists a real-valued n×n matrix W such that W′W = Σ⁻¹. Due to theorems on matrix rank, W must be non-singular, or invertible. Change variables according to y = Wx and ζ = Wξ; the Jacobian of the transformation gives
$$dV_y = \lvert W\rvert\, dV_x = \sqrt{\lvert W\rvert\,\lvert W\rvert}\; dV_x = \sqrt{\lvert W'\rvert\,\lvert W\rvert}\; dV_x = \sqrt{\lvert W'W\rvert}\; dV_x = \sqrt{\lvert \Sigma^{-1}\rvert}\; dV_x = \frac{dV_x}{\sqrt{\lvert\Sigma\rvert}}.$$
Hence:
$$\begin{aligned}
\varphi(\xi) &= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}(x-\xi)'\,\Sigma^{-1}(x-\xi)}\, dV_x \\
&= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}(x-\xi)'\,W'W(x-\xi)}\, dV_x \\
&= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}\left[W(x-\xi)\right]'\left[W(x-\xi)\right]}\, dV_x \\
&= \int_{y\in\Re^n} \frac{1}{\sqrt{(2\pi)^n}}\; e^{-\frac{1}{2}(y-\zeta)'(y-\zeta)}\, dV_y \\
&= \int_{y\in\Re^n} \frac{1}{\sqrt{(2\pi)^n}}\; e^{-\frac{1}{2}\sum_{j=1}^{n}\left(y_j-\zeta_j\right)^2}\, dV \\
&= \int_{y\in\Re^n} \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(y_j-\zeta_j\right)^2}\, dV \\
&= \prod_{j=1}^{n} \int_{y_j=-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(y_j-\zeta_j\right)^2}\, dy_j \\
&= \prod_{j=1}^{n} \psi(\zeta_j)
\end{aligned}$$
In the last line
$$\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}(x-\zeta)^2}\, dx.$$
Obviously, if ζ is real-valued, ψ(ζ) = 1. So the issue of the value of a moment generating function of a complex variable resolves into the issue of the “total probability” of a unit-variance normal random variable with a complex mean.29
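Before turning to the contour argument, the claim can already be verified numerically (our own check; the integration grid and the sample values of ζ are arbitrary assumptions):

```python
# Numerical check (ours): the unit-variance normal density with a complex
# mean zeta still integrates to 1 along the real axis.
import numpy as np

def psi(zeta, a=-30.0, b=30.0, n=600001):
    """Trapezoid approximation of int_a^b exp(-(x - zeta)^2/2)/sqrt(2 pi) dx."""
    x = np.linspace(a, b, n)
    f = np.exp(-0.5 * (x - zeta) ** 2) / np.sqrt(2.0 * np.pi)
    dx = x[1] - x[0]
    return (f.sum() - 0.5 * (f[0] + f[-1])) * dx   # trapezoid rule

print(psi(0.7))           # real mean: ordinary total probability
print(psi(1.0 + 0.5j))    # complex "mean": still close to 1 + 0j
```

Along the way the integrand takes complex values, yet the imaginary contributions cancel and the “total probability” remains one.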
To evaluate ψ(ζ) requires some complex analysis with contour integrals. First, consider the standard-normal density function with a complex argument: $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$. By function-composition rules, since both z² and the exponential function are “entire” functions (i.e., analytic over the whole complex plane), so too is f(z). Therefore, by Cauchy’s theorem $\oint_C f(z)\,dz = 0$ for any closed contour C (cf. Appendix D.7 of Havil [2003]). Let C be the parallelogram traced from vertex z = b to vertex z = a, thence to z = a − ζ, thence to z = b − ζ, and back to z = b. Then:
$$0 = \oint_C f(z)\,dz = \int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz$$
The line segments along which the second and fourth integrals proceed are finite; their common length is L = |ζ|. By the standard estimate, each such integral is bounded in absolute value by the maximum of |f(z)| along its segment times L; for the second and fourth integrals call these bounds M(a)·L and M(b)·L.

29 We deliberately put ‘total probability’ in quotes because the probability density function with complex ζ is not proper; it may produce negative and even complex values for probability densities.
Now, in general:
$$\begin{aligned}
\lvert f(z)\rvert &= \sqrt{f(z)\,\overline{f(z)}} \\
&= \sqrt{\frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{\bar z^2}{2}}} \\
&= \frac{\sqrt{e^{-\frac{z^2}{2}} \cdot e^{-\frac{\bar z^2}{2}}}}{\sqrt{2\pi}} \\
&\propto \sqrt{e^{-\frac{z^2}{2}} \cdot e^{-\frac{\bar z^2}{2}}} \\
&= \sqrt{e^{-\frac{z^2 + \bar z^2}{2}}} \\
&\propto e^{-\operatorname{Re}(z)^2}
\end{aligned}$$
In the last step z² + z̄² = 2 Re(z)² − 2 Im(z)², and on the finite segments in question Im(z) is bounded by |ζ|; so, up to bounded factors and an immaterial constant in the exponent, |f(z)| decays like e^{−Re(z)²}.
Therefore, since a ∈ ℜ:
$$\lim_{a\to-\infty} M(a)\cdot L \;\propto\; \lim_{a^2\to+\infty}\, \sup\left\{ e^{-a^2 \operatorname{Re}\left(z/a\right)^2} : \frac{z}{a} \in \left[1,\ 1-\frac{\zeta}{a}\right] \right\}\cdot L \;\propto\; e^{-\infty\cdot\operatorname{Re}(1)^2}\cdot L \;\propto\; 0$$
(as a → −∞, every point of the segment satisfies z/a → 1, so Re(z/a)² → 1).
Similarly, $\lim_{b\to+\infty} M(b)\cdot L \propto 0$. So in the limit as a → −∞ and b → +∞ on the real axis, the second and fourth integrals vanish:
$$0 = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \{0\} = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \left[ \int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz \right]$$
And so:
$$\lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{a-\zeta}^{b-\zeta} f(z)\,dz = -\lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_b^a f(z)\,dz = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_a^b f(z)\,dz = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz = 1$$
So, at length:
$$\begin{aligned}
\psi(\zeta) &= \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}(x-\zeta)^2}\, dx \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}\left((z+\zeta)-\zeta\right)^2}\, d(z+\zeta) \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{z=a-\zeta}^{b-\zeta} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}z^2}\, d(z+\zeta) \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{z=a-\zeta}^{b-\zeta} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}z^2}\, dz \\
&= \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{z=a-\zeta}^{b-\zeta} f(z)\,dz \\
&= 1
\end{aligned}$$
So, working backwards, what we proved for one dimension, viz.,
$$\psi(\zeta) = \int_{x=-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}(x-\zeta)^2}\, dx = 1,$$
applies n-dimensionally: for all ξ ∈ Cⁿ,
$$\varphi(\xi) = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n \lvert\Sigma\rvert}}\; e^{-\frac{1}{2}(x-\xi)'\,\Sigma^{-1}(x-\xi)}\, dV = 1.$$
Therefore, even for complex t, M_x(t) = e^{t′μ + t′Σt/2}. Complex values are allowable as arguments in the moment generating function of a normal random multivariate.
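This conclusion can be spot-checked by simulation (our sketch; the particular μ, Σ, t, and sample size are arbitrary assumptions):

```python
# Monte-Carlo check (ours): for x ~ N(mu, Sigma), the sample mean of e^{t'x}
# approaches e^{t'mu + t'Sigma t/2} even when t is complex.
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.5, -1.0])
A = np.array([[1.0, 0.0], [0.4, 0.8]])
Sigma = A @ A.T                            # a PD variance matrix

t = np.array([0.3 + 0.2j, -0.1 + 0.5j])    # a complex argument

x = rng.multivariate_normal(mu, Sigma, size=1_000_000)
mgf_mc = np.mean(np.exp(x @ t))            # simulated E[e^{t'x}]
mgf_formula = np.exp(t @ mu + 0.5 * t @ Sigma @ t)
print(mgf_mc, mgf_formula)
```

The simulated expectation matches the closed-form value in both its real and imaginary parts, as the analytic continuation argument guarantees.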
Though we believe the contour-integral proof above to be worthwhile for its instructional value, a simpler proof comes from the powerful theorem of analytic continuation (cf. Appendix D.12 of Havil [2003]). This theorem concerns two functions that are analytic within a common domain. If the functions are equal over any smooth curve within the domain, no matter how short,30 then they are equal over all the domain. Now
$$\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2}(x-\zeta)^2}\, dx$$
is analytic over all the complex plane. And for all real-valued ζ, ψ(ζ) = 1. So ψ(ζ) and f(ζ) = 1 are two functions analytic over the complex plane and identical on the real axis. Therefore, by analytic continuation ψ(ζ) must equal one for all complex ζ. Analytic continuation is analogous to the theorem in real analysis that two real-analytic functions equal over any interval are equal everywhere. Analytic continuation derives from the fact that a complex derivative is the same in all directions. It is mistaken to regard the real and imaginary parts of the derivative as partial derivatives, as if they applied respectively to the real and imaginary axes of the independent variable. Rather, the whole derivative applies in every direction.

30 The length of the curve must be positive; equality at single points, or punctuated equality, does not qualify.
APPENDIX C
Schur’s product theorem states that the Hadamard (elementwise) product of two non-negative definite (NND) matrices is NND. Million [2007] proves it as Theorem 3.4; however, we believe our proof, which rests on the spectral decomposition developed below, to be instructive in its own right.
Let Γ be an n×n Hermitian NND matrix. As explained in Section 4, ‘Hermitian’ means that Γ = Γ*; ‘NND’ means that for every complex n×1 vector z (or z ∈ Cⁿ), z*Γz ≥ 0. Positive definiteness (PD) requires z*Γz > 0 for every non-zero z. An eigenvalue λ and its paired eigenvector ν satisfy Γν = λν, or (Γ − λIₙ)ν = 0ₙₓ₁. Since ν = 0ₙₓ₁ is excluded as a trivial solution, vector ν can be scaled to unity, or ν*ν = 1. But for a non-trivial solution to exist, the determinant of Γ − λIₙ must be zero. Since the determinant is an nth-degree equation (with complex coefficients based on the elements of Γ) in λ, it has n root values of λ, not necessarily distinct. So the determinant is zero for n values of λ, and for each such λ the singular matrix Γ − λIₙ admits non-zero solutions to (Γ − λIₙ)ν = 0ₙₓ₁. So for every eigenvalue there is a non-zero eigenvector, even a unit one.
The first important result is that the eigenvalues of Γ are real-valued and non-negative. Consider the quotient λⱼ = νⱼ*Γνⱼ / νⱼ*νⱼ. Because Γ is Hermitian and NND, the numerator νⱼ*Γνⱼ is real-valued and non-negative; because νⱼ is non-zero, the denominator νⱼ*νⱼ is real-valued and positive. Therefore, their quotient λⱼ is a real-valued and non-negative scalar.
The second important result is that eigenvectors paired with unequal eigenvalues are orthogonal. Let the two unequal eigenvalues be λⱼ ≠ λₖ. Because the eigenvalues are real-valued, λ̄ⱼ = λⱼ. Then:
$$\begin{aligned}
(\lambda_j - \lambda_k)\,\nu_j^*\nu_k &= \lambda_j\, \nu_j^*\nu_k - \lambda_k\, \nu_j^*\nu_k \\
&= \bar\lambda_j\, \nu_j^*\nu_k - \lambda_k\, \nu_j^*\nu_k \\
&= \left(\lambda_j\, \nu_k^*\nu_j\right)^* - \lambda_k\, \nu_j^*\nu_k \\
&= \left(\nu_k^*\Gamma\nu_j\right)^* - \nu_j^*\Gamma\nu_k \\
&= \nu_j^*\Gamma^*\nu_k - \nu_j^*\Gamma\nu_k \\
&= \nu_j^*\Gamma\nu_k - \nu_j^*\Gamma\nu_k \\
&= 0
\end{aligned}$$
Since λⱼ − λₖ ≠ 0, it must be that νⱼ*νₖ = 0. If the n eigenvalues are distinct, the eigenvectors form an orthogonal basis of Cⁿ. But even if not, the kernel of each eigenvalue, or ker(Γ − λⱼIₙ) = {z ∈ Cⁿ : Γz = λⱼz}, is a linear subspace of Cⁿ whose rank or dimension equals the multiplicity of λⱼ as a root of the characteristic equation. This means that the number of mutually orthogonal eigenvectors paired with an eigenvalue equals how many times that eigenvalue is a root of its characteristic equation. Consequently, there exist n mutually orthogonal unit eigenvectors ν₁, …, νₙ, paired with the eigenvalues λ₁, …, λₙ.
Now, define W as the partitioned matrix Wₙₓₙ = [ν₁ ⋯ νₙ]. The jkth element of W*W equals νⱼ*νₖ, which is one when j = k and zero otherwise; hence W*W = Iₙ, and W* is the inverse of W.32 Furthermore, define Λ as the n×n diagonal matrix whose jjth element is λⱼ. Then:
$$\Gamma W = \Gamma\left[\nu_1 \ \cdots \ \nu_n\right] = \left[\Gamma\nu_1 \ \cdots \ \Gamma\nu_n\right] = \left[\lambda_1\nu_1 \ \cdots \ \lambda_n\nu_n\right] = \left[\nu_1 \ \cdots \ \nu_n\right]\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} = W\Lambda$$
And so, Γ = ΓIₙ = ΓWW* = WΛW*, and Γ is said to be “diagonalized.” Thus have we shown, assuming the theory of equations,33 the third important result, viz., that every NND Hermitian matrix can be diagonalized. Other matrices can be diagonalized; the NND [or PD] property consists in the fact that all the eigenvalues of this diagonalization are non-negative [or positive].
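Numerical linear-algebra routines realize exactly this diagonalization, which allows a quick confirmation of the first three results (our illustration; the random Γ is an assumption):

```python
# Check (ours): for a Hermitian NND Gamma, numpy's eigh returns real
# non-negative eigenvalues and a unitary W with Gamma = W Lambda W*.
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Gamma = B @ B.conj().T                     # Hermitian and NND by construction

lam, W = np.linalg.eigh(Gamma)             # Gamma W = W Lambda
print(np.all(lam > -1e-10))                                  # eigenvalues real, non-negative
print(np.allclose(W.conj().T @ W, np.eye(4)))                # W*W = I: W unitary
print(np.allclose(W @ np.diag(lam) @ W.conj().T, Gamma))     # Gamma = W Lambda W*

# Spectral decomposition: Gamma = sum_j lam_j v_j v_j*
spectral = sum(lam[j] * np.outer(W[:, j], W[:, j].conj()) for j in range(4))
print(np.allclose(spectral, Gamma))
```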
The fourth and final “eigen” result relies on the identity W*νⱼ = eⱼ, which just extracts the jth column of the identity matrix, i.e., the jth unit vector. Since the diagonal matrix Λ can be written as Λ = Σ_{j=1}^n λⱼ eⱼeⱼ*:
$$\begin{aligned}
\Gamma &= W\Lambda W^* \\
&= W\left(\sum_{j=1}^{n} \lambda_j\, e_j e_j^*\right) W^* \\
&= W \sum_{j=1}^{n} \lambda_j \left(W^*\nu_j\right)\left(W^*\nu_j\right)^* W^* \\
&= W \sum_{j=1}^{n} \lambda_j\, W^*\nu_j \nu_j^* W\, W^* \\
&= WW^* \sum_{j=1}^{n} \lambda_j\, \nu_j \nu_j^*\; WW^* \\
&= I_n \sum_{j=1}^{n} \lambda_j\, \nu_j \nu_j^*\; I_n \\
&= \sum_{j=1}^{n} \lambda_j\, \nu_j \nu_j^*
\end{aligned}$$
The form $\sum_{j=1}^{n} \lambda_j \nu_j \nu_j^*$ is called the “spectral decomposition” of Γ (§7.4 of Healy [1986]), which plays the leading role in the following succinct proof of Schur’s product theorem.
If Σ and Τ are two n×n Hermitian NND matrices, we may spectrally decompose them as $\Sigma = \sum_{j=1}^{n} \lambda_j \nu_j \nu_j^*$ and $\mathrm{T} = \sum_{j=1}^{n} \kappa_j \eta_j \eta_j^*$, where all the λ and κ scalars are non-negative. Accordingly:
$$\begin{aligned}
(\Sigma\circ\mathrm{T})_{jk} &= (\Sigma)_{jk}\,(\mathrm{T})_{jk} \\
&= \left(\sum_{r=1}^{n} \lambda_r\, \nu_r \nu_r^*\right)_{jk} \left(\sum_{s=1}^{n} \kappa_s\, \eta_s \eta_s^*\right)_{jk} \\
&= \sum_{r=1}^{n} \lambda_r\, (\nu_r)_j \overline{(\nu_r)_k}\; \sum_{s=1}^{n} \kappa_s\, (\eta_s)_j \overline{(\eta_s)_k} \\
&= \sum_{r=1}^{n}\sum_{s=1}^{n} \lambda_r \kappa_s\, (\nu_r)_j (\eta_s)_j\; \overline{(\nu_r)_k (\eta_s)_k} \\
&= \sum_{r=1}^{n}\sum_{s=1}^{n} \lambda_r \kappa_s\, (\nu_r\circ\eta_s)_j\; \overline{(\nu_r\circ\eta_s)_k} \\
&= \sum_{r=1}^{n}\sum_{s=1}^{n} \lambda_r \kappa_s \left[(\nu_r\circ\eta_s)(\nu_r\circ\eta_s)^*\right]_{jk} \\
&= \left[\sum_{r=1}^{n}\sum_{s=1}^{n} \lambda_r \kappa_s\, (\nu_r\circ\eta_s)(\nu_r\circ\eta_s)^*\right]_{jk}
\end{aligned}$$
Hence, $\Sigma\circ\mathrm{T} = \sum_{r=1}^{n}\sum_{s=1}^{n} \lambda_r \kappa_s (\nu_r\circ\eta_s)(\nu_r\circ\eta_s)^*$. Since each matrix $(\nu_r\circ\eta_s)(\nu_r\circ\eta_s)^*$ is Hermitian NND, and each scalar λᵣκₛ is non-negative, Σ∘Τ must be Hermitian NND. Therefore, the Hadamard product of two NND matrices is NND, as Schur’s product theorem claims.
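A quick empirical confirmation of the theorem (our sketch; the random matrices are assumptions):

```python
# Empirical check (ours): the Hadamard product of two Hermitian NND matrices
# is again Hermitian NND (all eigenvalues non-negative).
import numpy as np

rng = np.random.default_rng(5)

def random_hermitian_nnd(n):
    B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return B @ B.conj().T

all_nnd = True
for _ in range(200):
    S = random_hermitian_nnd(5)
    T = random_hermitian_nnd(5)
    hadamard = S * T                       # elementwise (Hadamard) product
    all_nnd = all_nnd and np.linalg.eigvalsh(hadamard).min() >= -1e-8
print(all_nnd)
```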