Business Statistics

Tommaso Proietti
DEF - Università di Roma Tor Vergata

Tools for data analysis

Vectors

A vector is a collection of numbers (scalars) ordered by column (or row):

    a = [a_1, ..., a_n]' ,   e.g.   a = [5, 2]' ,   a = [1, 3]'

From a geometric perspective, a is a point in an n-dimensional space
(a subset of R^n), with coordinates provided by the elements a_i.

The symbol a' ("a transpose") denotes a row vector:

    a' = [a_1, a_2, ..., a_n]

Special cases:
- 0 (zero vector);
- e_i = [0, ..., 0, 1, 0, ..., 0]' (unit vector, with 1 in the i-th position);
- i = [1, 1, ..., 1]' (vector of ones).
Basic operations:

- Multiplication by a scalar.
  Let λ denote a scalar; λa is the vector with elements {λa_i}.
  [Geometric interpretation: a point along the same direction as a.]

- Sum of two vectors.
  Let a and b be two vectors of the same size n; their sum
  c = a + b is the vector with elements c_i = a_i + b_i. [Geometric
  interpretation: a new point in R^n obtained by the parallelogram rule.]

- Linear combination.
  Let a and b denote two vectors and let λ_1 and λ_2 be two
  coefficients:

      λ_1 a + λ_2 b

  is their linear combination.
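These operations can be sketched numerically with NumPy; the vectors and coefficients below are illustrative choices, not values from the slides:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, -1.0])

# Multiplication by a scalar: every element is multiplied by lambda
lam = 2.0
scaled = lam * a            # [2, 4, 6]

# Sum of two vectors: elementwise sum (parallelogram rule)
c = a + b                   # [5, 2, 2]

# Linear combination lambda_1 * a + lambda_2 * b
lin = 3.0 * a - 1.0 * b     # [-1, 6, 10]
```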

Figure: Sum of two vectors (parallelogram rule)

Inner (scalar) product

The inner product between the two n-dimensional vectors a and b
is defined as

    a'b = Σ_{i=1}^n a_i b_i

Example:

    a = [2, -1, 3]' ,  b = [5, -2, -3]' ,
    a'b = 2(5) + (-1)(-2) + 3(-3) = 3

Example: x̄ = (1/n) i'x is the average of the elements of x.

    n = 4 ,  x = [3, 4, 2, 7]' ,  i = [1, 1, 1, 1]' ,
    x̄ = (1/4) x'i = (1/4)(3·1 + 4·1 + 2·1 + 7·1) = 4

Example: let x and y be zero-mean vectors containing n
measurements on two variables ((1/n) i'x = (1/n) i'y = 0). Then
C_xy = (1/n) x'y is the sample covariance.
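The averaging identity can be reproduced directly; x is the slides' example, while y is an extra illustrative second variable:

```python
import numpy as np

x = np.array([3.0, 4.0, 2.0, 7.0])
n = len(x)
i = np.ones(n)

# Mean as a scaled inner product: (1/n) i'x
xbar = (1.0 / n) * (i @ x)          # 4.0

# Sample covariance of two zero-mean vectors: C_xy = (1/n) x'y
xc = x - x.mean()                    # centre x so it has zero mean
y = np.array([1.0, 2.0, 0.0, 5.0])   # illustrative second variable
yc = y - y.mean()
C_xy = (1.0 / n) * (xc @ yc)
```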

Vector norm (length)

By Pythagoras' theorem, the norm of a is the square root of the
inner product of a with itself:

    ||a|| = (a'a)^{1/2} = ( Σ_{i=1}^n a_i² )^{1/2}

This is the distance of the point a from the origin, i.e. the length
of the vector.

- ||a|| ≥ 0
- The normalized vector a/||a|| has unit length.

Example: let x be a zero-mean vector. Then SD_x = n^{-1/2} ||x|| is the
sample standard deviation of X; D_x = ||x||² is the deviance of X
and V_x = D_x / n is the variance.
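A short sketch of these norm identities; the zero-mean vector below is an illustrative choice:

```python
import numpy as np

a = np.array([3.0, 4.0])
norm_a = np.sqrt(a @ a)          # sqrt(9 + 16) = 5.0
unit = a / norm_a                # normalized vector, has unit length

# For a zero-mean x: SD_x = n^(-1/2) ||x||, D_x = ||x||^2, V_x = D_x / n
x = np.array([-1.0, 0.0, -2.0, 3.0])   # illustrative, already zero mean
n = len(x)
D_x = x @ x                      # deviance: 1 + 0 + 4 + 9 = 14
V_x = D_x / n                    # variance: 3.5
SD_x = np.sqrt(V_x)              # equals n**-0.5 * ||x||
```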

Euclidean distance

The distance between the vectors x_i and x_j is the norm of the
difference vector x_i - x_j:

    d_ij = ||x_i - x_j|| = [ (x_i - x_j)'(x_i - x_j) ]^{1/2}
         = ( Σ_{k=1}^p (x_ik - x_jk)² )^{1/2}
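A minimal numeric sketch of the distance formula, with two illustrative points:

```python
import numpy as np

xi = np.array([1.0, 2.0, 3.0])
xj = np.array([4.0, 6.0, 3.0])

# d_ij = ||x_i - x_j||, the norm of the difference vector
diff = xi - xj
d_ij = np.sqrt(diff @ diff)      # sqrt(9 + 16 + 0) = 5.0
```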

Orthogonality

Two vectors are orthogonal, a ⊥ b, iff their inner product is zero,
a'b = 0.

Examples:

    a = [1, -1]' ,  b = [1, 1]' ,
    ||a|| = ||b|| = √2 ,  a'b = 0 ,  a'c = -2

    c = [1, 3]' ,  d = [0.6, -0.2]' ,
    ||c|| = √10 = 3.16 ,  ||d|| = √0.4 = 0.63 ,  c'd = 0
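These orthogonality examples can be checked directly (vector signs as reconstructed here):

```python
import numpy as np

a = np.array([1.0, -1.0])
b = np.array([1.0, 1.0])
c = np.array([1.0, 3.0])
d = np.array([0.6, -0.2])

ab = a @ b                 # 0: a and b are orthogonal
cd = c @ d                 # 0 up to rounding: c and d are orthogonal
norm_c = np.sqrt(c @ c)    # sqrt(10), approximately 3.16
```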

Geometric interpretation
Let a, b ∈ V ⊆ R^n.
We can choose a scalar λ and a vector c, orthogonal to b (b ⊥ c),
so that, by the parallelogram law, a = c + λb.
The vector â = λb is the orthogonal projection of a along the
vector b.
Taking the inner product of both sides of a = c + λb with the
vector b yields

    a'b = c'b + λ b'b = λ ||b||²

Thus λ = a'b / ||b||², and we can write the projection as

    â = (a'b / ||b||²) b

(we can rearrange the formula in this way: â = b (b'b)^{-1} b'a).

Note: λ = (b'b)^{-1} b'a represents the coordinate of the point a in
the subspace generated by the vector b.
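The projection formula can be sketched with two illustrative vectors:

```python
import numpy as np

a = np.array([2.0, 3.0])
b = np.array([1.0, 0.0])

# Coordinate of a along b: lambda = (b'b)^{-1} b'a
lam = (b @ a) / (b @ b)      # 2.0
a_hat = lam * b              # orthogonal projection of a along b: [2, 0]
c = a - a_hat                # residual, orthogonal to b by construction
```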
Example: let x and y be zero-mean vectors containing n
measurements on two variables X and Y.
The scalar

    β = (x'x)^{-1} x'y = x'y / ||x||² = C_xy / V_x

is the coefficient of the regression of Y on X, and

    ŷ = (C_xy / V_x) x

is the vector of predicted values of Y.

Moreover,

    r_xy = x'y / (||x|| ||y||) = Σ_i x_i y_i / √( (Σ_i x_i²)(Σ_i y_i²) )

is the sample correlation coefficient.
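A sketch of the regression coefficient and correlation as inner products, on illustrative zero-mean data:

```python
import numpy as np

# Illustrative zero-mean data (each vector sums to zero)
x = np.array([-1.0, 0.0, 2.0, -1.0])
y = np.array([-2.0, 1.0, 3.0, -2.0])

beta = (x @ y) / (x @ x)     # regression coefficient C_xy / V_x
y_hat = beta * x             # vector of predicted values of Y
r_xy = (x @ y) / np.sqrt((x @ x) * (y @ y))   # sample correlation
```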

Matrices
A matrix is a rectangular (n × p), or two-dimensional, array of
numbers:

    X = [ x_11  x_12  ...  x_1k  ...  x_1p
          x_21  x_22  ...  x_2k  ...  x_2p
           ...   ...        ...        ...
          x_i1  x_i2  ...  x_ik  ...  x_ip
           ...   ...        ...        ...
          x_n1  x_n2  ...  x_nk  ...  x_np ]

The rows or columns can be considered as vectors.


We can also write X = {x_ik}. In a typical data matrix, the index
i = 1, ..., n refers to the statistical units, and the index
k = 1, ..., p to the variables or attributes.

We can represent X as a partitioned matrix whose generic block is
the 1 × p row vector x_i' = [x_i1, x_i2, ..., x_ik, ..., x_ip], which contains
the profile of the i-th row unit:

    X = [ x_1'
           ...
          x_i'
           ...
          x_n' ]

Alternatively, we can partition by columns:

    X = [x_1, x_2, ..., x_k, ..., x_p] ,

where x_k is the n × 1 column vector referring to the k-th variable.

Matrix manipulations
Matrix transpose: transposition yields the p × n matrix with rows
and columns interchanged:

    X' = [ x_11  x_21  ...  x_i1  ...  x_n1
           x_12  x_22  ...  x_i2  ...  x_n2
            ...   ...        ...        ...
           x_1k  x_2k  ...  x_ik  ...  x_nk
            ...   ...        ...        ...
           x_1p  x_2p  ...  x_ip  ...  x_np ]

Scalar multiplication: λX = {λx_ik}.
Matrix sum: C = X + Y, c_ik = x_ik + y_ik.

Matrix product
Let A be an n × m matrix whose i-th row is the 1 × m vector a_i'.
Let also B be an m × p matrix whose j-th column is the m × 1
vector b_j, so that

    A = [ a_1'               B = [b_1, ..., b_j, ..., b_p]
           ...
          a_i'
           ...
          a_n' ]

The matrix product C = AB, where A pre-multiplies B, is the
n × p matrix with elements

    c_ij = a_i' b_j = Σ_{k=1}^m a_ik b_kj ,   i = 1, ..., n;  j = 1, ..., p.

Properties:
- (A')' = A
- (A + B)' = A' + B'
- (AB)' = B'A'
- (AB)C = A(BC)
- A(B + C) = AB + AC

The matrix product is not commutative: if n ≠ p, BA is not even
defined.
Notice the difference between X'X (p × p matrix of crossproducts)
and XX' (n × n).

Examples:

    A = [ 1   0        B = [  2  1        i = [1, 1, 1]'
          5   1              -3  6 ]
          3  -2 ]

    C = AB = [  2   1
                7  11
               12  -9 ] ,   i'C = [21  3]

    BA is not defined.
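The product example can be verified numerically (matrix entries as reconstructed here):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [5.0, 1.0],
              [3.0, -2.0]])
B = np.array([[2.0, 1.0],
              [-3.0, 6.0]])
i = np.ones(3)

C = A @ B          # 3 x 2 product: A pre-multiplies B
col_sums = i @ C   # i'C sums each column of C
# B @ A would raise an error: shapes (2, 2) and (3, 2) are not aligned
```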

Mean vector:

    x̄ = (1/n) X'i = [x̄_1, ..., x̄_k, ..., x̄_p]'   (p × 1)

    (1/n) i'X = [x̄_1, ..., x̄_k, ..., x̄_p]   (1 × p)
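The mean-vector identity can be sketched on a small illustrative data matrix:

```python
import numpy as np

# Illustrative 4 x 2 data matrix (n = 4 units, p = 2 variables)
X = np.array([[3.0, 1.0],
              [4.0, 2.0],
              [2.0, 0.0],
              [7.0, 5.0]])
n = X.shape[0]
i = np.ones(n)

xbar = (1.0 / n) * (X.T @ i)      # p x 1 mean vector: [4, 2]
row_mean = (1.0 / n) * (i @ X)    # 1 x p version, same values
```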

Important special cases

- A square matrix has as many rows as columns (m = n).
- A square matrix A is symmetric if A' = A.
- Diagonal matrix: a square matrix whose off-diagonal entries are
  all zero:

      D = [ d_1   0   ...    0      0
             0   d_2  ...    0      0
            ...  ...         ...   ...
             0    0   ...  d_{n-1}  0
             0    0   ...    0     d_n ]  = diag(d_1, ..., d_n)

- Identity matrix I = diag(1, ..., 1): if A is n × m, I_n A = A and
  A I_m = A.

      I = [ 1  0  ...  0  0
            0  1  ...  0  0
           ... ...    ... ...
            0  0  ...  1  0
            0  0  ...  0  1 ]

Example: variance-covariance matrix S (p × p):

    S = [ s_1²  s_12  ...  s_1k  ...  s_1p
          s_21  s_2²  ...  s_2k  ...  s_2p
           ...   ...        ...        ...
          s_k1  s_k2  ...  s_k²  ...  s_kp
           ...   ...        ...        ...
          s_p1  s_p2  ...  s_pk  ...  s_p² ]

    s_k² = (1/n) Σ_{i=1}^n (x_ik - x̄_k)² ,
    s_hk = (1/n) Σ_{i=1}^n (x_ih - x̄_h)(x_ik - x̄_k)

S is symmetric (s_hk = s_kh) and positive semi-definite.

In terms of the original data matrix X,

    S = (1/n) (X - i_n x̄')'(X - i_n x̄') = (1/n) X'X - x̄ x̄' ,

where i_n is an n × 1 vector of ones.
In terms of the unit profiles,

    S = (1/n) Σ_{i=1}^n (x_i - x̄)(x_i - x̄)'
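Both matrix expressions for S can be checked on a small illustrative data matrix:

```python
import numpy as np

X = np.array([[3.0, 1.0],
              [4.0, 2.0],
              [2.0, 0.0],
              [7.0, 5.0]])
n = X.shape[0]
i_n = np.ones((n, 1))
xbar = X.mean(axis=0).reshape(-1, 1)   # p x 1 mean vector

# S = (1/n) (X - i_n xbar')'(X - i_n xbar')
Xc = X - i_n @ xbar.T                  # centred data matrix
S1 = (1.0 / n) * Xc.T @ Xc

# Equivalent form: S = (1/n) X'X - xbar xbar'
S2 = (1.0 / n) * X.T @ X - xbar @ xbar.T
```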

Matrix rank

Let A be n × m.
Column (row) space: the vector space generated by the column
(row) vectors that form the matrix A.
Column (row) rank: the dimension of the vector space generated
by the columns (rows) of the matrix A.
The column and row rank coincide, so we can define the rank of
the matrix as the maximum number of linearly independent
vectors (those forming either the rows or the columns) and denote
it by r(A). Obviously, r(A) ≤ min(n, m).
If r(A) = m, the matrix has full column rank.
Note: if Ab = 0 for some b ≠ 0, the columns of A are linearly
dependent and A has reduced rank.
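A reduced-rank matrix can be sketched numerically; the matrix below is constructed so that its third column is the sum of the first two:

```python
import numpy as np

# Columns are linearly dependent: col3 = col1 + col2
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 3.0, 5.0]])

r = np.linalg.matrix_rank(A)     # 2 < 3: reduced rank
b = np.array([1.0, 1.0, -1.0])   # nonzero b with A b = 0
```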

Determinant
Let A be n × n. Its determinant, det(A) or |A|, is a scalar whose
absolute value measures the volume of the parallelepiped delimited
in R^n by the columns of A.
For the identity matrix, |I| = 1.
For D = diag(d_1, ..., d_n),

    |D| = d_1 d_2 ··· d_n = Π_{i=1}^n d_i

Moreover, if λ is a scalar, |λD| = λ^n |D|.
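The diagonal-determinant rule and the scaling rule can be checked on an illustrative diagonal matrix:

```python
import numpy as np

D = np.diag([2.0, 3.0, 4.0])
detD = np.linalg.det(D)            # product of the diagonal: 24

lam = 2.0
# |lam * D| = lam^n |D| for an n x n matrix (here n = 3)
det_lamD = np.linalg.det(lam * D)  # 2**3 * 24 = 192
```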


If A is 2 × 2,

    |A| = | a_11  a_12 | = a_11 a_22 - a_12 a_21
          | a_21  a_22 |

The general expression for the determinant is the following Laplace
(cofactor) expansion:

    |A| = Σ_{j=1}^n a_ij (-1)^{i+j} |A_ij| ,

where A_ij is the submatrix obtained from A by removing the i-th
row and the j-th column; |A_ij| is called a minor of A and
(-1)^{i+j} |A_ij| is a cofactor.
(We do not need to bother with this formula. The computation
of the determinant is usually carried out by transforming the
matrix into a simpler form, e.g. triangular.)
Properties:
- If the columns (rows) of A are linearly dependent, so that
  rank(A) < n, then |A| = 0.
- |AB| = |A| |B|.
- |A'| = |A|.

Trace of a matrix

The trace of a square matrix is the sum of its diagonal elements. If
A is n × n:

    tr(A) = Σ_{i=1}^n a_ii

Properties:
- tr(λA) = λ tr(A)
- tr(A + B) = tr(A) + tr(B)
- tr(A') = tr(A)
- tr(AB) = tr(BA)

Linear equation systems

Consider the system of n linear equations in n unknowns, where A
is a known n × n coefficient matrix and b a known n × 1 vector:

    Ax = b

The system is said to be homogeneous if b = 0.
In that case there exist nontrivial solutions (x ≠ 0) iff A has
reduced rank (rank(A) < n), i.e. the columns of A are linearly
dependent, or equivalently |A| = 0.
A non-homogeneous system admits a unique solution iff |A| ≠ 0,
or equivalently rank(A) = n. In that case the solution can be
written as

    x = A^{-1} b

where A^{-1} is the inverse of A.
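A non-homogeneous system with |A| ≠ 0 can be solved directly; the coefficients below are illustrative:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# |A| = 2*3 - 1*1 = 5 != 0, so a unique solution exists.
# np.linalg.solve is preferred over forming A^{-1} explicitly.
x = np.linalg.solve(A, b)
```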

Matrix Inverse
Let A be a square matrix of dimension p with full rank (r(A) = p).
The inverse matrix is the matrix B which when pre-multiplied or
post-multiplied by A returns the identity matrix:
BA = I, AB = I
In the sequel we shall write B = A1 .
The inverse exists and is unique iff r(A) = p, in which case we say
that A is non singular or invertible.
For a diagonal matrix the computation of the inverse is immediate:
D1 = diag(1/d1 , . . . , 1/dn ).
In general, the computation of the inverse entails the solution of a
system of p² linear equations in p² unknowns.

The solution can be expressed as follows:

    A^{-1} = (1/|A|) A* ,

where A* is known as the adjoint matrix of A, with elements given
by the cofactors of A:

    a*_ji = (-1)^{i+j} |A_ij|

We now illustrate the 2 × 2 case. From the definition of an inverse,
AB = I, it follows that

    [ a_11  a_12 ] [ b_11  b_12 ]   [ 1  0 ]
    [ a_21  a_22 ] [ b_21  b_22 ] = [ 0  1 ]

This yields a system of 4 equations in 4 unknowns with solution:

    [ b_11  b_12 ]            1            [  a_22  -a_12 ]
    [ b_21  b_22 ] = ------------------- · [ -a_21   a_11 ]
                     a_11 a_22 - a_12 a_21

                       1   [  a_22  -a_12 ]
                   = --- · [ -a_21   a_11 ]
                     |A|
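The 2 × 2 adjoint formula can be sketched and checked against the general-purpose inverse:

```python
import numpy as np

A = np.array([[1.0, 0.2],
              [0.2, 1.0]])

detA = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # 0.96
# 2 x 2 inverse via the adjoint formula: swap diagonal, negate off-diagonal
A_inv = (1.0 / detA) * np.array([[A[1, 1], -A[0, 1]],
                                 [-A[1, 0], A[0, 0]]])
```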

Examples:

    S = diag{10, 20, 1/2} ;  S^{-1} = diag{0.1, 0.05, 2}

    S = diag{10, 20, 0} ;  S^{-1} does not exist

    R = [ 1.0  0.2    ;  R^{-1} = [  1.04  -0.21
          0.2  1.0 ]               -0.21   1.04 ]

    R = [ 1.0  0.9    ;  R^{-1} = [  5.26  -4.74
          0.9  1.0 ]               -4.74   5.26 ]

    R = [ 1.0 -0.9    ;  R^{-1} = [ 5.26  4.74
         -0.9  1.0 ]               4.74  5.26 ]

    R = [ 1 -1    ;  R^{-1} does not exist
         -1  1 ]

    R = [ 1  1    ;  R^{-1} does not exist
          1  1 ]

    X = [ 10  20  15
          15  15  15
           5   3   4
          20  32  26 ] ;  (X'X)^{-1} does not exist

Properties of the matrix inverse

Some useful properties of the matrix inverse:
- |A^{-1}| = 1/|A|
- (A^{-1})^{-1} = A
- (A^{-1})' = (A')^{-1}
- (AB)^{-1} = B^{-1} A^{-1}

Orthogonal matrix: a square matrix is orthogonal if its transpose
and its inverse coincide:

    A'A = I ,   AA' = I
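A standard instance of an orthogonal matrix is a 2 × 2 rotation; the angle below is an illustrative choice:

```python
import numpy as np

# A rotation matrix is orthogonal: its transpose equals its inverse
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

check1 = Q.T @ Q   # identity matrix
check2 = Q @ Q.T   # identity matrix
```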
