A Summary of

MATH1231 Mathematics 1B


For MATH1231 students, this summary is an extract from the Algebra Notes in the MATH1231/41
Course Pack for revision.
If you find any mistakes or typos, please send me an email.


Chapter 6


Definitions and examples of vector spaces

Definition 1. A vector space V over the field F is a non-empty set V of vectors
on which addition of vectors is defined and multiplication by a scalar is defined in
such a way that the following ten fundamental properties are satisfied:
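1. Closure under Addition. If $\mathbf{u}, \mathbf{v} \in V$, then $\mathbf{u} + \mathbf{v} \in V$.
2. Associative Law of Addition. If $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$, then $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$.
3. Commutative Law of Addition. If $\mathbf{u}, \mathbf{v} \in V$, then $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.
4. Existence of Zero. There exists an element $\mathbf{0} \in V$ such that, for all $\mathbf{v} \in V$, $\mathbf{v} + \mathbf{0} = \mathbf{v}$.
5. Existence of Negative. For each $\mathbf{v} \in V$ there exists an element $\mathbf{w} \in V$ (usually written as $-\mathbf{v}$), such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.
6. Closure under Multiplication by a Scalar. If $\mathbf{v} \in V$ and $\lambda \in F$, then $\lambda\mathbf{v} \in V$.
7. Associative Law of Multiplication by a Scalar. If $\lambda, \mu \in F$ and $\mathbf{v} \in V$, then $\lambda(\mu\mathbf{v}) = (\lambda\mu)\mathbf{v}$.
8. If $\mathbf{v} \in V$ and $1 \in F$ is the scalar one, then $1\mathbf{v} = \mathbf{v}$.
9. Scalar Distributive Law. If $\lambda, \mu \in F$ and $\mathbf{v} \in V$, then $(\lambda + \mu)\mathbf{v} = \lambda\mathbf{v} + \mu\mathbf{v}$.
10. Vector Distributive Law. If $\lambda \in F$ and $\mathbf{u}, \mathbf{v} \in V$, then $\lambda(\mathbf{u} + \mathbf{v}) = \lambda\mathbf{u} + \lambda\mathbf{v}$.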

The following are vector spaces.

$\mathbb{R}^n$ over $\mathbb{R}$, where $n$ is a positive integer.
$\mathbb{C}^n$ over $\mathbb{C}$, where $n$ is a positive integer.
$P(F)$ and $P_n(F)$ over $F$, where $F$ is a field. Usually $F$ is either $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$.
$M_{mn}(F)$ over $F$, where $m, n$ are positive integers and $F$ is a field.

Furthermore, the following set, its subset of all continuous functions and its subset of all differentiable functions are vector spaces over $\mathbb{R}$.
$\mathcal{R}[X]$, the set of all possible real-valued functions with domain $X$.


Vector arithmetic

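Proposition 1. In any vector space $V$, the following properties hold for addition.
1. Uniqueness of Zero. There is one and only one zero vector.
2. Cancellation Property. If $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ satisfy $\mathbf{u} + \mathbf{v} = \mathbf{u} + \mathbf{w}$, then $\mathbf{v} = \mathbf{w}$.
3. Uniqueness of Negatives. For all $\mathbf{v} \in V$, there exists only one $\mathbf{w} \in V$ such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$.
Proposition 2. Suppose that $V$ is a vector space over a field $F$, $\lambda \in F$, $\mathbf{v} \in V$, $0$ is the zero scalar in $F$ and $\mathbf{0}$ is the zero vector in $V$. Then the following properties hold for multiplication by a scalar:
1. Multiplication by the zero scalar. $0\mathbf{v} = \mathbf{0}$.
2. Multiplication of the zero vector. $\lambda\mathbf{0} = \mathbf{0}$.
3. Multiplication by $-1$. $(-1)\mathbf{v} = -\mathbf{v}$ (the additive inverse of $\mathbf{v}$).
4. Zero products. If $\lambda\mathbf{v} = \mathbf{0}$, then either $\lambda = 0$ or $\mathbf{v} = \mathbf{0}$.
5. Cancellation Property. If $\mu \in F$, $\lambda\mathbf{v} = \mu\mathbf{v}$ and $\mathbf{v} \neq \mathbf{0}$, then $\lambda = \mu$.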


Definition 1. A subset S of a vector space V is called a subspace of V if S is
itself a vector space over the same field of scalars as V and under the same rules for
addition and multiplication by scalars.
In addition if there is at least one vector in V which is not contained in S, the
subspace S is called a proper subspace of V .

Theorem 1 (Subspace Theorem). A subset S of a vector space V over a field F, under the same
rules for addition and multiplication by scalars, is a subspace of V if and only if
i) The vector 0 in V also belongs to S.
ii) S is closed under vector addition, and
iii) S is closed under multiplication by scalars from F.



Linear combinations and spans

Definition 1. Let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a finite set of vectors in a vector space $V$
over a field $F$. Then a linear combination of $S$ is a sum of scalar multiples of the form
$\lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n$ with $\lambda_1, \dots, \lambda_n \in F$.

Proposition 1 (Closure under Linear Combinations). If S is a finite set of vectors in a vector

space V , then every linear combination of S is also a vector in V .

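Definition 2. Let $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a finite set of vectors in a vector space $V$
over a field $F$. Then the span of the set $S$ is the set of all linear combinations of $S$,
that is,
$\operatorname{span}(S) = \operatorname{span}(\mathbf{v}_1, \dots, \mathbf{v}_n) = \{\mathbf{v} \in V : \mathbf{v} = \lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n \text{ for some } \lambda_1, \dots, \lambda_n \in F\}$.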

Theorem 2 (A span is a subspace). If S is a finite, non-empty set of vectors in a vector space V ,

then span(S) is a subspace of V . Further, span(S) is the smallest subspace containing S (in the
sense that span(S) is a subspace of every subspace which contains S).

Definition 3. A finite set S of vectors in a vector space V is called a spanning

set for V if span (S) = V or equivalently, if every vector in V can be expressed as
a linear combination of vectors in S.


Matrices and spans in $\mathbb{R}^m$

Proposition 3 (Matrices, Linear Combinations and Spans). If $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a set of
vectors in $\mathbb{R}^m$ and $A$ is the $m \times n$ matrix whose columns are the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ then
a) a vector $\mathbf{b}$ in $\mathbb{R}^m$ can be expressed as a linear combination of $S$ if and only if it can be
expressed in the form $A\mathbf{x}$ for some $\mathbf{x}$ in $\mathbb{R}^n$,
b) a vector $\mathbf{b}$ in $\mathbb{R}^m$ belongs to $\operatorname{span}(S)$ if and only if the equation $A\mathbf{x} = \mathbf{b}$ has a solution $\mathbf{x}$ in
$\mathbb{R}^n$.
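
Part b) can be checked mechanically. The following is a minimal sketch, not taken from the notes: the function names `row_echelon` and `in_span` are hypothetical, and exact rational arithmetic is assumed so that no rounding issues arise. It row-reduces the augmented matrix $[A|\mathbf{b}]$ and reports that $\mathbf{b}$ is in the span exactly when the right-hand column is non-leading.

```python
from fractions import Fraction

def row_echelon(rows):
    """Return a row-echelon form and the indices of its leading columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    leading, r = [], 0
    for c in range(len(m[0])):
        # Find a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # column c is non-leading
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        leading.append(c)
        r += 1
    return m, leading

def in_span(vectors, b):
    """True if b is a linear combination of the given vectors."""
    n = len(vectors)
    # Augmented matrix [A | b], where the vectors are the columns of A.
    augmented = [[v[i] for v in vectors] + [b[i]] for i in range(len(b))]
    _, leading = row_echelon(augmented)
    # Ax = b is consistent exactly when the last column is non-leading.
    return n not in leading

print(in_span([(1, 0, 1), (0, 1, 1)], (2, 3, 5)))
print(in_span([(1, 0, 1), (0, 1, 1)], (1, 1, 0)))
```

Here $(2, 3, 5) = 2(1, 0, 1) + 3(0, 1, 1)$ lies in the span, while $(1, 1, 0)$ does not.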

Definition 4. The subspace of $\mathbb{R}^m$ spanned by the columns of an $m \times n$ matrix $A$
is called the column space of $A$ and is denoted by $\operatorname{col}(A)$.




Solving problems about spans

Linear independence
Definition 1. Suppose that $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a subset of a vector space. The
set $S$ is a linearly independent set if the only values of the scalars $\lambda_1, \lambda_2, \dots, \lambda_n$
for which
$\lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n = \mathbf{0}$
are $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$.

Definition 2. Suppose that $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a subset of a vector space. The
set $S$ is a linearly dependent set if it is not a linearly independent
set, that is, if there exist scalars $\lambda_1, \dots, \lambda_n$, not all zero, such that
$\lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n = \mathbf{0}$.


Solving problems about linear independence

We have seen that questions about spans in $\mathbb{R}^m$ can be answered by relating them to questions
about the existence of solutions for systems of linear equations. The same is true for questions
about linear dependence in $\mathbb{R}^m$.
Proposition 1. If $S = \{\mathbf{a}_1, \dots, \mathbf{a}_n\}$ is a set of vectors in $\mathbb{R}^m$ and $A$ is the $m \times n$ matrix whose
columns are the vectors $\mathbf{a}_1, \dots, \mathbf{a}_n$ then the set $S$ is linearly dependent if and only if the system
$A\mathbf{x} = \mathbf{0}$ has at least one non-zero solution $\mathbf{x} \in \mathbb{R}^n$.
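
As a small worked illustration of this proposition (the vectors are chosen here purely as an example and do not come from the notes), take two vectors in $\mathbb{R}^2$ where one is a multiple of the other:

```latex
\[
\mathbf{a}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix},\quad
\mathbf{a}_2 = \begin{pmatrix} 2 \\ 4 \end{pmatrix},\quad
A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix},\quad
A \begin{pmatrix} 2 \\ -1 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 2 + 2 \cdot (-1) \\ 2 \cdot 2 + 4 \cdot (-1) \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
```

Since $\mathbf{x} = (2, -1)^T$ is a non-zero solution of $A\mathbf{x} = \mathbf{0}$, the set $\{\mathbf{a}_1, \mathbf{a}_2\}$ is linearly dependent; equivalently, $2\mathbf{a}_1 - \mathbf{a}_2 = \mathbf{0}$.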


Uniqueness and linear independence

Theorem 2 (Uniqueness of Linear Combinations). Let S be a finite, non-empty set of vectors

in a vector space and let v be a vector which can be written as a linear combination of S. Then
the values of the scalars in the linear combination for v are unique if and only if S is a linearly
independent set.


Spans and linear independence

Theorem 3. A set of vectors S is a linearly independent set if and only if no vector in S can be
written as a linear combination of the other vectors in S, that is, if and only if no vector in S is
in the span of the other vectors in S.
Note. The theorem is equivalent to:
A set of vectors S is a linearly dependent set if and only if at least one vector in S is in the
span of the other vectors in S.
Theorem 4. If $S$ is a finite subset of a vector space $V$ and the vector $\mathbf{v}$ is in $V$, then
$\operatorname{span}(S \cup \{\mathbf{v}\}) = \operatorname{span}(S)$ if and only if $\mathbf{v} \in \operatorname{span}(S)$.
Theorem 5. Suppose that $S$ is a finite subset of a vector space. The span of every proper subset
of $S$ is a proper subspace of $\operatorname{span}(S)$ if and only if $S$ is a linearly independent set.
Theorem 6. If $S$ is a finite linearly independent subset of a vector space $V$ and $\mathbf{v}$ is in $V$ but not
in $\operatorname{span}(S)$ then $S \cup \{\mathbf{v}\}$ is a linearly independent set.



Basis and dimension


Definition 1. A set of vectors B in a vector space V is called a basis for V if:
1. B is a linearly independent set, and
2. B is a spanning set for V (that is, span (B) = V ).

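Let $B = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis for a vector space $V$ over $F$. Every vector $\mathbf{v} \in V$ can be
uniquely written as
$\mathbf{v} = \lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n$,
where $\lambda_1, \dots, \lambda_n \in F$.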

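An orthonormal basis is a basis whose elements all have length 1 and are mutually orthogonal.
The advantage of an orthonormal basis is that the coefficients in the unique linear combination
for any vector can be found easily with dot products: if $\{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ is an orthonormal basis
for $\mathbb{R}^n$, then $\mathbf{v} = (\mathbf{v} \cdot \mathbf{b}_1)\mathbf{b}_1 + \cdots + (\mathbf{v} \cdot \mathbf{b}_n)\mathbf{b}_n$.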



Theorem 1. The number of vectors in any spanning set for a vector space V is always greater
than or equal to the number of vectors in any linearly independent set in V .
Theorem 2. If a vector space V has a finite basis then every set of basis vectors for V contains
the same number of vectors, that is, if B1 = {u1 , . . . , um } and B2 = {v1 , . . . , vn } are two bases for
the same vector space V then m = n.

Definition 2. If V is a vector space with a finite basis, the dimension of V ,

denoted by dim(V ), is the number of vectors in any basis for V . V is called a finite
dimensional vector space.

Theorem 3. Suppose that $V$ is a finite dimensional vector space. Then

1. the number of vectors in any spanning set for $V$ is greater than or equal to the dimension of $V$;
2. the number of vectors in any linearly independent set in $V$ is less than or equal to the dimension of $V$;
3. if the number of vectors in a spanning set is equal to the dimension then the set is also a
linearly independent set and hence a basis for $V$;
4. if the number of vectors in a linearly independent set is equal to the dimension then the set
is also a spanning set and hence a basis for $V$.



Existence and construction of bases

Theorem 4. If S is a finite non-empty subset of a vector space then S contains a subset which is
a basis for span (S).
In particular, if V is any non-zero vector space which can be spanned by a finite set of vectors
then V has a basis.
Theorem 5. Suppose that V is a vector space which can be spanned by a finite set of vectors. If S
is a linearly independent subset of V then there exists a basis for V which contains S as a subset.
In other words, every linearly independent subset of V can be extended to a basis for V .
Theorem 6 (Reducing a spanning set to a basis in $\mathbb{R}^m$). Suppose that $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is any
subset of $\mathbb{R}^m$ and $A$ is the matrix whose columns are the members of $S$. If $U$ is a row-echelon form
for $A$ and $S'$ is created from $S$ by deleting those vectors which correspond to non-leading columns
in $U$ then $S'$ is a basis for $\operatorname{span}(S)$.
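
The procedure in Theorem 6 can be sketched in code. This is an illustrative sketch only: the names `leading_columns` and `reduce_to_basis` are hypothetical, and exact fractions are assumed so that the test for a zero pivot is reliable.

```python
from fractions import Fraction

def leading_columns(columns):
    """Row-reduce the matrix with the given columns; return leading column indices."""
    rows = [[Fraction(col[i]) for col in columns] for i in range(len(columns[0]))]
    leading, r = [], 0
    for c in range(len(columns)):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # non-leading column: the corresponding vector is deleted
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        leading.append(c)
        r += 1
    return leading

def reduce_to_basis(vectors):
    """Keep only the vectors that correspond to leading columns."""
    return [vectors[c] for c in leading_columns(vectors)]

S = [(1, 0, 1), (2, 0, 2), (0, 1, 1), (1, 1, 2)]
print(reduce_to_basis(S))
```

In this example the second vector is twice the first and the fourth is the sum of the first and third, so only the first and third vectors are kept.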
Theorem 7 (Extending a linearly independent set to a basis in $\mathbb{R}^m$).
Suppose that $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a linearly independent subset of $\mathbb{R}^m$ and $A$ is the matrix whose
columns are the members of $S$ followed by the members of the standard basis for $\mathbb{R}^m$. If $U$ is a
row-echelon form for $A$ and $S'$ is created by choosing those columns of $A$ which correspond to leading
columns in $U$ then $S'$ is a basis for $\mathbb{R}^m$ containing $S$ as a subset.
Proposition 8. If V is a finite-dimensional space and W is a subspace of V and dim(W ) = dim(V )
then W = V .

Chapter 7


Introduction to linear maps

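Definition 1. Let $V$ and $W$ be two vector spaces over the same field $F$. A function
$T : V \to W$ is called a linear map or linear transformation if the following two
conditions are satisfied.
Addition Condition. $T(\mathbf{v} + \mathbf{v}') = T(\mathbf{v}) + T(\mathbf{v}')$ for all $\mathbf{v}, \mathbf{v}' \in V$, and
Scalar Multiplication Condition. $T(\lambda\mathbf{v}) = \lambda T(\mathbf{v})$ for all $\lambda \in F$ and $\mathbf{v} \in V$.

Proposition 1. If $T : V \to W$ is a linear map, then

1. $T(\mathbf{0}) = \mathbf{0}$ and
2. $T(-\mathbf{v}) = -T(\mathbf{v})$ for all $\mathbf{v} \in V$.
Theorem 2. A function $T : V \to W$ is a linear map if and only if for all $\lambda_1, \lambda_2 \in F$ and $\mathbf{v}_1, \mathbf{v}_2 \in V$
$T(\lambda_1\mathbf{v}_1 + \lambda_2\mathbf{v}_2) = \lambda_1 T(\mathbf{v}_1) + \lambda_2 T(\mathbf{v}_2)$.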


Theorem 3. If $T$ is a linear map with domain $V$ and $S$ is a set of vectors in $V$, then the function
value of a linear combination of $S$ is equal to the corresponding linear combination of the function
values of $S$, that is, if $S = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ and $\lambda_1, \dots, \lambda_n$ are scalars, then
$T(\lambda_1\mathbf{v}_1 + \cdots + \lambda_n\mathbf{v}_n) = \lambda_1 T(\mathbf{v}_1) + \cdots + \lambda_n T(\mathbf{v}_n)$.
Theorem 4. For a linear map $T : V \to W$, the function values for every vector in the domain are
known if and only if the function values for a basis of the domain are known.
Further, if $B = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis for the domain $V$ then for all $\mathbf{v} \in V$ we have
$T(\mathbf{v}) = x_1 T(\mathbf{v}_1) + \cdots + x_n T(\mathbf{v}_n)$,
where $x_1, \dots, x_n$ are the scalars in the unique linear combination $\mathbf{v} = x_1\mathbf{v}_1 + \cdots + x_n\mathbf{v}_n$ of the basis vectors.



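Linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$ and $m \times n$ matrices

Theorem 1. For each $m \times n$ matrix $A$, the function $T_A : \mathbb{R}^n \to \mathbb{R}^m$, defined by
$T_A(\mathbf{x}) = A\mathbf{x}$ for $\mathbf{x} \in \mathbb{R}^n$,
is a linear map.
Theorem 2 (Matrix Representation Theorem). Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map and let the
vectors $\mathbf{e}_j$ for $1 \le j \le n$ be the standard basis vectors for $\mathbb{R}^n$. Then the $m \times n$ matrix $A$ whose
columns are given by
$\mathbf{a}_j = T(\mathbf{e}_j)$ for $1 \le j \le n$
has the property that
$T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$.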
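
The Matrix Representation Theorem translates directly into a recipe: apply $T$ to each standard basis vector and use the results as columns. The sketch below assumes a made-up linear map `T` from $\mathbb{R}^2$ to $\mathbb{R}^3$; the helper names `matrix_of` and `apply` are hypothetical and not part of the notes.

```python
def matrix_of(T, n):
    """Build the matrix (as a list of rows) whose j-th column is T(e_j)."""
    columns = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Compute the matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def T(x):
    # Example linear map (an assumption for illustration): T(x, y) = (x + 2y, 3y, x).
    return [x[0] + 2 * x[1], 3 * x[1], x[0]]

A = matrix_of(T, 2)
print(A)
print(apply(A, [4, -1]), T([4, -1]))
```

The two printed vectors agree, which is what the theorem predicts for any input once $T$ is linear.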


Geometric examples of linear transformations

Proposition 1. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map. It maps a line in $\mathbb{R}^n$ to either a
line or a point in $\mathbb{R}^m$.


Subspaces associated with linear maps

The kernel of a map

Definition 1. Let $T : V \to W$ be a linear map. Then the kernel of $T$ (written
$\ker(T)$) is the set of all zeroes of $T$, that is, it is the subset of the domain $V$ defined by
$\ker(T) = \{\mathbf{v} \in V : T(\mathbf{v}) = \mathbf{0}\}$.

Definition 2. For an $m \times n$ matrix $A$, the kernel of $A$ is the subset of $\mathbb{R}^n$ defined by
$\ker(A) = \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\}$,
that is, it is the set of all solutions of the homogeneous equation $A\mathbf{x} = \mathbf{0}$.
Theorem 1. If $T : V \to W$ is a linear map, then $\ker(T)$ is a subspace of the domain $V$.
Definition 3. The nullity of a linear map $T$ is the dimension of $\ker(T)$. The
nullity of a matrix $A$ is the dimension of $\ker(A)$.
Proposition 2. Let $A$ be an $m \times n$ matrix with real entries and $T_A : \mathbb{R}^n \to \mathbb{R}^m$ the associated
linear transformation. Then
$\ker(T_A) = \ker(A)$.


Proposition 3. For a matrix A:

nullity(A) = maximum number of independent vectors in the solution space of Ax = 0
= number of parameters in the solution of Ax = 0 obtained by Gaussian
elimination and back substitution
= number of non-leading columns in an equivalent row-echelon form U for A.

Proposition 4. The columns of a matrix A are linearly independent if and only if nullity(A) = 0.


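Definition 4. Let $T : V \to W$ be a linear map. Then the image of $T$ is the set
of all function values of $T$, that is, it is the subset of the codomain $W$ defined by
$\operatorname{im}(T) = \{\mathbf{w} \in W : \mathbf{w} = T(\mathbf{v}) \text{ for some } \mathbf{v} \in V\}$.

Definition 5. The image of an $m \times n$ matrix $A$ is the subset of $\mathbb{R}^m$ defined by
$\operatorname{im}(A) = \{\mathbf{b} \in \mathbb{R}^m : \mathbf{b} = A\mathbf{x} \text{ for some } \mathbf{x} \in \mathbb{R}^n\}$.
Theorem 5. Let $T : V \to W$ be a linear map between vector spaces $V$ and $W$. Then $\operatorname{im}(T)$ is a
subspace of the codomain $W$ of $T$.
Proposition 6. Let $A$ be an $m \times n$ matrix with real entries and $T_A : \mathbb{R}^n \to \mathbb{R}^m$ the associated
linear transformation. Then
$\operatorname{im}(A) = \operatorname{im}(T_A)$.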

Definition 6. The rank of a linear map T is the dimension of im(T ). The rank
of a matrix A is the dimension of im(A).
Proposition 7. For a matrix A:
rank(A) = maximal number of linearly independent columns of A
= number of leading columns in a row-echelon form U for A


Rank, nullity and solutions of Ax = b

Theorem 8 (Rank-Nullity Theorem for Matrices). For any matrix A,

rank(A) + nullity(A) = number of columns of A.
Theorem 9 (Rank-Nullity Theorem). Suppose $V$ and $W$ are finite dimensional vector spaces and
$T : V \to W$ is linear. Then
$\operatorname{rank}(T) + \operatorname{nullity}(T) = \dim(V)$.
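
A short worked check of the Rank-Nullity Theorem for matrices may help; the matrix below is an arbitrary example chosen for illustration and is not from the notes. Subtracting twice the first row from the second gives a row-echelon form:

```latex
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{pmatrix}
\;\xrightarrow{\,R_2 \to R_2 - 2R_1\,}\;
U = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
\operatorname{rank}(A) + \operatorname{nullity}(A) = 1 + 2 = 3 .
\]
```

$U$ has one leading column (so the rank is 1) and two non-leading columns (so the nullity is 2), and the sum equals the number of columns of $A$.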



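Theorem 10. The equation $A\mathbf{x} = \mathbf{b}$ has:

1. no solution if $\operatorname{rank}(A) \neq \operatorname{rank}([A|\mathbf{b}])$, and
2. at least one solution if $\operatorname{rank}(A) = \operatorname{rank}([A|\mathbf{b}])$. Further,
i) if $\operatorname{nullity}(A) = 0$ the solution is unique, whereas,
ii) if $\operatorname{nullity}(A) = \nu > 0$, then the general solution is of the form
$\mathbf{x} = \mathbf{x}_p + \lambda_1\mathbf{k}_1 + \cdots + \lambda_\nu\mathbf{k}_\nu$ for $\lambda_1, \dots, \lambda_\nu \in \mathbb{R}$,
where $\mathbf{x}_p$ is any solution of $A\mathbf{x} = \mathbf{b}$, and where $\{\mathbf{k}_1, \dots, \mathbf{k}_\nu\}$ is a basis for $\ker(A)$.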

Chapter 8


Definitions and examples

Definition 1. Let $T : V \to V$ be a linear map. Then if a scalar $\lambda$ and non-zero
vector $\mathbf{v} \in V$ satisfy
$T(\mathbf{v}) = \lambda\mathbf{v}$,
then $\lambda$ is called an eigenvalue of $T$ and $\mathbf{v}$ is called an eigenvector of $T$ for the
eigenvalue $\lambda$.

Note. An eigenvector is non-zero, but zero can be an eigenvalue.

Definition 2. Let $A \in M_{nn}(\mathbb{C})$ be a square matrix. Then if a scalar $\lambda \in \mathbb{C}$ and
non-zero vector $\mathbf{v} \in \mathbb{C}^n$ satisfy
$A\mathbf{v} = \lambda\mathbf{v}$,
then $\lambda$ is called an eigenvalue of $A$ and $\mathbf{v}$ is called an eigenvector of $A$ for the
eigenvalue $\lambda$.


Some fundamental results

Theorem 1. A scalar $\lambda$ is an eigenvalue of a square matrix $A$ if and only if $\det(A - \lambda I) = 0$, and
then $\mathbf{v}$ is an eigenvector of $A$ for the eigenvalue $\lambda$ if and only if $\mathbf{v}$ is a non-zero solution of the
homogeneous equation $(A - \lambda I)\mathbf{v} = \mathbf{0}$, i.e., if and only if $\mathbf{v} \in \ker(A - \lambda I)$ and $\mathbf{v} \neq \mathbf{0}$.
Theorem 2. If $A$ is an $n \times n$ matrix and $\lambda \in \mathbb{C}$, then $\det(A - \lambda I)$ is a complex polynomial of
degree $n$ in $\lambda$.

Definition 3. For a square matrix $A$, the polynomial $p(\lambda) = \det(A - \lambda I)$ is called
the characteristic polynomial for the matrix $A$.

Theorem 3. An $n \times n$ matrix $A$ has exactly $n$ eigenvalues in $\mathbb{C}$ (counted according to their
multiplicities). These eigenvalues are the zeroes of the characteristic polynomial $p(\lambda) = \det(A - \lambda I)$.
Note. The equation $p(\lambda) = 0$ is called the characteristic equation for $A$.



Calculation of eigenvalues and eigenvectors

Eigenvectors, bases, and diagonalisation

Theorem 1. If an $n \times n$ matrix has $n$ distinct eigenvalues then it has $n$ linearly independent
eigenvectors.
Note. Even if the $n \times n$ matrix does not have $n$ distinct eigenvalues, it may have $n$ linearly
independent eigenvectors.
Theorem 2. If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, then there exists an
invertible matrix $M$ and a diagonal matrix $D$ such that
$M^{-1}AM = D$.
Further, the diagonal elements of $D$ are the eigenvalues of $A$ and the columns of $M$ are the
eigenvectors of $A$, with the $j$th column of $M$ being the eigenvector corresponding to the $j$th element of
the diagonal of $D$.
Conversely, if $M^{-1}AM = D$ with $D$ diagonal then the columns of $M$ are $n$ linearly independent
eigenvectors of $A$.
Definition 1. A square matrix $A$ is said to be a diagonalisable matrix if there
exists an invertible matrix $M$ and diagonal matrix $D$ such that $M^{-1}AM = D$.


Applications of eigenvalues and eigenvectors

Powers of A

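Proposition 1. Let $D$ be the diagonal matrix
$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$.
Then, for $k \ge 1$,
$D^k = \begin{pmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{pmatrix}$.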
Proposition 2. If $A$ is diagonalisable, that is, if there exists an invertible matrix $M$ and diagonal
matrix $D$ such that $M^{-1}AM = D$, then
$A^k = M D^k M^{-1}$ for integer $k \ge 1$.
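
Proposition 2 can be tried out on a small case. The sketch below is restricted to $2 \times 2$ matrices and uses an example matrix that is an assumption for illustration: $A$ has eigenvalues 3 and 1 with eigenvectors $(1, 1)$ and $(1, -1)$. The helper names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix with non-zero determinant, using exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def power_via_diagonalisation(M, eigenvalues, k):
    """Compute A^k as M D^k M^{-1}, where D = diag(eigenvalues)."""
    D_k = [[eigenvalues[0] ** k, 0], [0, eigenvalues[1] ** k]]
    return matmul(matmul(M, D_k), inverse_2x2(M))

A = [[2, 1], [1, 2]]
M = [[1, 1], [1, -1]]   # columns are eigenvectors of A
eigenvalues = [3, 1]    # in the same order as the columns of M

A_cubed = power_via_diagonalisation(M, eigenvalues, 3)
print(A_cubed == matmul(A, matmul(A, A)))
```

Only the diagonal entries are raised to the power $k$, which is why this route is attractive for large $k$.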



Solution of first-order linear differential equations

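Proposition 3. $\mathbf{y}(t) = \mathbf{v}e^{\lambda t}$ is a solution of
$\dfrac{d\mathbf{y}}{dt} = A\mathbf{y}$
if and only if $\lambda$ is an eigenvalue of $A$ and $\mathbf{v}$ is an eigenvector for the eigenvalue $\lambda$.
Proposition 4. If $\mathbf{u}_1(t)$ and $\mathbf{u}_2(t)$ are two solutions of the equation
$\dfrac{d\mathbf{y}}{dt} = A\mathbf{y}$,
then any linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$ is also a solution.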




Chapter 9


Some Preliminary Set Theory

Definition 1. A set is a collection of objects. These objects are called elements.

Definition 2.
A set $A$ is a subset of a set $B$ (written $A \subseteq B$) if and only if
each element of $A$ is also an element of $B$; that is, if $x \in A$, then $x \in B$.
The power set $\mathcal{P}(A)$ of $A$ is the set of all subsets of $A$.
The universal set $S$ is the set that denotes all objects of given interest.
The empty set $\emptyset$ (or $\{\}$) is the set with no elements.

Definition 3. A set S is countable if its elements can be listed as a sequence.

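Definition 4. For all subsets $A, B \subseteq S$, define the following set operations:

complement of $A$: $A^c = \{x \in S : x \notin A\}$
intersection of $A$ and $B$: $A \cap B = \{x \in S : x \in A \text{ and } x \in B\}$
union of $A$ and $B$: $A \cup B = \{x \in S : x \in A \text{ or } x \in B\}$
difference of $A$ and $B$: $A - B = \{x \in S : x \in A \text{ but } x \notin B\} = A \cap B^c$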

Definition 5. Sets $A$ and $B$ are disjoint (or mutually exclusive) if and only if
$A \cap B = \emptyset$.



Definition 6. Disjoint subsets $A_1, \dots, A_k$ partition a set $B$ if and only if
$A_1 \cup \cdots \cup A_k = B$.
Lemma 1. If $A_1, \dots, A_n$ partition $S$ and $B$ is a subset of $S$, then $A_1 \cap B, \dots, A_n \cap B$ partition $B$.
Definition 7. If $A$ is a set, then $|A|$ is the number of elements in $A$.

The Inclusion-Exclusion Principle.
$|A \cup B| = |A| + |B| - |A \cap B|$

Sample Space and Probability Axioms
Definition 1. A sample space of an experiment is a set of all possible outcomes.

Definition 2. A probability $P$ on a sample space $S$ is any real function on $\mathcal{P}(S)$
that satisfies the following conditions:
(a) $0 \le P(A) \le 1$ for all $A \subseteq S$;
(b) $P(\emptyset) = 0$;
(c) $P(S) = 1$;
(d) If $A$ and $B$ are disjoint, then $P(A \cup B) = P(A) + P(B)$.
Theorem 1. Let $P$ be a probability on a sample space $S$, and let $A$ be an event in $S$.
1. If $S$ is finite (or countable), then $P(A) = \sum_{a \in A} P(\{a\})$.
2. If $S$ is finite and $P(\{a\})$ is constant for all outcomes $a \in S$, then $P(A) = \dfrac{|A|}{|S|}$.
3. If $S$ is finite (or countable), then $\sum_{a \in S} P(\{a\}) = 1$.


Rules for Probabilities

Theorem 2. Let $A$ and $B$ be events of a sample space $S$.

1. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ (Addition Rule)
2. $P(A^c) = 1 - P(A)$
3. If $A \subseteq B$, then $P(A) \le P(B)$.




Conditional Probabilities

We now consider what happens if we restrict the sample space from S to some event in S.

Definition 3. The conditional probability of $A$ given $B$ is denoted and defined by
$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$, provided that $P(B) \neq 0$.

Lemma 3. For any fixed event $B$, the function $P(A|B)$ is a probability on $S$.

Multiplication Rule

$P(A \cap B) = P(A|B)P(B) = P(B|A)P(A)$

Total Probability Rule

If $A_1, \dots, A_n$ partition $S$ and $B$ is an event, then
$P(B) = \sum_{i=1}^{n} P(B|A_i)P(A_i)$.


Bayes Rule
If $A_1, \dots, A_n$ partition $S$ and $B$ is an event, then
$P(A_j|B) = \dfrac{P(B|A_j)P(A_j)}{\sum_{i=1}^{n} P(B|A_i)P(A_i)}$.
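
The Total Probability Rule and Bayes Rule combine naturally in a calculation. The sketch below uses made-up numbers (three hypothetical machines producing 50%, 30% and 20% of all items, with defect rates of 1%, 2% and 3%); the function names are also hypothetical. Exact fractions are used so that the result can be read off without rounding.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(B) = sum over i of P(B|A_i) P(A_i), for a partition A_1, ..., A_n."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes(priors, likelihoods, j):
    """P(A_j|B) by Bayes Rule (j is a 0-based index into the partition)."""
    return likelihoods[j] * priors[j] / total_probability(priors, likelihoods)

# Assumed example data: P(A_i) and P(B|A_i), where B = "item is defective".
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

print(total_probability(priors, likelihoods))
print(bayes(priors, likelihoods, 0))
```

With these numbers $P(B) = 17/1000$ and the probability that a defective item came from the first machine is $5/17$. The posterior probabilities over the whole partition sum to 1, which is a useful sanity check.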



Statistical Independence
Definition 4. Events $A$ and $B$ are (statistically) independent if and only if
$P(A \cap B) = P(A)P(B)$.

Definition 5. Events $A_1, \dots, A_n$ are mutually independent if and only if, for
any $A_{i_1}, \dots, A_{i_k}$ of these,
$P(A_{i_1} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k})$.


Random Variables
Definition 1. A random variable is a real function defined on a sample space.



Definition 2. For a random variable $X$ on some sample space $S$, define for all
subsets $A \subseteq \mathbb{R}$ and real numbers $r \in \mathbb{R}$,
$\{X \in A\} = \{s \in S : X(s) \in A\}$
$\{X = r\} = \{s \in S : X(s) = r\}$
$\{X \le r\} = \{s \in S : X(s) \le r\}$
... and so on.

Definition 3. The cumulative distribution function of a random variable $X$
is given by
$F_X(x) = P(X \le x)$ for $x \in \mathbb{R}$.

Discrete Random Variables

Definition 4. A random variable X is discrete if its image is countable.

Definition 5. The probability distribution of a discrete random variable X is

the function P (X = x) on R. We sometimes write this as pk = P (X = xk ).


The Mean and Variance of a Discrete Random Variable

Definition 6. The expected value (or mean) of a discrete random variable $X$
with probability distribution $p_k$ is
$E(X) = \sum_{\text{all } k} x_k p_k$.

The expected value $E(X)$ is often denoted by $\mu$ or $\mu_X$.

Theorem 1. Let $X$ be a discrete random variable with probability distribution $p_k = P(X = x_k)$.
Then for any real function $g(x)$, the expected value of $Y = g(X)$ is
$E(Y) = E(g(X)) = \sum_{\text{all } k} g(x_k) p_k$.

Definition 7. The variance of a discrete random variable $X$ is
$\operatorname{Var}(X) = E\big((X - E(X))^2\big)$.
The standard deviation of $X$ is $\operatorname{SD}(X) = \sqrt{\operatorname{Var}(X)}$.

The standard deviation is often denoted by $\sigma$ or $\sigma_X$, and the variance is often written as $\sigma^2$ or $\sigma_X^2$.

Theorem 2. $\operatorname{Var}(X) = E(X^2) - (E(X))^2$.

Theorem 3. If $a$ and $b$ are constants, then
$E(aX + b) = aE(X) + b$
$\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$
$\operatorname{SD}(aX + b) = |a| \operatorname{SD}(X)$.
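
The definitions and theorems above can be verified numerically on a small distribution. The following sketch uses a fair six-sided die as an assumed example; the function names `expected_value` and `variance` are hypothetical, and a distribution is represented as a dictionary mapping each value $x_k$ to its probability $p_k$.

```python
from fractions import Fraction

def expected_value(dist, g=lambda x: x):
    """E(g(X)) = sum of g(x_k) p_k over all k."""
    return sum(g(x) * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E((X - E(X))^2)."""
    mu = expected_value(dist)
    return expected_value(dist, lambda x: (x - mu) ** 2)

# Assumed example: a fair die, P(X = k) = 1/6 for k = 1, ..., 6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

# Distribution of Y = 2X + 1, to compare against Theorem 3 with a = 2, b = 1.
shifted = {2 * k + 1: p for k, p in die.items()}

print(expected_value(die), variance(die))
print(expected_value(shifted), variance(shifted))
```

For the die, $E(X) = 7/2$ and $\operatorname{Var}(X) = 35/12$, and the second line shows the mean and variance transforming as Theorem 3 states.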


Special Distributions

A Bernoulli trial is an experiment with two outcomes, often success and failure, or Y(es) and
N(o), or $\{1, 0\}$, where $P(Y)$ and $P(N)$ are denoted by $p$ and $q = 1 - p$, respectively. A Bernoulli
process is an experiment composed of a sequence of identical and mutually independent Bernoulli
trials.

The Binomial Distribution

Definition 1. The Binomial distribution $B(n, p)$ for $n \in \mathbb{N}$ is the function
$B(n, p, k) = \binom{n}{k} p^k (1 - p)^{n-k}$ where $k = 0, 1, \dots, n$.

Theorem 1. If $X$ is the random variable that counts the successes of some Bernoulli process with
$n$ trials having success probability $p$, then $X$ has the binomial distribution $B(n, p)$.
We write $X \sim B(n, p)$ to denote that $X$ is a random variable with this distribution.
Theorem 2. If $X$ is a random variable and $X \sim B(n, p)$, then
$E(X) = np$ and $\operatorname{Var}(X) = npq = np(1 - p)$.
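
As a quick check of the binomial formulas, the sketch below evaluates $B(n, p, k)$ with Python's `math.comb` and confirms the mean and variance for one assumed choice of parameters ($n = 4$, $p = 1/3$). The function name `binomial_pmf` is hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p, k):
    """B(n, p, k) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, Fraction(1, 3)  # assumed example parameters
probabilities = [binomial_pmf(n, p, k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(probabilities))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probabilities))
print(sum(probabilities), mean, var)
```

The probabilities sum to 1, and the computed mean and variance agree with $np = 4/3$ and $np(1 - p) = 8/9$.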


Geometric Distribution
Definition 2. The Geometric distribution $G(p)$ is the function
$G(p, k) = (1 - p)^{k-1} p = q^{k-1} p$ where $k = 1, 2, \dots$.

Theorem 3. Consider an infinite Bernoulli process of trials each of which has success probability $p$.
If the random variable $X$ is the number of trials conducted until success occurs for the first time,
then $X$ has the geometric distribution $G(p)$.
Theorem 4. If $X \sim G(p)$ and $n$ is a positive integer, then $P(X > n) = (1 - p)^n = q^n$.
Theorem 6. If $X$ is a random variable and $X \sim G(p)$, then
$E(X) = \dfrac{1}{p}$ and $\operatorname{Var}(X) = \dfrac{1 - p}{p^2}$.




Sign Tests

Often, we have a sample of data consisting of independent observations of some quantity of interest,
and it might be of interest to see whether the observed values differ systematically from some fixed
and pre-determined value.
To answer this question, one may use a sign test approach as follows:
1. Count the number of observations that are strictly greater than the target value (+).
2. Count the total number of observations that are either strictly greater (+) or strictly smaller
(−) than the target value.
3. Calculate the tail probability that measures how often one would expect to observe at least as many
increases (+) as were observed, if there were equal probability of + and −.
In this course, we will say that if the tail probability is less than 5% then we will regard this as
significant.
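
The three steps can be sketched as a short calculation. This is an illustrative sketch that assumes a one-sided (upper) tail, so under equal probability of + and − the number of + signs follows $B(n, \frac{1}{2})$ with $n$ the count from step 2. The function name `sign_test_tail` and the example counts (9 increases out of 10 non-tied observations) are made up for illustration.

```python
from math import comb

def sign_test_tail(plus, total):
    """P(X >= plus) for X ~ B(total, 1/2): the upper tail probability."""
    return sum(comb(total, k) for k in range(plus, total + 1)) / 2 ** total

# Assumed example: 9 observations above the target out of 10 that differ from it.
tail = sign_test_tail(9, 10)
print(tail, tail < 0.05)
```

Here the tail probability is $11/1024$, roughly 1.1%, which is below the 5% threshold used in the text.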


Continuous random variables

Definition 1. A random variable $X$ is continuous if and only if $F_X(x)$ is continuous.

Strictly speaking, FX (x) must actually be piecewise differentiable, which means that FX (x) is
differentiable except for at most countably many points. However, the above definition is good
enough for our present purposes.
Definition 2. The probability density function $f(x)$ of a continuous random
variable $X$ is defined by
$f(x) = f_X(x) = \dfrac{d}{dx}F(x)$, $x \in \mathbb{R}$,
if $F(x)$ is differentiable, and by $\lim_{x \to a^-} \dfrac{d}{dx}F(x)$ if $F(x)$ is not differentiable at $x = a$.
Theorem 1. $F(x) = \int_{-\infty}^{x} f(t)\,dt$.


The mean and variance of a continuous random variable

Definition 3. The expected value (or mean) of a continuous random variable $X$
with probability density function $f(x)$ is defined to be
$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$.

Theorem 2. If $X$ is a continuous random variable with density function $f(x)$, and $g(x)$ is a real
function, then the expected value of $Y = g(X)$ is
$E(Y) = E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$.



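Definition 4. The variance of a continuous random variable $X$ is
$\operatorname{Var}(X) = E\big((X - E(X))^2\big) = E(X^2) - (E(X))^2$.
The standard deviation of $X$ is $\sigma = \operatorname{SD}(X) = \sqrt{\operatorname{Var}(X)}$.
Theorem 3. If $a$ and $b$ are constants, then
$E(aX + b) = aE(X) + b$
$\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$
$\operatorname{SD}(aX + b) = |a| \operatorname{SD}(X)$.
Theorem 4. If $E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$, and $Z = \dfrac{X - \mu}{\sigma}$, then $E(Z) = 0$ and $\operatorname{Var}(Z) = 1$.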

Special Continuous Distributions

The Normal Distribution
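Definition 1. A continuous random variable $X$ has normal distribution $N(\mu, \sigma^2)$
if it has probability density
$\phi(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$ where $-\infty < x < \infty$.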

Theorem 1. If $X$ is a continuous random variable and $X \sim N(\mu, \sigma^2)$, then
$E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.

The normal distribution is used, among other things, to approximate the binomial distribution
$B(n, p)$ when $n$ grows large.

Theorem 2. If $X \sim N(\mu, \sigma^2)$, then $\dfrac{X - \mu}{\sigma} \sim N(0, 1)$.

