Stat 513, Lecture 12
And not just in R
Python 3.7.2 (default, Feb 12 2019, 08:15:36)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information
>>> -0.035+0.025+0.01
-1.734723475976807e-18
>>> -0.035+0.01+0.025
0.0
>>> sum(-0.035,0.025, 0.01)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sum expected at most 2 arguments, got 3
>>> sum([-0.035, 0.025, 0.01])
-1.734723475976807e-18
>>> sum([-0.035, 0.01, 0.025])
0.0
>>>
>>> mean([-0.035,0.01, 0.025])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'mean' is not defined
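In Python, mean lives in the standard statistics module, which accumulates in exact rational arithmetic before dividing. A quick sketch (my own, not from the slides):

```python
from statistics import mean

# statistics.mean sums via exact fractions internally, so the result is
# the correctly rounded mean of the three stored doubles
m = mean([-0.035, 0.01, 0.025])
print(m)                 # extremely close to zero (the decimals are not
                         # exactly representable, so not necessarily 0.0)
print(abs(m) < 1e-17)    # True
```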
Numerics can be treacherous
Time series: let Wt be a “white noise” with standard normal
distribution: the Wt are uncorrelated (and, being jointly normal,
independent) random variables with mean 0 and variance 1. An AR(1)
process Yt = ϕYt−1 + Wt is “stationary” (let us say: stable), if |ϕ| < 1.
> tser=rep(0,100)
> for (k in 2:100) tser[k] = (1/2)*tser[k-1]+rnorm(1) ## phi=1/2
> plot.ts(tser)
[Plot: plot.ts(tser), values fluctuating roughly between -3 and 2 over Time = 0..100]
On the other hand
On the other hand, the AR(1) process with ϕ = 2 is “explosive”
> tser=rep(0,100)
> for (k in 2:100) tser[k] = 2*tser[k-1]+rnorm(1) ## phi=2
> plot.ts(tser)
[Plot: plot.ts(tser), exploding to about 4e+29 by time 100]
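The same experiment can be sketched in Python with only the standard library (seed, length, and function name are my choices, not from the slides):

```python
import random

def ar1(phi, n, seed=1):
    """Simulate Y_t = phi*Y_{t-1} + W_t with standard normal W_t, Y_1 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y

stable = ar1(0.5, 100)     # |phi| < 1: fluctuates around 0
explosive = ar1(2.0, 100)  # |phi| > 1: magnitude roughly doubles each step

print(max(abs(v) for v in stable))  # stays within a few units
print(abs(explosive[-1]))           # astronomically large
```

Both runs use the same seed, so the two trajectories are driven by the same innovations; only ϕ differs.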
Really?
[Plot: plot.ts(tss), values fluctuating roughly between -2 and 1]
So: when I start with Y1 as above and set Yt = 2Yt−1 + Wt, with the
same Wt I used above, I should get the same thing, right?
> tser=rep(0,100)
> tser[1]=tss[1]
> for (k in 2:100) tser[k] = 2*tser[k-1]+inno[k] ## inno holds the Wt from above
> plot.ts(tser)
How come???
[Plot: plot.ts(tser), exploding to about 2e+13 by time 100]
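What is going on: tss was presumably built by running the recursion backward, Yt−1 = (Yt − Wt)/2, which is stable in that direction (errors get halved), while the forward recursion Yt = 2Yt−1 + Wt doubles the rounding error of every step. A Python sketch of this mechanism (the generating code is not on these slides; the names tser/tss/inno mirror the slides, the seed is my own):

```python
import random

n = 100
rng = random.Random(2)
inno = [rng.gauss(0, 1) for _ in range(n)]   # the innovations W_t

# backward pass: Y_{t-1} = (Y_t - W_t) / 2, starting from Y_n = 0;
# rounding errors are *halved* at each step, so this is stable
tss = [0.0] * n
for k in range(n - 1, 0, -1):
    tss[k - 1] = (tss[k] - inno[k]) / 2

# forward pass with the very same innovations and the same start:
# Y_t = 2*Y_{t-1} + W_t *doubles* any rounding error at each step
tser = [0.0] * n
tser[0] = tss[0]
for k in range(1, n):
    tser[k] = 2 * tser[k - 1] + inno[k]

print(abs(tser[30] - tss[30]))   # still tiny: error ~ 2^30 * machine eps
print(abs(tser[-1] - tss[-1]))   # huge: error ~ 2^99 * machine eps
```

In exact arithmetic the two series would be identical; in doubles the forward run drifts off around step 50, exactly as the [1:50]/[1:57]/[1:60] plots below show.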
Hm... [1:50]
forward: [plot of tser[1:50], values roughly between -2 and 1]
backward: [plot of tss[1:50], values roughly between -2 and 1]
A bit more... [1:57]
forward: [plot of tser[1:57], values roughly between -2 and 2]
backward: [plot of tss[1:57], values roughly between -2 and 1]
And yet a bit more... [1:60]
forward: [plot of tser[1:60], values growing to about 20]
backward: [plot of tss[1:60], values roughly between -2 and 1]
A tale of expert code I: floating-point arithmetic
Floating-point arithmetic: numbers are represented as
mantissa ∗ 2^exponent, with finitely many digits for the mantissa - which
has inevitable consequences
> 0.000001*1000000
[1] 1
> x=0; for (k in (1:1000000)) x=x+0.000001
> x
[1] 1
> x-1
[1] 7.918111e-12
A better algorithm: add the small numbers first
> x=0; for (k in (1:1000000)) x=x+0.000001; x=x+1000000
> x
[1] 1000001
> x-1000000
[1] 1
> x-1000001
[1] 0
Yeah, but what to do in general? The solution seems to be: use
addition programmed by experts
> sum
function (..., na.rm = FALSE) .Primitive("sum")
> x=sum(c(1000000,rep(0.000001,1000000)))
> x
[1] 1000001
> x-1000000
[1] 1
> x-1000001
[1] -2.561137e-09
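One classic expert technique is compensated (Kahan) summation, which carries the rounding error of each addition along in a separate variable. A minimal Python sketch (an illustration of the idea, not necessarily what R's sum does internally):

```python
def kahan_sum(xs):
    """Compensated summation: recover the low-order bits lost in each add."""
    s = 0.0
    c = 0.0               # running compensation for lost low-order bits
    for x in xs:
        y = x - c         # subtract the error left over from the last step
        t = s + y         # big + small: the low bits of y are lost here...
        c = (t - s) - y   # ...and recovered algebraically here
        s = t
    return s

naive = 0.0
for _ in range(1000000):
    naive += 0.000001

expert = kahan_sum([0.000001] * 1000000)
print(naive - 1)    # around 1e-12: accumulated rounding error
print(expert - 1)   # essentially zero
```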
Vectorization alone does not do it
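The same point in Python: writing the reduction as a one-liner does not change the arithmetic, it is still the same left-to-right additions. A quick check (functools.reduce is used instead of the built-in sum, whose float behaviour has changed across CPython versions):

```python
from functools import reduce
from operator import add

# one million small addends after one big one, as on the previous slide
xs = [1000000.0] + [0.000001] * 1000000

# a "vectorized" one-liner, but reduce performs exactly the same
# left-to-right additions as an explicit loop would
total = reduce(add, xs)
print(total - 1000001)   # not 0: each 0.000001 is rounded to the coarse
                         # grid of representable numbers near 1000000
```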
A tale of expert code II: never invert a matrix...
The theory for a linear model y ∼ Xβ suggests that you obtain the
least squares estimates via the formula
b = (XᵀX)⁻¹Xᵀy
However, in computing you are never ever (well, every rule has an
exception, but still) supposed to do
b <- solve(t(X) %*% X) %*% t(X) %*% y
Doing alternatively
b <- solve(crossprod(X)) %*% crossprod(X, y)
does not really save it: the matrix is still explicitly inverted
... but rather solve (a system of) equations
To this end,
b <- solve(crossprod(X), crossprod(X, y))
may work pretty well; but experts know that the best way is via a so-
called QR decomposition (MATLAB “backslash” operator), which
in R amounts to
b <- qr.solve(X, y)
This is correct - but many people do not need to know that much;
unless they are in certain special situations, they may just do
b <- coef(lm(y ~ X-1))
and it amounts to the same thing!
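The flavor of the difference can be seen even in pure Python. Below is a sketch (my own toy example, not from the slides) of two-parameter least squares via modified Gram-Schmidt orthogonalization, the simplest relative of the QR decomposition:

```python
import math

def lstsq_qr2(X, y):
    """Least squares for an n x 2 design via modified Gram-Schmidt QR."""
    x1 = [row[0] for row in X]
    x2 = [row[1] for row in X]
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    r11 = math.sqrt(dot(x1, x1))
    q1 = [v / r11 for v in x1]
    r12 = dot(q1, x2)
    u = [b - r12 * a for a, b in zip(q1, x2)]   # x2 minus its projection on q1
    r22 = math.sqrt(dot(u, u))
    q2 = [v / r22 for v in u]
    z1 = dot(q1, y)
    y1 = [b - z1 * a for a, b in zip(q1, y)]    # MGS: project y off q1 first
    z2 = dot(q2, y1)
    b2 = z2 / r22                               # back substitution in R b = Q'y
    b1 = (z1 - r12 * b2) / r11
    return [b1, b2]

# a nearly collinear design: the second column differs from the first only
# in the 8th digit, so crossprod(X) is brutally ill-conditioned
d = 1e-8
X = [[1.0, 1.0], [1.0, 1.0 + d], [1.0, 1.0 + 2 * d]]
y = [2.0 + 3.0 * row[1] for row in X]   # true coefficients: (2, 3)
print(lstsq_qr2(X, y))                  # close to [2, 3]
```

Forming crossprod(X) here would square an already large condition number, pushing it toward the limit of double precision; that is exactly why the normal-equations route degrades first.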
Showing the difference is, however, a bit intricate...
So...
First, let us try this:
> sum((x1-x)^2)
[1] 9.795661e-10
> sum((x2-x)^2)
[1] 8.119665e-10
> sum((x3-x)^2)
[1] 7.313153e-22
This is only mildly convincing (and in fact, it may even be the other
way round in some versions)
But this one seems to stay:
> sum((bb - AA %*% x1)^2)
[1] 2.482263e-13
> sum((bb - AA %*% x2)^2)
[1] 3.111039e-20
> sum((bb - AA %*% x3)^2)
[1] 1.84273e-29
> sum((bb - AA %*% x)^2)
[1] 0
Vector and matrix algebra
Type conversions
General format: as.type
> qr.solve(X, y)
x
20733.83 -20728.85
> as.vector(qr.solve(X, y))
[1] 20733.83 -20728.85
> as.vector(coef(lm(y~X-1)))
[1] 20733.83 -20728.85
> as.vector(solve(crossprod(X), crossprod(X, y)))
[1] 20737.19 -20732.21
> as.vector(solve(t(X) %*% X) %*% t(X) %*% y)
[1] 20737.20 -20732.22
Note: in R, vectors are interpreted neither rowwise nor columnwise, but
in an “ambiguous manner”: whichever way lets a multiplication succeed.
In other words, the same square matrix can be multiplied by the same
vector from both sides: X %*% a or a %*% X - which usually creates
no problem, until we reach an expression like a %*% a, which is always
a number, aᵀa for column vectors. If we want to obtain aaᵀ, a matrix,
we need to write a %*% t(a)
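R's dimension guessing has no analogue in plain Python lists, where the two products must be written out explicitly; a small sketch of the distinction:

```python
a = [1.0, 2.0, 3.0]

# t(a) %*% a : the inner product, a single number
inner = sum(x * y for x, y in zip(a, a))

# a %*% t(a) : the outer product, an n x n matrix
outer = [[x * y for y in a] for x in a]

print(inner)   # 14.0
print(outer)   # [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
```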
Potpourri
> numeric(4)
[1] 0 0 0 0
> rep(0,4)
[1] 0 0 0 0
> rep(c(0,1),4)
[1] 0 1 0 1 0 1 0 1
> rep(c(0,1),c(3,2))
[1] 0 0 0 1 1
> X=matrix(0,nrow=2,ncol=2)
> X=matrix(1:4,nrow=2,ncol=2)
> X
[,1] [,2]
[1,] 1 3
[2,] 2 4
> as.vector(X)
[1] 1 2 3 4
> as.matrix(1:4)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
Finally, reminder
Some reminders from linear algebra
Useful formulae: (AB)ᵀ = BᵀAᵀ; det(AB) = det(A) det(B); det(Aᵀ) = det(A)
Useful definitions: we say that a matrix A is
nonnegative definite (or positive semidefinite): xᵀAx ≥ 0 for every x
positive definite: xᵀAx > 0 for every x ≠ 0
The definitions imply that A is a square matrix; some automatically
require that it is also symmetric, so better check (in statistics, the
definitions are almost always applied to symmetric matrices)
Useful habit in theory (albeit not observed by R in practice): consider
vectors as n × 1 columns (in statistics, it is always like this)
Useful caution: if a is an n × 1 vector, then aᵀa is a number (which
we did denote by ‖a‖₂²), but aaᵀ is an n × n matrix. In general,
matrix multiplication is not commutative: AB is in general different
from BA
Useful principle: block matrices are multiplied in the same way as usual
matrices, only the blocks are themselves matrices, thus multiplied as
such; hence the dimensions must match
Useful practice: check dimensions
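The block multiplication principle can be checked mechanically; a small sketch (pure Python, my own example) multiplying a 4 × 4 matrix directly and via 2 × 2 blocks:

```python
import random

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    """Elementwise matrix sum."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def block(M, i, j):
    """The 2 x 2 block in block-position (i, j) of a 4 x 4 matrix."""
    return [row[2*j:2*j+2] for row in M[2*i:2*i+2]]

rng = random.Random(0)
A = [[rng.randint(-5, 5) for _ in range(4)] for _ in range(4)]
B = [[rng.randint(-5, 5) for _ in range(4)] for _ in range(4)]

direct = matmul(A, B)

# blockwise: (AB)_{ij} = A_{i1} B_{1j} + A_{i2} B_{2j},
# with the blocks themselves multiplied as matrices
for i in range(2):
    for j in range(2):
        blk = madd(matmul(block(A, i, 0), block(B, 0, j)),
                   matmul(block(A, i, 1), block(B, 1, j)))
        assert blk == block(direct, i, j)
print("blockwise product matches the direct product")
```

Integer entries make the comparison exact, so the match is bit-for-bit.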
Appendix: some Python again
Adding again
Python 3.7.2 (default, Feb 12 2019, 08:15:36)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information
>>> 0.000001*1000000
1.0
>>> x=0
>>> for k in range(1000000): x=x+0.000001
>>> x
1.000000000007918
>>> x-1
7.918110611626616e-12
>>> x=1000000
>>> for k in range(1000000): x=x+0.000001
>>> x
1000001.0000076145
>>> x-1000000
1.00000761449337
>>> x-1000001
7.614493370056152e-06
Elementary arithmetic: also no problem
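Python's integers have arbitrary precision, so elementary integer arithmetic is exact at any size; a minimal illustration (my own, the slide's examples are lost in this extract):

```python
# integers are exact at any size: no overflow, no rounding
print(2 ** 100)                  # 1267650600228229401496703205376
print(2 ** 100 + 1 - 2 ** 100)   # exactly 1, unlike the float version:
print(1e30 + 1 - 1e30)           # 0.0: the 1 falls below 1e30's precision
```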
Now, the code of the experts
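In Python's standard library, the addition programmed by experts is math.fsum, which tracks all partial sums exactly and rounds only once at the end (whether this is what the slide showed, this extract does not say):

```python
import math

naive = 0.0
for _ in range(1000000):
    naive += 0.000001

print(naive)                             # 1.000000000007918
print(math.fsum([0.000001] * 1000000))  # correctly rounded exact sum,
                                        # off by at most half an ulp
```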