Recursive Least Squares 2
• We’ve said that the optimal estimate, x̂, is the one that minimizes this ‘loss’:

x̂_LS = argminₓ ℒ_LS(x) = argminₓ (e₁² + e₂² + … + eₘ²)
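As a minimal sketch of this loss, with hypothetical measurement values and errors defined as eᵢ = yᵢ − x for a scalar x:

```python
import numpy as np

# Hypothetical repeated measurements of a single scalar quantity x,
# with errors e_i = y_i - x.
y = np.array([1.2, 0.8, 1.1, 0.9])

def ls_loss(x):
    """Least squares loss: e_1^2 + e_2^2 + ... + e_m^2."""
    return np.sum((y - x) ** 2)

# For a scalar x, the minimizer of this loss is the sample mean.
x_ls = np.mean(y)
```

Nudging x away from the sample mean in either direction only increases the loss, which is one quick way to check that the mean is the minimizer.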
• We can ask which x makes our measurement most likely. Or, in other words, which x maximizes the conditional probability of y:

x̂ = argmaxₓ p(y | x)

[Figure: candidate likelihood curves p(y | xa), p(y | xb), p(y | xc), p(y | xd) plotted against y, with the measured value y_meas marked. Which x is the most likely given the measurement?]
Measurement Model

• Then:

p(y | x) = 𝒩(x, σ²)

[Figure: the Gaussian density p(y | x), centered at y = x.]
Least Squares and Maximum Likelihood
• The probability density function of a Gaussian is:

𝒩(z; μ, σ²) = (1 / (σ√(2π))) exp(−(z − μ)² / (2σ²))
• Our conditional measurement likelihood is

p(y | x) = 𝒩(y; x, σ²) = (1 / √(2πσ²)) exp(−(y − x)² / (2σ²))

x̂_MLE = argmaxₓ p(y | x)
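A minimal numeric sketch of this likelihood, using a hypothetical measurement value and an assumed noise level, and maximizing over candidate values of x by a simple scan:

```python
import numpy as np

def gaussian_pdf(z, mu, sigma):
    """Gaussian density N(z; mu, sigma^2)."""
    return np.exp(-(z - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

y_meas = 2.3   # hypothetical single measurement
sigma = 0.5    # assumed measurement noise standard deviation

# Scan candidate values of x and keep the one with the highest likelihood.
candidates = np.linspace(0.0, 5.0, 501)
likelihoods = gaussian_pdf(y_meas, candidates, sigma)
x_mle = candidates[np.argmax(likelihoods)]
# With a single measurement, the likelihood peaks at x = y.
```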
• Instead of trying to optimize the likelihood directly, we can take its logarithm (the logarithm is monotonically increasing, so it preserves the maximizer):

x̂_MLE = argmaxₓ p(y | x) = argmaxₓ log p(y | x)

• Resulting in:

log p(y | x) = −(1 / (2σ²)) ((y₁ − x)² + … + (yₘ − x)²) + C

where we’ve combined several terms into a constant C that does not depend on x; we can ignore it for the purposes of maximization.
• Since maximizing the likelihood is the same as minimizing its negative logarithm:

x̂_MLE = argminₓ (−log p(y | x)) = argminₓ (1 / (2σ²)) ((y₁ − x)² + … + (yₘ − x)²)
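A small sketch of this equivalence, with hypothetical measurements and an assumed shared noise level: minimizing the negative log-likelihood (with the constant C dropped) recovers the same estimate as least squares, namely the sample mean.

```python
import numpy as np

# Hypothetical repeated measurements of the same scalar x, equal noise variance.
y = np.array([1.2, 0.8, 1.1, 0.9])
sigma = 0.1

def neg_log_likelihood(x):
    """Negative log-likelihood, up to the constant C that does not depend on x."""
    return np.sum((y - x) ** 2) / (2 * sigma ** 2)

# Minimize by grid search; the minimizer matches the sample mean.
grid = np.linspace(0.0, 2.0, 2001)
x_mle = grid[np.argmin([neg_log_likelihood(x) for x in grid])]
```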
• So, when every measurement shares the same variance σ²:

x̂_MLE = argminₓ (1 / (2σ²)) ((y₁ − x)² + … + (yₘ − x)²)

and when each measurement has its own variance σᵢ²:

x̂_MLE = argminₓ (1/2) ((y₁ − x)²/σ₁² + … + (yₘ − x)²/σₘ²)

• In both cases,

x̂_MLE = argminₓ ℒ_LS(x) = argminₓ ℒ_MLE(x) = x̂_LS
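The weighted case can be sketched as follows. The measurement values and per-measurement variances here are hypothetical; for a scalar x the weighted loss has the closed-form minimizer x̂ = Σ(yᵢ/σᵢ²) / Σ(1/σᵢ²), the inverse-variance weighted mean.

```python
import numpy as np

# Hypothetical measurements with differing (assumed) noise variances.
y = np.array([1068.0, 988.0, 1002.0, 996.0])
var = np.array([4.0, 4.0, 1.0, 1.0])   # sigma_i^2 for each measurement

def mle_loss(x):
    """Weighted sum of squared errors, each term scaled by 1/sigma_i^2."""
    return 0.5 * np.sum((y - x) ** 2 / var)

# Inverse-variance weighted mean: the minimizer for a scalar x.
x_wls = np.sum(y / var) / np.sum(1.0 / var)
```

Note how the low-variance measurements (σᵢ² = 1) pull the estimate toward themselves more strongly than the noisier ones.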
The Central Limit Theorem
• In realistic systems like self-driving cars, there are many sources of ‘noise’.
Central Limit Theorem: When independent random variables are added, their
normalized sum tends towards a normal distribution.
Measurement    Without outlier    With outlier
1              1068               1068
2              988                988
3              1002               1002
4              996                996
5 (outlier)    —                  1430
x̂              1013.5             1096.8
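The estimates in the table above are just the sample means of the two measurement sets, which makes the outlier’s pull on the least squares estimate easy to reproduce:

```python
import numpy as np

# Measurements from the example above; the fifth is an outlier.
y = np.array([1068.0, 988.0, 1002.0, 996.0])
y_outlier = np.append(y, 1430.0)

# The least squares estimate of a scalar is the sample mean, so a single
# outlier shifts the estimate substantially.
x_hat = np.mean(y)                   # 1013.5
x_hat_outlier = np.mean(y_outlier)   # 1096.8
```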
Summary | Least Squares and Maximum Likelihood
• LS and WLS produce the same estimates as maximum likelihood, assuming Gaussian noise.
• The Central Limit Theorem states that complex errors will tend towards a Gaussian distribution.
• Least squares estimates are significantly affected by outliers.