ECE 368 Course Review: Probabilistic Reasoning 2023
1. What is Probabilistic Reasoning?
• Why? Decision-Making with Uncertain Information
• Events not directly observable, Measurement errors, …
• Analytics Pipelines: Observe, Analyze, Decide, Act
• Probabilistic Inference: Classification, Regression, and Learning
[Diagram: the analytics pipeline, Events → Observe → Analyze → Decide → Action, with observed data serving as evidence]
$$P(H_i \mid D) = \frac{P(D \mid H_i)\,P(H_i)}{P(D)} \;\propto\; P(D \mid H_i)\,P(H_i)$$
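As a quick illustration (not from the slides), here is a minimal Python sketch of this posterior update over a discrete set of hypotheses; the priors and likelihoods are made-up numbers:

```python
import numpy as np

# Hypothetical priors P(H_i) and likelihoods P(D | H_i) for three hypotheses.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.10, 0.40, 0.25])

# Bayes rule: posterior is proportional to likelihood * prior;
# dividing by P(D) = sum_i P(D | H_i) P(H_i) normalizes it.
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()
print(posterior)  # approximately [0.227, 0.545, 0.227]
```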
Frequentist
• A priori probability not used
• Hypotheses not usually the result of an experiment
• Objective assessment of evidence
• Confidence intervals & p-values
• A posteriori probability not used
• Less computationally intensive

Bayesian
• A priori probability over hypotheses
• Must know or construct a "subjective" prior
• Can explore different priors
• Computationally intensive
• Learns as data accumulates
• A posteriori probability enables decisions
Learning Objectives
1. Joint distributions, marginals, conditionals, and Bayes rule
2. Vector-based probabilistic models, e.g., jointly Gaussian vectors, binomials, multinomials, conjugate priors
3. Hypothesis testing: Naïve Bayes, Gaussian discriminants, the likelihood ratio test, Bayesian testing, Type I/II errors, cost functions
4. Estimation: likelihood functions; linear regression, Bayesian, and LMS estimators
5. Graphical models and message-passing inference
6. Hidden Markov Models (HMMs), the forward-backward algorithm, and the Viterbi algorithm
Maximum Likelihood & Bayesian Parameter Estimation
• Example: Bernoulli RV
Properties of Estimators
• Estimation error and Bias
Maximum Likelihood Estimation
• Likelihood
Log Likelihood Function
• Log likelihood
MLE Bernoulli RV
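The standard result this slide works through, reproduced for reference (notation mine): for i.i.d. Bernoulli($\theta$) samples $x_1,\dots,x_n$ with $k = \sum_i x_i$ successes,

$$\log L(\theta) = k\log\theta + (n-k)\log(1-\theta), \qquad \frac{d}{d\theta}\log L(\theta) = \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0 \;\Rightarrow\; \hat{\theta}_{\mathrm{ML}} = \frac{k}{n}.$$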
Laplace: Will the Sun Rise Tomorrow?
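The slide's question is answered by Laplace's rule of succession: with a uniform Beta(1, 1) prior on the probability of sunrise and n sunrises observed in n days, the posterior predictive probability of one more sunrise is

$$P(\text{sunrise tomorrow} \mid n \text{ sunrises in } n \text{ days}) = \frac{n+1}{n+2}.$$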
Frequentist vs. Bayesian
Estimation Using Conditional Expectation
Bayes Inference
• Prior distribution on θ
• Conditional distribution of the data given θ
Maximum A Posteriori Probability Rule
MAP Estimate for Binomial with Beta Prior
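The closed form behind this slide title, stated for reference (standard result; notation mine): with a Beta(a, b) prior and k successes in n Bernoulli trials, the posterior is Beta(k + a, n − k + b), whose mode is

$$\hat{\theta}_{\mathrm{MAP}} = \frac{k + a - 1}{n + a + b - 2}.$$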
MAP Rule for Prediction
Maximum Likelihood & Bayesian Parameter Estimation
More on ML, MAP, LMS Estimators
• Comparison of ML, MAP, LMS & Conditional Expectation
• Poisson RV with Gamma Prior
• Gaussian RV with Gaussian Prior
• Multinomial RV with Dirichlet Prior
Frequentist and Bayesian Inference
• Frequentist
• Bayesian
LMS & Conditional Expectation
Important Conjugate Priors
Sample Variance
MAP for Gaussian RV with Gaussian Prior
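The familiar closed form, stated here for reference (notation mine): with prior $\theta \sim N(\mu_0, \sigma_0^2)$ and i.i.d. observations $x_i \sim N(\theta, \sigma^2)$, the posterior is Gaussian, so the MAP, LMS, and posterior-mean estimates coincide:

$$\hat{\theta} = \frac{\sigma_0^2 \sum_{i=1}^n x_i + \sigma^2 \mu_0}{n\sigma_0^2 + \sigma^2}, \qquad \operatorname{VAR}(\theta \mid x_1,\dots,x_n) = \frac{\sigma_0^2\,\sigma^2}{n\sigma_0^2 + \sigma^2}.$$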
ML Estimator for Multinomial RV
MAP for Multinomial RV with Dirichlet Prior
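For reference, the standard closed forms behind these two slides (notation mine): with counts $n_1,\dots,n_K$ over K categories, $n = \sum_j n_j$, and a Dirichlet($\alpha_1,\dots,\alpha_K$) prior,

$$\hat{\theta}_{j,\mathrm{ML}} = \frac{n_j}{n}, \qquad \hat{\theta}_{j,\mathrm{MAP}} = \frac{n_j + \alpha_j - 1}{n + \sum_k \alpha_k - K}.$$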
Estimation of Gaussian Vectors
• Gaussian Vector Estimation Problems
• Conditional Gaussian Distributions
• Marginal Gaussian Distributions
• Gaussian Systems
• ML Estimation
• MAP Estimation
$$E[X \mid Y = y] = \rho_{X,Y}\,\frac{\sigma_1}{\sigma_2}(y - \mu_2) + \mu_1, \qquad \operatorname{VAR}[X \mid Y] = (1 - \rho_{X,Y}^2)\,\sigma_1^2$$
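A small numeric sanity check of these formulas (the means, standard deviations, and correlation below are made up):

```python
import numpy as np

# Hypothetical parameters of a jointly Gaussian pair (X, Y).
mu1, mu2 = 1.0, -2.0       # means of X and Y
sigma1, sigma2 = 2.0, 3.0  # standard deviations of X and Y
rho = 0.8                  # correlation coefficient rho_{X,Y}

y = 0.5  # observed value of Y
cond_mean = rho * (sigma1 / sigma2) * (y - mu2) + mu1
cond_var = (1 - rho**2) * sigma1**2
print(cond_mean, cond_var)  # about 2.333 and 1.44
```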
The ML decision rule compares the likelihood ratio L(x) to 1; other decision rules result from comparing L(x) to other thresholds. The corresponding log-likelihood ratio test compares log L(x) to the log of the original threshold. Varying the threshold thus yields a whole class of decision rules.
The ML rule picks the threshold where the two pdfs are equal. As the threshold γ approaches infinity, α approaches 0 and β approaches 1; as γ approaches zero, α goes to 1 and β to 0.
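In symbols, with α the type I error and β the type II error (threshold convention assumed, matching the description above):

$$L(x) = \frac{f(x \mid H_1)}{f(x \mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma, \qquad \alpha = P(\text{choose } H_1 \mid H_0 \text{ true}), \qquad \beta = P(\text{choose } H_0 \mid H_1 \text{ true}).$$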
Neyman Pearson Lemma
Explanation of the derivation:
• First assume there is a rule that achieves type I error α.
• Next consider minimizing the type II error, given that the rule attains type I error α.
• This involves minimizing the type II error subject to a constraint on the type I error, leading to the Lagrangian expression.
• The expression is minimized by assigning to the acceptance region all values of x for which the integrand on the previous page is negative; note that this implies a likelihood ratio test with threshold λ (see the worked form after this list).
• Finally, pick λ so that the type I error constraint is met.
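A compact version of that argument, with A denoting the acceptance region for H₀ (notation reconstructed from the bullets above):

$$\beta + \lambda(\alpha - \alpha_0) = \int_A f_1(x)\,dx + \lambda\Big(1 - \int_A f_0(x)\,dx - \alpha_0\Big) = \lambda(1 - \alpha_0) + \int_A \big(f_1(x) - \lambda f_0(x)\big)\,dx,$$

which is minimized by putting x in A exactly when $f_1(x) - \lambda f_0(x) < 0$, i.e., when $L(x) = f_1(x)/f_0(x) < \lambda$.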
Bayesian Hypothesis Testing
Bayesian binary hypothesis tests can be designed to minimize the average cost of the decision rule.
• Example 1: the cost could be the probability of error, i.e., the prior-weighted sum of the type I and type II errors.
• General case: reward correct decisions (costs C00, C11) and penalize errors (costs C01, C10).
• Both of these are solved by likelihood ratio tests (see the next two charts and the threshold formula below).
Minimum Cost Decisions
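For reference, the standard minimum-cost threshold, under the convention that $C_{ij}$ is the cost of deciding $H_i$ when $H_j$ is true (the slides' index convention may differ): decide $H_1$ when

$$L(x) = \frac{f(x \mid H_1)}{f(x \mid H_0)} \;\geq\; \frac{(C_{10} - C_{00})\,P(H_0)}{(C_{01} - C_{11})\,P(H_1)}.$$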
Markov Chains
$$P[X_n = i_n, \ldots, X_0 = i_0] = p_{i_{n-1}, i_n} \cdots p_{i_0, i_1}\, p_{i_0}(0)$$
Transition Probability Matrix
• Xn is completely specified by the initial pmf $p_i(0)$ and the matrix of one-step transition probabilities P, the transition probability matrix:

$$P = \begin{bmatrix}
p_{00} & p_{01} & p_{02} & \cdots \\
p_{10} & p_{11} & p_{12} & \cdots \\
\vdots & \vdots & \vdots & \\
p_{i0} & p_{i1} & \cdots & \\
\vdots & \vdots & &
\end{bmatrix}$$

• Note that each row of P must add to 1, since

$$1 = \sum_j P[X_{n+1} = j \mid X_n = i] = \sum_j p_{ij}$$
n-Step Transition Probabilities
$$P(2) = P(1)P(1) = P^2, \qquad \mathbf{p}(n) = \mathbf{p}(0)\,P^n$$
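A minimal numpy sketch of this relation (the two-state chain below is made up):

```python
import numpy as np

# Hypothetical two-state transition matrix; each row sums to 1.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
p0 = np.array([1.0, 0.0])  # initial pmf p(0): start in state 0

# n-step transition probabilities and the state pmf at time n: p(n) = p(0) P^n
n = 10
Pn = np.linalg.matrix_power(P, n)
pn = p0 @ Pn
print(Pn)
print(pn)  # approaches the steady-state pmf as n grows
```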
Steady-State Probabilities
$$\pi_j = \sum_i p_{ij}\,\pi_i \quad (\text{i.e., } \boldsymbol{\pi} = \boldsymbol{\pi}P), \qquad \sum_i \pi_i = 1$$
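One common way to solve $\boldsymbol{\pi} = \boldsymbol{\pi}P$ numerically, shown for the same hypothetical chain as above:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# pi = pi P means pi is a left eigenvector of P with eigenvalue 1.
# Stack (P^T - I) pi = 0 with the normalization sum(pi) = 1 and solve
# the overdetermined system by least squares.
n_states = P.shape[0]
A = np.vstack([P.T - np.eye(n_states), np.ones(n_states)])
b = np.concatenate([np.zeros(n_states), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)  # [0.8, 0.2] for this chain
```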
• State i is transient if $f_i < 1$ and recurrent if $f_i = 1$, where $f_i$ is the probability of ever returning to state i.
State classification and the long-run behavior of $p_{jj}(n)$:
• Transient: $\pi_j = 0$
• Null recurrent: $\pi_j = 0$
• Positive recurrent and aperiodic: $\lim_{n\to\infty} p_{jj}(n) = \pi_j > 0$
• Positive recurrent and periodic with period d: $\lim_{n\to\infty} p_{jj}(nd) = d\,\pi_j$
Bayesian Networks
Conditional Independence
Random Fields
Inference on Markov Chains:
Brute Force:
General Case:
Inference of Maximum Likelihood Sequence
Summary: Inference on graphical models
Hidden Markov Model (HMM)
Viterbi Algorithm
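Since the review only names the algorithm, here is a minimal sketch of Viterbi for a discrete-observation HMM; the variable names and the toy model are mine, not the course's:

```python
import numpy as np

def viterbi(obs, pi0, A, B):
    """Most likely hidden state sequence for a discrete HMM.

    obs: observation indices, length T
    pi0: initial state pmf, shape (S,)
    A:   transition matrix, A[i, j] = P(state j | state i), shape (S, S)
    B:   emission matrix, B[i, k] = P(obs k | state i), shape (S, K)
    """
    T, S = len(obs), len(pi0)
    # Work in log space to avoid underflow on long sequences.
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi0) + logB[:, obs[0]]  # best log-prob ending in each state
    psi = np.zeros((T, S), dtype=int)      # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA     # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Toy two-state example (made-up numbers).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi0 = np.array([0.6, 0.4])
print(viterbi([0, 0, 1, 1, 1], pi0, A, B))
```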
Expectation Maximization
Estimating Gaussian Mixture Model
Estimating HMM
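To make the last two items concrete, a bare-bones EM loop for a one-dimensional two-component GMM; the data, initialization, and variable names are made up, and this is a sketch rather than the course's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data from two Gaussians (parameters chosen arbitrarily).
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 200)])

# Initial guesses for mixture weights, means, and variances.
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

for _ in range(100):
    # E-step: responsibilities r[n, k] = P(component k | x_n).
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    r = w * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from responsibility-weighted data.
    Nk = r.sum(axis=0)
    w = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print(w, mu, var)  # should roughly recover the generating parameters
```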