Expectation Maximization Algorithm
The Expectation Maximization (EM) algorithm is an iterative method used to find maximum likelihood estimates of parameters in probabilistic models when the model depends on unobserved latent variables. This algorithm is widely used in machine learning for tasks like clustering, density estimation, and missing data imputation.
Coin Toss Example: Understanding the Basics
Let’s illustrate the EM algorithm with a simple example: estimating the bias of two coins using observed coin toss results.
Scenario
- We have two coins, Coin A and Coin B.
- Each coin has an unknown bias (probability of heads).
- We observe a sequence of coin tosses, but we don’t know which coin was used for each toss.
Goal
Our goal is to estimate the bias of each coin (θA and θB) using the observed data.
EM Algorithm: Step-by-Step
1. Initialization
We start by initializing the bias estimates for both coins. Let’s assume:
- θA(0) = 0.6 (Initial estimate of Coin A bias)
- θB(0) = 0.4 (Initial estimate of Coin B bias)
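A minimal Python setup for this example might look like the following sketch (using the five-toss data introduced below); the variable names are illustrative, and the equal prior reflects the assumption, made throughout this example, that each toss is a priori equally likely to come from either coin.

```python
# Observed tosses (H = heads, T = tails) and the initial bias guesses.
tosses = ["H", "T", "H", "H", "T"]
theta_a, theta_b = 0.6, 0.4   # θA(0), θB(0)

# Assumed prior probability that any given toss came from Coin A (or Coin B).
prior_a = prior_b = 0.5
```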
2. Expectation (E) Step
In the E-step, we calculate, for each observed toss, the probability that it came from each coin, given the current bias estimates. This is known as the “responsibility” of each coin for that toss. Throughout this example we assume that, a priori, each toss is equally likely to have come from either coin.
3. Maximization (M) Step
In the M-step, we update the bias estimates by maximizing the expected log-likelihood of the data, taking into account the responsibilities calculated in the E-step.
Example: Coin Toss Data
Toss | Outcome |
---|---|
1 | Heads |
2 | Tails |
3 | Heads |
4 | Heads |
5 | Tails |
Iteration 1: EM Algorithm
E-Step
We calculate the probability of each toss coming from each coin. For example, for toss 1 (Heads), the probability of it being from Coin A is:
P(Toss 1 from Coin A) = P(Heads | Coin A) / [P(Heads | Coin A) + P(Heads | Coin B)] = θA(0) / [θA(0) + θB(0)]
Plugging in the initial estimates:
P(Toss 1 from Coin A) = 0.6 / (0.6 + 0.4) = 0.6
For a tails toss, the responsibility of Coin A is (1 − θA(0)) / [(1 − θA(0)) + (1 − θB(0))] = 0.4 / (0.4 + 0.6) = 0.4. The responsibilities for all five tosses are shown in the table below:
Toss | Outcome | P(Toss from Coin A) | P(Toss from Coin B) |
---|---|---|---|
1 | Heads | 0.6 | 0.4 |
2 | Tails | 0.4 | 0.6 |
3 | Heads | 0.6 | 0.4 |
4 | Heads | 0.6 | 0.4 |
5 | Tails | 0.4 | 0.6 |
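A short Python sketch of this E-step (under the same equal-prior assumption) reproduces the table above:

```python
tosses = ["H", "T", "H", "H", "T"]
theta_a, theta_b = 0.6, 0.4   # current bias estimates θA(0), θB(0)

# E-step: responsibility of Coin A for each toss; Coin B gets the remainder.
resp_a = []
for outcome in tosses:
    like_a = theta_a if outcome == "H" else 1 - theta_a   # P(outcome | Coin A)
    like_b = theta_b if outcome == "H" else 1 - theta_b   # P(outcome | Coin B)
    resp_a.append(like_a / (like_a + like_b))
resp_b = [1 - r for r in resp_a]

print(resp_a)  # [0.6, 0.4, 0.6, 0.6, 0.4] -- matches the table
```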
M-Step
We update each bias estimate as the responsibility-weighted fraction of heads credited to that coin. For each coin, the numerator sums its responsibilities over the heads tosses (tosses 1, 3, and 4), and the denominator sums its responsibilities over all five tosses:
θA(1) = (0.6 + 0.6 + 0.6) / (0.6 + 0.4 + 0.6 + 0.6 + 0.4) = 1.8 / 2.6 ≈ 0.692
θB(1) = (0.4 + 0.4 + 0.4) / (0.4 + 0.6 + 0.4 + 0.4 + 0.6) = 1.2 / 2.4 = 0.5
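In Python, this M-step is a responsibility-weighted average of the heads indicators; the snippet below is a sketch that reuses the responsibilities computed in the E-step above.

```python
# M-step: new bias = responsibility-weighted fraction of heads for each coin.
resp_a = [0.6, 0.4, 0.6, 0.6, 0.4]   # responsibilities of Coin A from the E-step
resp_b = [1 - r for r in resp_a]     # responsibilities of Coin B
heads = [1, 0, 1, 1, 0]              # 1 = heads, 0 = tails

theta_a_new = sum(r * h for r, h in zip(resp_a, heads)) / sum(resp_a)
theta_b_new = sum(r * h for r, h in zip(resp_b, heads)) / sum(resp_b)
print(round(theta_a_new, 3), round(theta_b_new, 3))  # 0.692 0.5
```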
Iterative Refinement
We repeat the E-step and M-step until the bias estimates stop changing (i.e., until they converge). Each EM iteration is guaranteed not to decrease the likelihood of the observed data, so the estimates move steadily toward a maximum likelihood solution.
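Putting the two steps together, a compact end-to-end sketch (same equal-prior assumption; the function and parameter names are illustrative) might look like this:

```python
def em_coin_biases(tosses, theta_a=0.6, theta_b=0.4, n_iter=100, tol=1e-6):
    """Estimate two coin biases with EM, assuming each toss is a priori
    equally likely to have come from either coin."""
    heads = [1 if o == "H" else 0 for o in tosses]
    for _ in range(n_iter):
        # E-step: responsibility of Coin A (and Coin B) for each toss.
        resp_a = []
        for outcome in tosses:
            like_a = theta_a if outcome == "H" else 1 - theta_a
            like_b = theta_b if outcome == "H" else 1 - theta_b
            resp_a.append(like_a / (like_a + like_b))
        resp_b = [1 - r for r in resp_a]

        # M-step: responsibility-weighted fraction of heads per coin.
        new_a = sum(r * h for r, h in zip(resp_a, heads)) / sum(resp_a)
        new_b = sum(r * h for r, h in zip(resp_b, heads)) / sum(resp_b)

        converged = abs(new_a - theta_a) < tol and abs(new_b - theta_b) < tol
        theta_a, theta_b = new_a, new_b
        if converged:
            break
    return theta_a, theta_b

print(em_coin_biases(["H", "T", "H", "H", "T"]))  # converged estimates of θA and θB
```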
Conclusion
The Expectation Maximization (EM) algorithm is a powerful tool for estimating parameters in probabilistic models with latent variables. It iteratively refines the parameter estimates through the E-step and M-step, converging to a maximum of the likelihood (in general a local one).