Distribution Magic

statistics
Author

Jun Ryu

Published

February 15, 2023

Suppose one has a collection of numbers \(x_1,\ldots, x_n\), which are taken to be independent samples from the \(N(\mu, \sigma_0^2)\) distribution.

Here, \(\sigma_0^2\) is known, but \(\mu\) is unknown. Using the prior distribution \(M \sim N(\mu_0, \rho_0^2)\) for \(\mu\), derive the posterior distribution of \(\mu\).

To solve this problem, we will utilize the concept of proportionality (\(\propto\)). More specifically, we have the following equation regarding the posterior distribution:

\[f_{\Theta|X}(\theta|x) \propto f_{X|\Theta}(x|\theta)f_{\Theta}(\theta)\]

\[\text{Posterior density} \propto \text{Likelihood} \times \text{Prior density}\]
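For completeness, the hidden proportionality constant is the marginal density \(f_X(x)\), which does not depend on \(\theta\) and can therefore be dropped:

\[ f_{\Theta|X}(\theta|x) = \frac{f_{X|\Theta}(x|\theta)\,f_{\Theta}(\theta)}{f_X(x)}, \qquad f_X(x) = \int f_{X|\Theta}(x|\theta)\,f_{\Theta}(\theta)\,d\theta \]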

We will first find the likelihood.


1. Likelihood

We have that: \[f(x_i|\mu, \sigma_0^2) = {\kappa}e^{-{(x_i-\mu)^2}/{2\sigma_0^2}}\] since each \(x_i\) is a sample from a normal distribution. (Here, \(\kappa = 1/\sqrt{2\pi\sigma_0^2}\) is just the normalizing constant.)

Our likelihood is the product of all the PDFs (probability density functions) for \(x_1,\ldots,x_n\):

\[ f(x|\mu, \sigma_0^2) = \prod_{i=1}^n f(x_i|\mu, \sigma_0^2) = {\kappa}^n e^{-{\sum_{i=1}^n(x_i-\mu)^2}/{2\sigma_0^2}} \propto e^{-{\sum_{i=1}^n(x_i-\mu)^2}/{2\sigma_0^2}} \]
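As a quick sanity check (a minimal sketch, not part of the derivation; the sample size and the value of \(\sigma_0\) below are arbitrary), we can verify numerically that the full normal log-likelihood and the exponent above differ only by a \(\mu\)-independent constant:

```python
import math
import random

random.seed(0)
sigma0 = 2.0
x = [random.gauss(1.0, sigma0) for _ in range(50)]  # simulated samples

def log_likelihood(mu):
    # Full normal log-density summed over the samples, constants included
    return sum(-0.5 * math.log(2 * math.pi * sigma0**2)
               - (xi - mu)**2 / (2 * sigma0**2) for xi in x)

def exponent(mu):
    # Only the mu-dependent exponent: -sum (x_i - mu)^2 / (2 sigma0^2)
    return -sum((xi - mu)**2 for xi in x) / (2 * sigma0**2)

# Proportional densities differ by a mu-independent additive constant in
# log space, so differences across mu values must agree:
d1 = log_likelihood(1.0) - log_likelihood(0.0)
d2 = exponent(1.0) - exponent(0.0)
print(math.isclose(d1, d2))  # True
```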

2. Posterior Density

Using the above, we will derive the posterior density. Given that our prior distribution is also normal, we have:

\[ f(\mu|x_1,\ldots,x_n) \propto f(x|\mu, \sigma_0^2) \cdot g(\mu|\mu_0, \rho_0^2) \] \[ \propto e^{-{\sum_{i=1}^n(x_i-\mu)^2}/{2\sigma_0^2} - (\mu-\mu_0)^2/2\rho_0^2} \]

Now, we just need to simplify the exponent:

\[ -\frac{1}{2}\left(\frac{\sum_{i=1}^n(x_i-\mu)^2}{\sigma_0^2} + \frac{(\mu-\mu_0)^2}{\rho_0^2}\right) \] \[ = -\frac{1}{2}\left(\frac{\sum_{i=1}^n x_i^2 - 2n\bar{x}\mu+n\mu^2}{\sigma_0^2} + \frac{\mu^2-2\mu\mu_0+\mu_0^2}{\rho_0^2}\right) \]

From here, we can drop all terms without \(\mu\) because those are part of our normalizing constant:

\[ = -\frac{1}{2}\left(\frac{-2n\bar{x}\mu+n\mu^2}{\sigma_0^2} + \frac{\mu^2-2\mu\mu_0}{\rho_0^2}\right) \] \[ = -\frac{1}{2}\left(\frac{-2n\bar{x}\mu\rho_0^2+n\mu^2\rho_0^2+\mu^2\sigma_0^2-2\mu\mu_0\sigma_0^2}{\sigma_0^2\rho_0^2}\right) \] \[ = -\frac{1}{2}\left(\frac{\mu^2(n\rho_0^2+\sigma_0^2)-2\mu(\mu_0\sigma_0^2+n\bar{x}\rho_0^2)}{\sigma_0^2\rho_0^2}\right) \]

\[ = -\frac{1}{2}\left(\frac{\mu^2-2\mu\frac{(\mu_0\sigma_0^2+n\bar{x}\rho_0^2)}{(n\rho_0^2+\sigma_0^2)}}{\frac{\sigma_0^2\rho_0^2}{n\rho_0^2+\sigma_0^2}}\right) \]

3. Final Distribution

Here, we will use a neat trick called completing the square. We do this by adding a constant term \(\kappa_0\) (a term that does not involve \(\mu\)) to the numerator; since it does not involve \(\mu\), it only shifts the normalizing constant:

\[ = -\frac{1}{2}\left(\frac{\mu^2-2\mu\frac{(\mu_0\sigma_0^2+n\bar{x}\rho_0^2)}{(n\rho_0^2+\sigma_0^2)}+\kappa_0}{\frac{\sigma_0^2\rho_0^2}{n\rho_0^2+\sigma_0^2}}\right) \]

Note

To be more specific, \(\kappa_0 := \left(\frac{\mu_0\sigma_0^2+n\bar{x}\rho_0^2}{n\rho_0^2+\sigma_0^2}\right)^2\), and indeed, this term does not have \(\mu\) and does not affect the distribution.

\[ = -\frac{1}{2}\left(\frac{\left(\mu-\frac{\mu_0\sigma_0^2+n\bar{x}\rho_0^2}{n\rho_0^2+\sigma_0^2}\right)^2}{\frac{\sigma_0^2\rho_0^2}{n\rho_0^2+\sigma_0^2}}\right) \]
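We can check this algebra numerically (a small sketch with arbitrary, made-up parameter values): the original exponent and the completed-square form should differ only by a \(\mu\)-independent constant.

```python
import math
import random

random.seed(2)
sigma0, mu0, rho0 = 1.0, 0.5, 1.5          # arbitrary illustrative values
x = [random.gauss(0.0, sigma0) for _ in range(10)]
n, xbar = len(x), sum(x) / len(x)

def exponent(mu):
    # Original exponent: -(sum (x_i - mu)^2 / sigma0^2 + (mu - mu0)^2 / rho0^2) / 2
    return -0.5 * (sum((xi - mu)**2 for xi in x) / sigma0**2
                   + (mu - mu0)**2 / rho0**2)

# Posterior mean c and variance tau^2 from the derivation
c = (mu0 * sigma0**2 + n * xbar * rho0**2) / (n * rho0**2 + sigma0**2)
tau2 = (sigma0**2 * rho0**2) / (n * rho0**2 + sigma0**2)

def completed(mu):
    # Completed-square form: -(mu - c)^2 / (2 tau^2)
    return -(mu - c)**2 / (2 * tau2)

# The two forms should differ by the same constant at every mu:
diffs = [exponent(m) - completed(m) for m in (-1.0, 0.0, 2.0)]
print(all(math.isclose(d, diffs[0]) for d in diffs))  # True
```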

We notice that the above form matches the exponent of a normal PDF. Thus, our posterior PDF can be represented as \({\kappa}e^{{-(\mu-c)^2}/{2\tau^2}}\) for some normalizing constant \(\kappa\), where:

\[ \boxed{c = \frac{\mu_0\sigma_0^2+n\bar{x}\rho_0^2}{n\rho_0^2+\sigma_0^2}, \quad \tau^2 = \frac{\sigma_0^2\rho_0^2}{n\rho_0^2+\sigma_0^2}}\]

So, we conclude that our posterior distribution is \(\boxed{N(c, \tau^2)}\).
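To double-check the boxed result (a rough numerical sketch; the parameter values are arbitrary), we can compare the closed-form \(c\) and \(\tau^2\) against a grid approximation of the unnormalized posterior:

```python
import math
import random

random.seed(1)
sigma0, mu0, rho0 = 1.5, 0.0, 2.0           # arbitrary illustrative values
x = [random.gauss(2.0, sigma0) for _ in range(30)]
n, xbar = len(x), sum(x) / len(x)

# Closed-form posterior parameters from the boxed result
c = (mu0 * sigma0**2 + n * xbar * rho0**2) / (n * rho0**2 + sigma0**2)
tau2 = (sigma0**2 * rho0**2) / (n * rho0**2 + sigma0**2)

def log_post(mu):
    # Unnormalized log posterior: likelihood exponent + prior exponent
    return (-sum((xi - mu)**2 for xi in x) / (2 * sigma0**2)
            - (mu - mu0)**2 / (2 * rho0**2))

# Fine grid centered near c; subtract log_post(c) for numerical stability
mus = [c - 5 + 10 * i / 20000 for i in range(20001)]
w = [math.exp(log_post(m) - log_post(c)) for m in mus]
Z = sum(w)
mean = sum(m * wi for m, wi in zip(mus, w)) / Z
var = sum((m - mean)**2 * wi for m, wi in zip(mus, w)) / Z

print(abs(mean - c) < 1e-3, abs(var - tau2) < 1e-3)  # True True
```

The grid mean and variance agree with \(c\) and \(\tau^2\), as expected, since the exponent is exactly quadratic in \(\mu\).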