Limit Theorems and Estimation

Overview

This section covers the fundamental limit theorems of probability theory and statistical estimation methods.

Key Definitions

Def. (Sample Mean)

For a sample of $n$ observations $X_1, X_2, \ldots, X_n$, the sample mean is:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

If the observations are i.i.d. with mean $\mu$ and variance $\sigma^2$, then:

  • $E[\bar{X}_n] = \mu$ (so $\bar{X}_n$ is unbiased)
  • $\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$

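Both properties can be checked numerically. The sketch below is a plain-Python Monte Carlo check; the sample size, replication count, and Normal(3, 4) population are arbitrary choices for illustration. It simulates many sample means and compares their empirical mean and variance to $\mu$ and $\sigma^2/n$.

```python
import random
import statistics

random.seed(0)

n = 50                 # sample size (arbitrary)
reps = 20000           # number of simulated samples (arbitrary)
mu, sigma = 3.0, 2.0   # assumed population mean and standard deviation

# Draw many i.i.d. Normal(mu, sigma^2) samples and record each sample mean.
means = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

# E[X̄_n] should be close to mu = 3, and Var(X̄_n) close to sigma^2 / n = 0.08.
print(statistics.mean(means))
print(statistics.variance(means))
```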
Def. (Law of Large Numbers)

The Law of Large Numbers states that as the sample size increases, the sample mean converges to the population mean.

For i.i.d. random variables $X_1, X_2, \ldots$ with mean $\mu$ and finite variance, the sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ converges to $\mu$ as $n \to \infty$.

  • Weak LLN (convergence in probability): $P(|\bar{X}_n - \mu| > \epsilon) \to 0$ as $n \to \infty$, for every $\epsilon > 0$
  • Strong LLN (almost sure convergence): $P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$
  • Applications: Monte Carlo simulation, statistical inference

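The weak LLN can be illustrated by estimating $P(|\bar{X}_n - \mu| > \epsilon)$ at two sample sizes. This is a sketch with arbitrary choices: a Bernoulli(0.5) population, $\epsilon = 0.05$, and sizes 50 and 1000.

```python
import random

random.seed(1)

mu, eps, reps = 0.5, 0.05, 1000  # population mean, tolerance, simulation runs

def deviation_prob(n):
    """Fraction of simulated samples whose mean misses mu by more than eps."""
    misses = 0
    for _ in range(reps):
        xbar = sum(random.random() < mu for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            misses += 1
    return misses / reps

# The deviation probability should shrink as n grows.
p_small, p_large = deviation_prob(50), deviation_prob(1000)
print(p_small, p_large)
```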
Def. (Central Limit Theorem)

The Central Limit Theorem states that the distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the population distribution (under certain conditions).

For i.i.d. random variables $X_1, X_2, \ldots, X_n$ with mean $\mu$ and variance $\sigma^2$:

$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0,1)$$

Equivalently, for large $n$, $\bar{X}_n$ is approximately $N\left(\mu, \frac{\sigma^2}{n}\right)$.

  • Standard error: $SE = \frac{\sigma}{\sqrt{n}}$
  • Applications: constructing confidence intervals, hypothesis testing

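The striking part of the CLT is that the population need not be normal. The sketch below (arbitrary choices: an Exponential(1) population, which is strongly skewed and has $\mu = \sigma = 1$, with $n = 200$) standardizes simulated sample means and checks that roughly 95% fall in $[-1.96, 1.96]$, as the $N(0,1)$ limit predicts.

```python
import math
import random

random.seed(2)

n, reps = 200, 5000
mu = sigma = 1.0   # Exponential(1) has mean 1 and variance 1

# Standardize sample means drawn from a skewed (exponential) population,
# generating Exponential(1) draws via inverse-transform sampling.
zs = []
for _ in range(reps):
    sample = [-math.log(1.0 - random.random()) for _ in range(n)]
    xbar = sum(sample) / n
    zs.append((xbar - mu) / (sigma / math.sqrt(n)))

# If the CLT approximation holds, about 95% of z-values fall in [-1.96, 1.96].
coverage = sum(-1.96 <= z <= 1.96 for z in zs) / reps
print(coverage)
```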
Def. (Likelihood Function)

For a sample $x_1, x_2, \ldots, x_n$ from a distribution with parameter $\theta$, the likelihood function is:

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$$

where $f(x; \theta)$ is the PMF or PDF. The likelihood measures how probable the observed data are under different values of the parameter.
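As a concrete sketch, take hypothetical Bernoulli data (7 heads in 10 tosses) and evaluate $L(p)$ at a few candidate values of $p$; the observed data should be most probable near $p = 0.7$.

```python
# Hypothetical Bernoulli data: 7 heads in 10 tosses.
data = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]

def likelihood(p):
    """L(p) = product over i of p^x_i * (1 - p)^(1 - x_i)."""
    out = 1.0
    for x in data:
        out *= p if x == 1 else 1.0 - p
    return out

# The likelihood is highest at the candidate closest to the sample proportion.
print(likelihood(0.3), likelihood(0.5), likelihood(0.7))
```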

Def. (Maximum Likelihood Estimation)

The Maximum Likelihood Estimator (MLE) is the parameter value $\hat{\theta}$ that maximizes the likelihood function:

$$\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta} L(\theta)$$

In practice, we often maximize the log-likelihood instead, since sums are easier to work with than products:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)$$

To find the MLE, solve $\frac{d\ell(\theta)}{d\theta} = 0$ and check that the solution is a maximum.
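A minimal numerical sketch, using the same hypothetical Bernoulli data as above: a grid search over $p$ stands in for solving $d\ell/dp = 0$, and its maximizer should agree with the closed-form Bernoulli MLE, the sample proportion $7/10$.

```python
import math

# Hypothetical Bernoulli data: 7 heads in 10 tosses.
data = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]

def log_likelihood(p):
    """l(p) = sum of log f(x_i; p) for Bernoulli(p)."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in data)

# Grid search over p in (0, 1); the maximizer should be the sample proportion.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)
print(p_hat)
```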

Def. (Unbiased Estimator)

An estimator $\hat{\theta}$ is unbiased for parameter $\theta$ if:

$$E[\hat{\theta}] = \theta$$

The sample mean $\bar{X}$ is an unbiased estimator of the population mean $\mu$.

Examples of MLEs

  • Bernoulli($p$): $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$ (the sample proportion)
  • Poisson($\lambda$): $\hat{\lambda} = \bar{X}$ (the sample mean)
  • Normal($\mu, \sigma^2$): $\hat{\mu} = \bar{X}$, $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ (note the $1/n$ divisor: $\hat{\sigma}^2$ is biased, unlike the $1/(n-1)$ sample variance)
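The normal-case formulas can be computed directly. The sketch below uses arbitrary hypothetical data and contrasts the $1/n$ MLE of $\sigma^2$ with the $1/(n-1)$ unbiased sample variance computed by `statistics.variance`.

```python
import statistics

# Hypothetical data; the values themselves are arbitrary.
data = [2.1, 1.9, 2.4, 2.0, 1.6, 2.3, 1.8, 2.2]
n = len(data)

mu_hat = sum(data) / n                                  # MLE of mu (sample mean)
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE of sigma^2 (1/n divisor)

# The MLE divides by n, while statistics.variance divides by n - 1,
# so the MLE of sigma^2 is always slightly smaller on the same data.
print(mu_hat, sigma2_hat, statistics.variance(data))
```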