def. Statistic. let $X_1, \dots, X_n$ be observable random variables [= data of an experiment]. then a statistic is $T = r(X_1, \dots, X_n)$:
- i.e. $r : \mathbb{R}^n \to \mathbb{R}$ is a real-valued function of the data
- $r$ cannot contain unknown variables
def. Estimator [= point estimate] is a statistic used to estimate a parameter of the model we think generated the data. Note the following notation convention:
- Assume $X$ is an r.v. of an experiment, whose model includes parameter $\theta$.
- To estimate the ground truth parameter $\theta$, we can use an estimator r.v. $\hat\Theta = g(X_1, \dots, X_n)$.
- A specific estimate for a particular observed value $x_1, \dots, x_n$ is denoted $\hat\theta = g(x_1, \dots, x_n)$.
- An estimator has to be a function of known variables & data only.
- $\operatorname{Var}(\hat\Theta) = E[(\hat\Theta - E[\hat\Theta])^2]$, NOT $E[(\hat\Theta - \theta)^2]$ ← This is MSE
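The variance-vs-MSE distinction above can be checked numerically. A minimal sketch, assuming a hypothetical setup (Normal data with $\mu = 5$, $\sigma = 2$, and a deliberately biased estimator $\bar{X} + 1$, all chosen here for illustration):

```python
import random

# Monte Carlo check that Var(estimator) != MSE(estimator) when bias != 0.
# Assumed setup: estimate mu of Normal(5, 2) with the biased estimator mean(x) + 1.
random.seed(0)
MU, SIGMA, N, TRIALS = 5.0, 2.0, 30, 20_000

estimates = []
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    estimates.append(sum(sample) / N + 1.0)  # biased: adds a constant offset

mean_est = sum(estimates) / TRIALS
variance = sum((e - mean_est) ** 2 for e in estimates) / TRIALS  # spread around E[est]
mse = sum((e - MU) ** 2 for e in estimates) / TRIALS             # spread around theta
bias = mean_est - MU

print(f"bias ~ {bias:.3f}, Var ~ {variance:.3f}, MSE ~ {mse:.3f}")
```

Here Var stays near $\sigma^2/n \approx 0.13$ while MSE is near $1.13$, matching the decomposition $\operatorname{MSE} = \operatorname{Bias}^2 + \operatorname{Var}$.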
How Good is Your Estimator?
- Accuracy is higher. Increases as Bias (Statistics) $= E[\hat\Theta] - \theta$ decreases.
- Precision is higher. Increases as Variance $\operatorname{Var}(\hat\Theta)$ decreases.
- Efficiency (Statistics) is higher. If estimators have the same accuracy, but $\operatorname{Var}(\hat\Theta_1) < \operatorname{Var}(\hat\Theta_2)$, then the former is more efficient than the latter.
- Consistency. $\hat\Theta_n \to \theta$ in probability as $n \to \infty$.
- Mean Squared Error is lower. $\operatorname{MSE}(\hat\Theta) = E[(\hat\Theta - \theta)^2] = \operatorname{Bias}(\hat\Theta)^2 + \operatorname{Var}(\hat\Theta)$.
- Likelihood (Statistics) is higher.
→ In general, making sure to reduce the bias of estimators is important. Note that:
- If you can write down what the bias is mathematically [= characterize the bias], then you can construct a new estimator that doesn't have the bias.
- Bias usually decreases as the number of data points increases.
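Both notes above can be illustrated with the classic plug-in variance estimator, which divides by $n$: its bias is exactly $-\sigma^2/n$, so it shrinks with $n$, and characterizing it yields the unbiased fix (divide by $n-1$). A sketch, with the sample sizes and $\sigma^2 = 4$ chosen here for illustration:

```python
import random

# The plug-in variance estimator divides by n, so E[S2_n] = (n-1)/n * sigma^2
# and the bias is -sigma^2/n: it decreases as n grows.
random.seed(1)
SIGMA2, TRIALS = 4.0, 40_000

def plugin_var(sample):
    n = len(sample)
    m = sum(sample) / n
    return sum((x - m) ** 2 for x in sample) / n  # biased: divides by n, not n - 1

biases = {}
for n in (5, 50):
    avg = sum(plugin_var([random.gauss(0.0, 2.0) for _ in range(n)])
              for _ in range(TRIALS)) / TRIALS
    biases[n] = avg - SIGMA2

print(biases)  # expect about -sigma^2/n: roughly -0.8 at n=5, -0.08 at n=50
```

Multiplying the estimator by $n/(n-1)$ removes the characterized bias entirely.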
Example
let $X_1, \dots, X_n$ be i.i.d. with $E[X_i] = \mu$ and $\operatorname{Var}(X_i) = \sigma^2$, and let estimator $\hat\Theta = \sum_{i=1}^n w_i X_i$ where
- $w_1, \dots, w_n$ are weights that sum to 1. [= weighted average]
- $\hat\Theta$ is estimating $\mu$. $\sigma^2$ is known.
How accurate is $\hat\Theta$? [= what is the bias?]
→ $E[\hat\Theta] = \sum_i w_i E[X_i] = \mu \sum_i w_i = \mu$, so the bias is 0: $\hat\Theta$ is unbiased for any weights summing to 1.
How precise is $\hat\Theta$? What are the best $w_i$?
→ By independence, $\operatorname{Var}(\hat\Theta) = \sigma^2 \sum_i w_i^2$. Minimizing $\sum_i w_i^2$ subject to $\sum_i w_i = 1$ gives $w_i = 1/n$.
→ Thus $\operatorname{Var}(\hat\Theta)$ is minimized when $w_i = 1/n$ for all $i$ [= the sample mean], where it equals $\sigma^2/n$.
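The equal-weights conclusion can be checked by simulation. A sketch, assuming an illustrative comparison between equal weights and a skewed weighting that puts half the weight on $X_1$ (the specific numbers $\sigma = 3$, $n = 10$ are assumptions, not from the notes):

```python
import random

# Among weighted averages sum(w_i * X_i) with sum(w_i) = 1, equal weights
# w_i = 1/n minimize the variance sigma^2 * sum(w_i^2).
random.seed(2)
SIGMA, N, TRIALS = 3.0, 10, 30_000

weightings = {
    "equal":  [1.0 / N] * N,                       # w_i = 1/n -> Var = sigma^2/n = 0.9
    "skewed": [0.5] + [0.5 / (N - 1)] * (N - 1),   # overweights X_1 -> Var = 2.5
}

variances = {}
for name, w in weightings.items():
    ests = []
    for _ in range(TRIALS):
        x = [random.gauss(0.0, SIGMA) for _ in range(N)]
        ests.append(sum(wi * xi for wi, xi in zip(w, x)))
    m = sum(ests) / TRIALS
    variances[name] = sum((e - m) ** 2 for e in ests) / TRIALS

print(variances)
```

Both estimators are unbiased, so the difference is purely in precision: the skewed weighting roughly triples the variance here.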