Covariance

Motivation. Let us imagine a set of data points generated by a pair of RVs $X, Y$: $(x_{1}, y_{1}), \dots, (x_{4}, y_{4})$ , as graphed below.

$Var (X) = 2$ for both
$Var (Y) = 1$ for both
…but the ‘spread’ of the two is clearly different! What matters is the ‘quadrant’ or ‘direction’ of the spread, and that is captured by multiplying the two coordinates (each measured from its mean). Thus:

Intuition. Covariance is the “average quadrant” of the direction of spread.

The covariance matrix from 5:06
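A minimal sketch of this intuition, using made-up data (the values below are hypothetical, not the points from the graph): both datasets have identical $Var (X)$ and $Var (Y)$, yet the sign of the centered product $(x - μ_{X}) (y - μ_{Y})$ tells us which quadrants the points occupy.

```python
import numpy as np

# Hypothetical data: both datasets share the same x values and the same
# set of y values, so Var(X) and Var(Y) agree; only the pairing differs.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y_a = np.array([-1.0, -0.5, 0.5, 1.0])   # points fall in quadrants I and III
y_b = np.array([1.0, 0.5, -0.5, -1.0])   # points fall in quadrants II and IV


def centered_products(x, y):
    """Product of the two coordinates, each measured from its own mean."""
    return (x - x.mean()) * (y - y.mean())


prod_a = centered_products(x, y_a)   # every product is positive
prod_b = centered_products(x, y_b)   # every product is negative

# Averaging the products gives the "average quadrant" of the spread.
avg_a = prod_a.mean()
avg_b = prod_b.mean()

print("Var(Y):", np.var(y_a), np.var(y_b))
print("average centered product:", avg_a, avg_b)
```

The variances cannot tell the two datasets apart, while the averaged product has opposite signs for them.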

def. Covariance. Covariance measures the joint variability of two RVs. Let $X, Y$ be RVs with means $μ_{X}, μ_{Y}$ . Then:

$$Cov (X, Y) := E ((X - μ_{X}) (Y - μ_{Y})) = E (X \cdot Y) - E (X) \cdot E (Y)$$

When $X ⊥ Y$ , then $Cov (X, Y) = 0$ ; but $Cov (X, Y) = 0$ does not imply $X ⊥ Y$
Covariance is a generalization of Variance: $Var (X) = E ((X - μ_{X})^{2}) = Cov (X, X)$
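The properties above can be checked on small samples. This is only a sketch: `cov` is a helper name introduced here, the data are hypothetical, and expectations are taken as plain averages over equally likely points.

```python
import numpy as np


def cov(x, y):
    """Covariance E((X - mu_X)(Y - mu_Y)), with E taken as a plain average."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))


# Hypothetical paired data.
x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([3.0, 1.0, 5.0, 6.0])

# Both forms of the definition agree: E(XY) - E(X)E(Y).
shortcut = np.mean(x * y) - x.mean() * y.mean()

# Covariance of a variable with itself is its variance.
var_x = cov(x, x)

# Zero covariance without independence: X uniform on {-1, 0, 1}, Y = X^2.
# Y is fully determined by X, yet the covariance vanishes.
xs = np.array([-1.0, 0.0, 1.0])
ys = xs ** 2
cov_dependent = cov(xs, ys)

print(cov(x, y), shortcut, var_x, cov_dependent)
```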

thm. Relationship between Covariance and Variance. Let $X, Y$ be RVs. Then:

$$Var (X + Y) = Var (X) + Var (Y) + 2 \cdot E ((X - μ_{X}) (Y - μ_{Y})) = Var (X) + Var (Y) + 2 \cdot Cov (X, Y)$$
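A quick numerical check of the identity with the $+ 2 \cdot Cov (X, Y)$ cross term. The sketch assumes simulated data (the seed and the way `y` is built from `x` are arbitrary choices) and uses population moments, i.e. division by $n$, throughout.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # correlated with x by construction


def cov(a, b):
    """Population covariance of paired samples (divides by n)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


lhs = np.var(x + y)                            # np.var also divides by n
rhs = np.var(x) + np.var(y) + 2 * cov(x, y)

print(lhs, rhs)   # equal up to floating-point rounding
```

Dropping the cross term would only be valid when $Cov (X, Y) = 0$, for example when $X ⊥ Y$ .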

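thm. Bilinearity of Covariance. Let $X, Y, Z$ be RVs and $a, b$ constants. Then:

$$Cov (a X + b Y, Z) = a \cdot Cov (X, Z) + b \cdot Cov (Y, Z)$$

and, since $Cov (X, Y) = Cov (Y, X)$ , the same holds in the second argument.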
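Bilinearity says that the covariance of a linear combination is the same linear combination of covariances, in either argument. A sketch of a numerical check follows; the constants `a`, `b` and the simulated data are arbitrary, hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)   # arbitrary seed
x, y, z = rng.normal(size=(3, 500))
a, b = 2.0, -3.0                 # arbitrary constants


def cov(u, v):
    """Population covariance of paired samples (divides by n)."""
    return np.mean((u - u.mean()) * (v - v.mean()))


# Linearity in the first argument.
first_lhs = cov(a * x + b * y, z)
first_rhs = a * cov(x, z) + b * cov(y, z)

# Linearity in the second argument.
second_lhs = cov(z, a * x + b * y)
second_rhs = a * cov(z, x) + b * cov(z, y)

print(first_lhs, first_rhs)
print(second_lhs, second_rhs)
```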

thm. Summed Variance. Let $X_{1}, \dots, X_{n}$ be RVs. Then:

$$Var \left( \sum_{i} X_{i} \right) = \sum_{i} Var (X_{i}) + 2 \sum_{j < k} Cov (X_{j}, X_{k})$$

E.g. for $n = 3$ the second summation has $\binom{3}{2} = 3$ terms:

$$Var (X_{1} + X_{2} + X_{3}) = Var (X_{1}) + Var (X_{2}) + Var (X_{3}) + 2 \cdot Cov (X_{1}, X_{2}) + 2 \cdot Cov (X_{2}, X_{3}) + 2 \cdot Cov (X_{1}, X_{3})$$
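The three-variable case can be checked numerically by enumerating the pairs $j < k$ explicitly. A sketch under assumed, simulated data: the shared `base` term is only there to make the covariances nonzero.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
base = rng.normal(size=500)
# Three hypothetical RVs that share a common component, so they co-vary.
xs = [base + rng.normal(size=500) for _ in range(3)]


def cov(a, b):
    """Population covariance of paired samples (divides by n)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# All index pairs (j, k) with j < k: (0, 1), (0, 2), (1, 2).
pairs = list(combinations(range(len(xs)), 2))

lhs = np.var(sum(xs))
rhs = sum(np.var(v) for v in xs) + 2 * sum(cov(xs[j], xs[k]) for j, k in pairs)

print(len(pairs), lhs, rhs)
```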

Covariance Matrix

def. Covariance Matrix. The covariance matrix collects all pairwise covariances of the RVs $X_{1}, \dots, X_{n}$ :

$$Σ = \begin{pmatrix} σ_{1, 1} & σ_{1, 2} & \dots & σ_{1, n} \\ σ_{2, 1} & σ_{2, 2} & \dots & σ_{2, n} \\ \vdots & \vdots & \ddots & \vdots \\ σ_{n, 1} & σ_{n, 2} & \dots & σ_{n, n} \end{pmatrix}$$

where $σ_{i, j}$ is the covariance $Cov (X_{i}, X_{j})$
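As a sketch, the matrix can be computed with NumPy's `np.cov`, assuming each row of `data` holds the samples of one RV. The data are simulated, and `bias=True` is chosen here so that the entries divide by $n$ and match the expectation-based definition on the sample.

```python
import numpy as np

rng = np.random.default_rng(2)        # arbitrary seed
data = rng.normal(size=(3, 200))      # row i holds the samples of X_{i+1}
data[1] += data[0]                    # make the second RV co-vary with the first

# bias=True normalizes by n rather than n - 1.
Sigma = np.cov(data, bias=True)


def cov(a, b):
    """Population covariance of paired samples (divides by n)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


print(Sigma)
print(cov(data[0], data[1]))          # matches Sigma[0, 1]
```

Since $Cov (X_{i}, X_{j}) = Cov (X_{j}, X_{i})$ the matrix is symmetric, and since $Cov (X_{i}, X_{i}) = Var (X_{i})$ its diagonal holds the variances.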