Post-treatment variables are variables that confound the “controlling for endogenous factors by including them in the regression” technique. They can cause problems is two ways:
Mediator
Motivation. Suppose regressing Earnings vs Tutoring:
But secretly, there was a correlation where: Therefore, Now, simply controlling for reading scores like this:
Is not effective, because there’s a multicollinearity between and . In this case, , and this doesn’t capture the portion.
Confounder
Collider Bias
Collider bias
A stupid model:
will have biases:
Example. Suppose we are regressing: Flu vs. Car accident. We also try to control for being being infected while being hospitalized:
Suppose, however, reality went like this:
- Car accidents lead to hospitalization, but don’t cause the flu
- Fevers lead to both hospitalization and cause flu testing Then:
- i.e. accident and flu is