Motivation. In theory, we should be able to estimate the causal effect of a controlled experiment using the difference-of-means model:

$$Y_i = \beta_0 + \beta_1 \text{Treatment}_i + \epsilon_i$$

In reality, however, we run into the following issues (the ABC issues):

  • Attrition
  • Balance
  • Compliance
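
As a minimal sketch of the difference-of-means model from the motivation above: with a binary treatment indicator, the slope of a simple regression of the outcome on treatment is exactly the difference between the treated and control group means. The data are simulated, and the names `ols_simple`, `true_effect`, `treated` and `outcome` are illustrative assumptions, not part of the notes.

```python
import random
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


random.seed(0)
n = 2000
true_effect = 2.0  # assumed effect, chosen only for this simulation

# Randomly assign treatment, then generate outcomes with noise.
treated = [random.randint(0, 1) for _ in range(n)]
outcome = [1.0 + true_effect * t + random.gauss(0, 1) for t in treated]

b0, b1 = ols_simple(treated, outcome)

# Direct difference of group means, for comparison with the regression slope.
treated_mean = mean([y for y, t in zip(outcome, treated) if t == 1])
control_mean = mean([y for y, t in zip(outcome, treated) if t == 0])
diff_means = treated_mean - control_mean

print(f"regression slope:    {b1:.3f}")
print(f"difference of means: {diff_means:.3f}")
```

The two printed numbers agree up to floating-point error, which is why the regression form is a convenient way to write the difference of means.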


  • def. Balance means that the treatment and control groups are the same in all factors other than the treatment.
  • def. Blocking means sorting subjects into blocks (e.g. men and women) and randomizing within each block.
    • Blocking is harder to implement when there are many categories and the sample size is limited. Therefore, before running the main regression, check for balance by running the balance test:

$$X_i = \gamma_0 + \gamma_1 \text{Treatment}_i + \nu_i$$

where $X_i$ is a covariate that we’re worried is not randomized enough.

  • 😊 If $\hat{\gamma}_1$ is not statistically significant, there is no evidence that treatment is correlated with $X_i$. This supports (but does not prove) the claim that treatment is also uncorrelated with unobserved factors.
  • 😞 If $\hat{\gamma}_1$ is statistically significant, then (even if assignment was random) the groups are imbalanced on $X_i$, i.e. randomization failed to balance that covariate.
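
The balance test above can be sketched as follows: regress the covariate on the treatment indicator and look at the t-statistic of the slope. This is a rough illustration on simulated data, assuming homoskedastic errors and the usual 1.96 cutoff. The covariate `age`, the helper `slope_t_stat` and both assignment rules are hypothetical.

```python
import math
import random
from statistics import mean


def slope_t_stat(x, y):
    """Regress y on x (with intercept); return (slope, t-statistic of slope)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # classical (homoskedastic) std. error
    return slope, slope / se


random.seed(1)
n = 1000
age = [random.gauss(40, 10) for _ in range(n)]  # hypothetical covariate

# Case 1: treatment assigned by coin flip, independent of age.
random_assign = [random.randint(0, 1) for _ in range(n)]
# Case 2: broken assignment in which older people are treated more often.
biased_assign = [1 if a + random.gauss(0, 5) > 40 else 0 for a in age]

_, t_random = slope_t_stat(random_assign, age)
_, t_biased = slope_t_stat(biased_assign, age)

print(f"random assignment: t = {t_random:.2f}")
print(f"biased assignment: t = {t_biased:.2f}")
```

In the biased case the t-statistic is far beyond 1.96, which is the "failed balance" outcome. In the coin-flip case it will usually, though not always, fall inside the cutoff.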


Motivation. Some people don’t comply, even though they’re offered treatment. What if the people who don’t comply are different from the people who do? If so, focusing on just the treated and compliant people would not be indicative of the true effect of the treatment.

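  • $Z_i$ is the binary “we intended to treat this person” (assigned to treatment)
  • $C_i$ is the binary “this person is compliant”

Intention to Treat Approach

We attempt to resolve this issue by regressing not on the people who were treated and compliant, but on everyone who was “supposed to be treated” (= intention to treat, ITT), i.e.

  • Non-ITT: $Y_i = \beta_0 + \beta_1 T_i + \epsilon_i$, where $T_i = Z_i C_i$ indicates treatment actually received
  • ITT: $Y_i = \beta_0 + \beta_1 Z_i + \epsilon_i$
    • This is a conservative estimate: the lower the compliance, the smaller the coefficient. (Full compliance implies $C_i = 1$ for everyone, so $T_i = Z_i$ and the two regressions coincide.)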
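
To illustrate why the ITT estimate is conservative while the treated-only comparison can be misleading, here is a small simulation sketch. It assumes that only "motivated" people comply and that motivation also raises the outcome. The variable names, the 50% compliance rate and the effect size are all assumptions made for the example.

```python
import random
from statistics import mean


def slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


random.seed(2)
n = 20000
true_effect = 2.0  # assumed, for the simulation only

assigned, treated, outcome = [], [], []
for _ in range(n):
    motivation = random.gauss(0, 1)         # unobserved by the researcher
    z = random.randint(0, 1)                # random assignment (intention to treat)
    complies = motivation > 0               # only motivated people comply (~50%)
    t = 1 if (z == 1 and complies) else 0   # treatment actually received
    y = 1.0 + true_effect * t + 1.5 * motivation + random.gauss(0, 1)
    assigned.append(z)
    treated.append(t)
    outcome.append(y)

naive = slope(treated, outcome)   # non-ITT: treated people vs everyone else
itt = slope(assigned, outcome)    # ITT: assigned vs not assigned

print(f"true effect:              {true_effect:.2f}")
print(f"naive (non-ITT) estimate: {naive:.2f}")
print(f"ITT estimate:             {itt:.2f}")
```

Under these assumptions the naive estimate overshoots the true effect because compliers differ from non-compliers. The ITT estimate undershoots it, landing near the true effect times the compliance rate.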

2SLS Approach

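Alternatively, a better approach to dealing with non-compliance is 2SLS. You can use the ITT variable $Z_i$ (assignment to treatment) as the instrument for $T_i$ (treatment actually received), since it satisfies the conditions that make a good IV:

  • Inclusion condition: $Z_i$ is correlated with $T_i$
  • Exclusion condition: $Z_i$ is uncorrelated with the error term $\epsilon_i$ because treatment is randomly assigned
  1. Then, the first stage (reduced form): $T_i = \gamma_0 + \gamma_1 Z_i + \nu_i$
  2. Then the second stage: $Y_i = \beta_0 + \beta_1 \hat{T}_i + \epsilon_i$, where $\hat{T}_i$ is the fitted value from the first stage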
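
The two stages above can be sketched by hand on simulated data. This is an illustration under assumed values (one-sided non-compliance, a constant treatment effect, compliance driven by an unobserved `motivation` variable), not a full IV implementation. In particular, the standard errors from a manually run second stage are not valid, so a dedicated IV routine should be used for inference.

```python
import random
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


random.seed(3)
n = 20000
true_effect = 2.0  # assumed, for the simulation only

assigned, treated, outcome = [], [], []
for _ in range(n):
    motivation = random.gauss(0, 1)          # unobserved confounder
    z = random.randint(0, 1)                 # instrument: random assignment
    t = 1 if (z == 1 and motivation > 0) else 0  # treatment actually received
    y = 1.0 + true_effect * t + 1.5 * motivation + random.gauss(0, 1)
    assigned.append(z)
    treated.append(t)
    outcome.append(y)

# Stage 1: regress treatment received on assignment, keep the fitted values.
g0, g1 = ols(assigned, treated)
treated_hat = [g0 + g1 * z for z in assigned]

# Stage 2: regress the outcome on the fitted treatment values.
_, iv_estimate = ols(treated_hat, outcome)

# With a single instrument this equals the ITT slope over the first-stage slope.
_, itt = ols(assigned, outcome)
wald = itt / g1

print(f"first-stage slope: {g1:.3f}")
print(f"2SLS estimate:     {iv_estimate:.3f}")
print(f"ITT / first stage: {wald:.3f}")
```

The last two numbers coincide, which shows that 2SLS here simply rescales the conservative ITT estimate by the compliance rate.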


def. Attrition means dropping out of the experiment. The following regression assesses whether treatment is correlated with attrition:

$$\text{Attrition}_i = \delta_0 + \delta_1 \text{Treatment}_i + \nu_i$$

→ If $\hat{\delta}_1$ is statistically significant, attrition is correlated with treatment, and we need to correct for it. Methods to resolve attrition:

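  1. Add covariates that are significant when we test (interactive attrition test): $\text{Attrition}_i = \delta_0 + \delta_1 Z_i + \delta_2 X_i + \delta_3 (Z_i \times X_i) + \nu_i$, where $Z_i$ is the intention-to-treat binary variable and $X_i$ is a covariate.
    • This should also be significant in the balance test: $X_i = \gamma_0 + \gamma_1 Z_i + \nu_i$
    • Add the significant covariates to the final analysis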
  2. Trim the data of one group so that the attrition rate is the same in treatment and control
  3. Selection model. (not discussed)
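
The basic attrition test described above (regress a dropout indicator on treatment) can be sketched as follows. The dropout rates are invented for the simulation, `slope_t_stat` is a hypothetical helper, and the t-statistic uses classical standard errors on a linear probability model, so treat it as a rough check only.

```python
import math
import random
from statistics import mean


def slope_t_stat(x, y):
    """Regress y on x (with intercept); return (slope, t-statistic of slope)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # classical (homoskedastic) std. error
    return slope, slope / se


random.seed(4)
n = 4000
treatment = [random.randint(0, 1) for _ in range(n)]

# Assumed dropout rates for the simulation: 30% if treated, 10% if control.
attrition = [
    1 if random.random() < (0.30 if t == 1 else 0.10) else 0
    for t in treatment
]

delta1, t_stat = slope_t_stat(treatment, attrition)

print(f"difference in attrition rates: {delta1:.3f}")
print(f"t-statistic:                   {t_stat:.2f}")
```

A large t-statistic, as in this simulated case, signals that dropout differs by treatment status and that one of the corrections listed above is needed.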