When there is indication of covariate imbalance, we may wish to construct a sample where the treatment and control groups are more similar than the original full sample. One way of doing so is by dropping units with extreme values of propensity scores. For these subjects, their covariate values are such that the probability of being in the treatment (or control) group is so overwhelmingly high that we cannot reliably find comparable units in the opposite group. We may therefore wish to forego estimating treatment effects for such units since nothing much can be credibly said about them.
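The examples below continue from earlier in the tutorial, where a CausalModel instance named causal was constructed and its propensity scores estimated. As a reminder, a minimal version of that setup, using the package's random_data utility to simulate data, looks like:

>>> from causalinference import CausalModel
>>> from causalinference.utils import random_data
>>> Y, D, X = random_data()
>>> causal = CausalModel(Y, D, X)
>>> causal.est_propensity_s()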
A good rule of thumb is to drop units whose estimated propensity score is less than 0.1 or greater than 0.9. By default, once the propensity score has been estimated by running either est_propensity or est_propensity_s, a value of 0.1 will be set for the attribute cutoff:

>>> causal.cutoff
0.1
Calling the method trim at this point will drop subjects according to this rule of thumb. More generally, we can consider dropping units whose estimated propensity score lies outside of the \([\alpha, 1-\alpha]\) interval. To trim the sample at \(\alpha = 0.12\), for example, we can set cutoff to 0.12 and call the method trim:

>>> causal.cutoff = 0.12
>>> causal.trim()
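Conceptually, trimming simply subsets the data to units whose estimated propensity score lies inside \([\alpha, 1-\alpha]\). Below is a rough NumPy sketch of the idea, assuming the fitted scores can be read off causal.propensity['fitted'] and using made-up variable names; this is not the library's internal code:

import numpy as np

pscore = causal.propensity['fitted']                # estimated propensity scores
alpha = 0.12
keep = (pscore >= alpha) & (pscore <= 1 - alpha)    # mask for the retained units
Y_sub, D_sub, X_sub = Y[keep], D[keep], X[keep]     # the trimmed sample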
Once trim is called, the causal instance will mutate and behave as if the subjects outside of the \([\alpha, 1-\alpha]\) propensity range no longer exist. The usual object attributes and methods will still work as expected after trimming. If we inspect summary_stats at this point, for instance, we will find:
>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=355)         Treated (N_t=324)
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.568       29.026       63.069       27.225       21.501

                       Controls (N_c=355)         Treated (N_t=324)
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.701        2.825        4.546        2.563        0.313
             X1        3.489        2.784        4.453        2.461        0.367
Note that the output statistics are now different from what we saw earlier with the full sample. In particular, the sample sizes are now 355 and 324 instead of 392 and 608, and the normalized differences in covariate means are 0.313 and 0.367 instead of 0.706 and 0.880, showing marked improvement in covariate balance.
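To see where the last column comes from, recall that the normalized difference scales the difference in covariate means by the square root of the average of the two group variances. A quick check of the X0 figure reproduces the table entry:

>>> import numpy as np
>>> round((4.546 - 3.701) / np.sqrt((2.563**2 + 2.825**2) / 2), 3)
0.313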
If the trimming was not satisfactory, we can reset causal to its initial pristine state by
>>> causal.reset()
Note that doing so resets everything, so even the propensity scores will have to be re-estimated.
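For example, to redo the earlier trimming from scratch after a reset, we would need to run

>>> causal.est_propensity_s()   # propensity scores must be estimated anew
>>> causal.cutoff = 0.12
>>> causal.trim()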
Instead of setting an arbitrary value for \(\alpha\), a procedure exists that will estimate the optimal cutoff. Crump, Hotz, Imbens, and Mitnik (2009) show that the asymptotic variance of the efficient estimator for the average treatment effect given that the covariate value \(X\) is in some subset \(\mathbb{S}\) of the covariate space is given by $$\frac{1}{\mathrm{P}(X \in \mathbb{S})} \mathrm{E}\left[ \frac{\sigma_t^2(X)}{p(X)} + \frac{\sigma_c^2(X)}{1-p(X)} \Big| X \in \mathbb{S} \right].$$
Here \(\sigma_t^2\) is the conditional variance function for treated units, and \(\sigma_c^2\) is the conditional variance function for control units.
Letting \(\mathbb{S}\) take the form of \([\alpha, 1-\alpha]\) and choosing \(\alpha\) to minimize the above asymptotic variance results in the optimal cutoff threshold. This optimal cutoff does not have a closed-form solution, but it can nonetheless be computed in \(O(N \log{N})\) time.
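To make this concrete, here is an illustrative sketch of the selection rule in the homoskedastic case, where \(\sigma_t^2(X)\) and \(\sigma_c^2(X)\) are constant and drop out of the minimization. The function name crump_cutoff is made up for exposition, and this is not Causalinference's internal implementation: it solves \(\lambda = 2\mathrm{E}[g(X) \mid g(X) \le \lambda]\) for \(g(x) = 1/\{p(x)(1-p(x))\}\), then recovers \(\alpha\) from \(\alpha(1-\alpha) = 1/\lambda\).

import numpy as np

def crump_cutoff(pscore):
    # Illustrative sketch of the Crump et al. (2009) rule under
    # homoskedasticity; assumes 0 < pscore < 1 elementwise.
    g = np.sort(1.0 / (pscore * (1 - pscore)))  # the sort gives the O(N log N) cost
    cumsums = np.cumsum(g)
    n = len(g)
    for k in range(1, n + 1):
        lam = 2.0 * cumsums[k - 1] / k  # candidate for lambda = 2*E[g | g <= lambda]
        # a solution lies in [g_(k), g_(k+1)) only if it is consistent with both ends
        if g[k - 1] <= lam and (k == n or lam < g[k]):
            break
    return 0.5 - np.sqrt(0.25 - 1.0 / lam)  # alpha solving alpha*(1-alpha) = 1/lambda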
To compute the optimal \(\alpha\) and trim the sample based on it in Causalinference, we simply call trim_s, as below:
>>> causal.trim_s()
>>> causal.cutoff
0.0954928016329
>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=371)         Treated (N_t=363)
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.331       29.608       66.067       28.108       24.736

                       Controls (N_c=371)         Treated (N_t=363)
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.709        2.872        4.658        2.522        0.351
             X1        3.407        2.784        4.661        2.517        0.472
As the above shows, the optimal cutoff was determined to be around 0.0955. Note that it is not necessary to call trim after trim_s has been called, as the trimming occurs automatically as part of trim_s.
Since causal still behaves like a regular CausalModel instance after trimming, it is possible at this point to run any of the treatment effect estimation procedures that are available, including est_via_ols and est_via_matching. It is important to note, however, that the estimand is now the average treatment effect restricted to a subpopulation: $$\mathrm{E}[Y(1)-Y(0) \mid \alpha < p(X) < 1-\alpha].$$
Again, this is because when covariate imbalance is an issue, estimates of the unconditional average treatment effect \(\mathrm{E}[Y(1)-Y(0)]\) are inherently much less credible.
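For example, the following would estimate this restricted average treatment effect on the trimmed sample via least squares and via matching, storing the results in estimates (output omitted):

>>> causal.est_via_ols()
>>> causal.est_via_matching()
>>> print(causal.estimates)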
References
Crump, R., Hotz, V. J., Imbens, G., & Mitnik, O. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96, 187-199.