[R Course] Propensity Score Matching in R


Propensity Score Matching

Thierry Warin (https://warin.ca/aboutme.html), HEC Montréal and CIRANO, Canada (https://www.hec.ca/en/profs/thierry.warin.html)

Ideal method

Randomized controlled trials (RCTs) are considered the gold standard for inferring causal relationships. Because treatment is assigned at random, confounding variables, whether measured or unmeasured, are balanced in expectation between the treatment (exposure) and control groups.


RCTs, on the other hand, are not always feasible due to ethical or logistical concerns (e.g., testing the causal impact of smoking on lung cancer).


In an observational study, propensity score matching can be used to approximate the balance between the treatment and control groups that an RCT would provide, at least on the observed covariates. In its simplest form, propensity score matching pairs each individual in the treatment group with an individual in the control group who has a similar propensity score.
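
As a minimal sketch of this matching step, the snippet below uses the MatchIt package and its bundled lalonde dataset. Both are assumptions made for illustration, since this post does not name a package or a dataset, and the covariates in the formula are simply the ones available in that dataset. It performs 1:1 nearest-neighbor matching without replacement on a propensity score estimated by logistic regression.

```r
# Assumes the MatchIt package is installed: install.packages("MatchIt")
library(MatchIt)

# Example data shipped with MatchIt (an illustrative choice, not from this post)
data("lalonde", package = "MatchIt")

# 1:1 nearest-neighbor matching on a logistic-regression propensity score
m_out <- matchit(
  treat ~ age + educ + race + married + nodegree + re74 + re75,
  data     = lalonde,
  method   = "nearest",  # greedy nearest-neighbor matching
  distance = "glm"       # propensity score from a logistic regression
)

# Keep only the matched individuals; adds distance, weights and subclass columns
matched <- match.data(m_out)

# Number of matched controls (0) and treated (1)
table(matched$treat)
```

Each matched pair shares a value of `subclass`, and `distance` holds the estimated propensity score.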

The propensity score can be thought of intuitively as the probability of receiving treatment for each individual, estimated from a set of observed covariates (the potential confounders).
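
One common way to estimate this probability is a logistic regression of the treatment indicator on the covariates. The sketch below shows the idea in base R, using the lalonde dataset from the MatchIt package as an assumed example; the dataset and the choice of covariates are illustrative and do not come from this post.

```r
# Example data shipped with the MatchIt package (assumed to be installed)
data("lalonde", package = "MatchIt")

# Logistic regression of treatment status on the observed covariates
ps_model <- glm(
  treat ~ age + educ + race + married + nodegree + re74 + re75,
  data   = lalonde,
  family = binomial(link = "logit")
)

# Propensity score: the fitted probability of receiving treatment
lalonde$pscore <- predict(ps_model, type = "response")

# Average propensity score among controls (0) and treated (1)
aggregate(pscore ~ treat, data = lalonde, FUN = mean)
```

Treated individuals should, on average, have higher estimated scores than controls; a large overlap between the two score distributions makes it easier to find good matches.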

After matching, the treatment and control groups should be highly similar in their observed characteristics, and this balance should be checked rather than assumed.
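
A simple way to check this similarity is to compare standardized mean differences (SMDs) of the covariates before and after matching. The sketch below is illustrative only: it assumes the MatchIt package and its lalonde dataset, and `smd()` is a hypothetical helper written for this example, using the pooled standard deviation as the denominator (one of several conventions).

```r
library(MatchIt)  # assumed to be installed
data("lalonde", package = "MatchIt")

# 1:1 nearest-neighbor matching on a logistic-regression propensity score
m_out <- matchit(
  treat ~ age + educ + race + married + nodegree + re74 + re75,
  data = lalonde, method = "nearest", distance = "glm"
)
matched <- match.data(m_out)

# Hypothetical helper: difference in group means divided by the pooled SD
smd <- function(x, treat) {
  pooled_sd <- sqrt((var(x[treat == 1]) + var(x[treat == 0])) / 2)
  (mean(x[treat == 1]) - mean(x[treat == 0])) / pooled_sd
}

# Numeric covariates only (the factor `race` is left out of this simple check)
covs <- c("age", "educ", "married", "nodegree", "re74", "re75")

balance <- data.frame(
  covariate  = covs,
  smd_before = vapply(covs, function(v) smd(lalonde[[v]], lalonde$treat),
                      numeric(1), USE.NAMES = FALSE),
  smd_after  = vapply(covs, function(v) smd(matched[[v]], matched$treat),
                      numeric(1), USE.NAMES = FALSE)
)

print(balance, digits = 2)
```

Absolute SMDs close to zero after matching suggest good balance; a threshold of about 0.1 is a frequently used rule of thumb. `summary(m_out)` reports similar diagnostics directly.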

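To estimate the treatment effect on the outcome, a simple regression of the outcome on the treatment indicator can be fitted to the matched sample. Because matched individuals are not independent observations, cluster-robust standard errors, clustered on the matched pair, are required for correct inference.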
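
The following sketch illustrates one way to carry out this step. It assumes the MatchIt, sandwich and lmtest packages and the lalonde example dataset, none of which are named in this post; the outcome `re78` and the covariates are simply those available in that dataset.

```r
# Assumes these packages are installed:
# install.packages(c("MatchIt", "sandwich", "lmtest"))
library(MatchIt)
library(sandwich)  # vcovCL(): cluster-robust covariance matrix
library(lmtest)    # coeftest(): coefficient tests with a custom covariance

data("lalonde", package = "MatchIt")

# 1:1 nearest-neighbor matching on a logistic-regression propensity score
m_out <- matchit(
  treat ~ age + educ + race + married + nodegree + re74 + re75,
  data = lalonde, method = "nearest", distance = "glm"
)
matched <- match.data(m_out)

# Simple regression of the outcome on treatment in the matched sample
fit <- lm(re78 ~ treat, data = matched, weights = weights)

# Cluster-robust covariance, clustering on the matched pair (subclass)
vc <- vcovCL(fit, cluster = ~ subclass)

# Treatment effect estimate with cluster-robust standard errors
est <- coeftest(fit, vcov. = vc)
est
```

The coefficient on `treat` is the estimated effect of treatment on the outcome in the matched sample, and its standard error accounts for the pairing.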

To go further

See the excellent tutorial here with a nice example: https://statsnotebook.io/blog/analysis/matching/


For attribution, please cite this work as

Warin (2019, July 23). Thierry Warin, PhD: [R Course] Propensity Score Matching in R. Retrieved from https://warin.ca/posts/rcourse-statistics-psm/

BibTeX citation

  author = {Warin, Thierry},
  title = {Thierry Warin, PhD: [R Course] Propensity Score Matching in R},
  url = {https://warin.ca/posts/rcourse-statistics-psm/},
  year = {2019}