Real-world epidemiology offers a unique opportunity to observe large numbers of people, along with the actions and events that characterize their encounters with healthcare providers. However, the heterogeneity of populations and healthcare systems makes it hard for researchers to compare “like with like” when attempting to draw causal inferences about interventions and outcomes. A critical issue in epidemiological datasets is the high risk of bias from confounding that stems from baseline differences between groups. Propensity score (PS) techniques are statistical approaches used to address potential imbalance between comparison groups. The PS is the estimated probability, based on measured baseline covariates, that a patient receives a particular intervention. Patients who share a similar PS will tend to have similar distributions of the measured covariates included in the PS model. Implementing PS methods may improve covariate balance, but there is no consensus on the best way to capture all relevant confounders for incorporation into the PS model. Should covariates be selected by clinical or epidemiological experts, or would data-driven algorithms (machine learning) offer more efficient and reliable methods of estimating the PS and controlling for confounding? The PS can be incorporated into the analysis in different ways, each with its own strengths and limitations, and researchers must choose the best fit for their study objectives. PS methods are particularly advantageous when healthcare administrative databases capture large numbers of measured covariates but relatively few outcome events.
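To make the definition of the PS concrete, the following is a minimal sketch rather than a recommended analysis. It assumes simulated data with two hypothetical baseline covariates (`age`, standardized, and a binary `comorbidity` flag), estimates the PS with a logistic regression fitted by plain gradient descent, and then uses inverse probability of treatment weighting as one of the several possible ways of incorporating the PS. The choice of logistic regression, the weighting scheme, the variable names, and the simulation parameters are all illustrative assumptions and are not taken from the text above.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity_model(X, treated, lr=1.0, n_iter=500):
    """Fit P(treated = 1 | X) by logistic regression (batch gradient descent).

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, t in zip(X, treated):
            err = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x))) - t
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    # Estimated probability of receiving the intervention for each patient.
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x))) for x in X]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_mean_difference(values, treated, weights):
    # Difference in (weighted) group means divided by the pooled (weighted) SD.
    stats = {}
    for group in (1, 0):
        vals = [v for v, t in zip(values, treated) if t == group]
        wts = [w for w, t in zip(weights, treated) if t == group]
        mean = weighted_mean(vals, wts)
        var = weighted_mean([(v - mean) ** 2 for v in vals], wts)
        stats[group] = (mean, var)
    pooled_sd = math.sqrt((stats[1][1] + stats[0][1]) / 2.0)
    return (stats[1][0] - stats[0][0]) / pooled_sd


# Simulated cohort (hypothetical): older and sicker patients are more likely
# to receive the intervention, which creates baseline imbalance.
rng = random.Random(0)
n = 400
age = [rng.gauss(0.0, 1.0) for _ in range(n)]
comorbidity = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n)]
X = [[a, c] for a, c in zip(age, comorbidity)]
treated = [
    1 if rng.random() < sigmoid(-0.5 + 1.0 * a + 0.8 * c) else 0
    for a, c in zip(age, comorbidity)
]

beta = fit_propensity_model(X, treated)
ps = propensity_scores(X, beta)

# Inverse probability of treatment weights: 1/PS for treated, 1/(1 - PS) otherwise.
iptw = [1.0 / p if t == 1 else 1.0 / (1.0 - p) for p, t in zip(ps, treated)]

smd_before = standardized_mean_difference(age, treated, [1.0] * n)
smd_after = standardized_mean_difference(age, treated, iptw)
print(f"SMD for age before weighting: {smd_before:.3f}")
print(f"SMD for age after weighting:  {smd_after:.3f}")
```

The standardized mean difference is used here only as a simple balance diagnostic; in a real study the covariates, the PS model, and the way the PS enters the analysis would be chosen to suit the study objectives.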