This thesis uses a statistical learning method to construct a novel measure for post-hoc matching between a treatment group and a candidate set. Post-hoc matching is a necessary step in many non-randomized observational studies and arises in fields as diverse as economics, medicine, and marketing.
Post-hoc matching has been in use for many years, and a variety of methods exist. A common measure for matching the two groups, the propensity score, can be estimated in several ways; one recent approach, introduced in 2013, estimates it with random forests.
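As a point of reference, propensity-score matching with a random forest can be sketched as follows. This is a minimal illustration and not the 2013 method itself: the use of scikit-learn, the out-of-bag probability as the score, the nearest-score matching rule with replacement, and the function names `forest_propensity` and `match_on_score` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def forest_propensity(X, treated, n_trees=200, seed=0):
    """Estimate P(treated | X) with a random forest (illustrative choice).

    Assumption: out-of-bag class probabilities are used as the score, so a
    subject's own label does not feed its estimate. Both groups must be
    present in `treated`.
    """
    forest = RandomForestClassifier(
        n_estimators=n_trees, oob_score=True, random_state=seed
    )
    forest.fit(X, treated.astype(int))
    # Column 1 corresponds to class label 1 (treated).
    return forest.oob_decision_function_[:, 1]


def match_on_score(score, treated):
    """Pair each treated subject with the control whose score is closest.

    Assumption: matching is done with replacement, so one control may be
    matched to several treated subjects.
    """
    treated_idx = np.flatnonzero(treated)
    control_idx = np.flatnonzero(~treated)
    gaps = np.abs(score[treated_idx][:, None] - score[control_idx][None, :])
    nearest = control_idx[gaps.argmin(axis=1)]
    return dict(zip(treated_idx.tolist(), nearest.tolist()))
```

Here `treated` is expected to be a boolean NumPy array with one entry per subject, and `X` the matrix of covariates observed before treatment selection.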
The method introduced in this work uses a random forest to develop an alternative measure to the propensity score. The new measure, the proximity matrix method, is intuitive and potentially captures more of the similarity between subjects. To compare the propensity score method with the novel post-hoc matching method, data sets are generated that reflect observational studies under various assumptions about treatment selection. Experiments are then conducted to evaluate the average treatment effect between the matched treatment and control groups. The empirical analysis shows promising results for the proximity matrix method. In particular, the technique yields superior results when treatment selection follows complex rules, namely a non-linear model, and when the bag is used to estimate the proximity matrix.
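The idea behind a random forest proximity can be sketched as follows: two subjects are considered similar when they land in the same leaf of many trees. This is a minimal sketch under stated assumptions, not the procedure used in the thesis. It assumes scikit-learn, a forest already fitted to predict treatment assignment, proximities computed over all trees (no distinction between in-bag and out-of-bag samples), and matching with replacement; the names `proximity_matrix`, `match_by_proximity`, and `matched_effect` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def proximity_matrix(forest, X):
    """Fraction of trees in which each pair of subjects shares a leaf."""
    leaves = forest.apply(X)  # shape (n_subjects, n_trees), leaf index per tree
    n_subjects, n_trees = leaves.shape
    prox = np.zeros((n_subjects, n_subjects))
    for t in range(n_trees):
        column = leaves[:, t]
        # Add 1 for every pair that falls in the same leaf of tree t.
        prox += column[:, None] == column[None, :]
    return prox / n_trees


def match_by_proximity(prox, treated):
    """Pair each treated subject with its most proximate control.

    Assumption: matching is done with replacement.
    """
    treated_idx = np.flatnonzero(treated)
    control_idx = np.flatnonzero(~treated)
    block = prox[np.ix_(treated_idx, control_idx)]
    best = control_idx[block.argmax(axis=1)]
    return dict(zip(treated_idx.tolist(), best.tolist()))


def matched_effect(outcome, pairs):
    """Mean outcome difference over matched (treated, control) pairs."""
    diffs = [outcome[t] - outcome[c] for t, c in pairs.items()]
    return float(np.mean(diffs))
```

A typical call would fit `RandomForestClassifier` on the covariates with the boolean treatment indicator as the target, pass the fitted forest to `proximity_matrix`, and feed the resulting pairs to `matched_effect` together with the observed outcomes.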
This study demonstrates the significant potential of the novel method for researchers and practitioners who need to match candidates to a test set in order to estimate the average treatment effect in an observational study where the relationship with the initial treatment selection is unknown and possibly complex and multivariate.