Random forest algorithms for evaluating offensive efficiency in football
Football is a team sport that has increasingly embraced data-backed decision-making over the past decade. In this study, we focus on offensive efficiency in football and use modified random forest algorithms to construct an expected goals metric. Because football is a low-scoring game, only a small fraction of attempted shots are converted into goals, so the shot data are heavily imbalanced. The modified methods we employ account for this imbalance. We find that a random forest quantile classifier outperforms the other algorithms considered in terms of predictive accuracy. The constructed metric assigns each shot a probability of scoring given its characteristics, including individual skill, spatio-temporal data and the defensive ability of the opposing team. We then suggest applications of the metric and discuss how it could be a useful tool for football clubs.
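To make the imbalance issue concrete, the following minimal sketch contrasts the default 0.5 decision threshold with a quantile-style threshold set to the observed goal rate. It assumes the quantile classifier follows the common formulation in which a shot is labelled a goal when its predicted probability reaches the minority-class prevalence; the study's exact procedure may differ. The shot data, the probabilities (which stand in for random forest outputs) and the function names `goal_rate`, `classify` and `expected_goals` are all hypothetical.

```python
from typing import List, Sequence


def goal_rate(labels: Sequence[int]) -> float:
    """Share of shots that ended in a goal (the minority-class prevalence)."""
    return sum(labels) / len(labels)


def classify(probs: Sequence[float], threshold: float) -> List[int]:
    """Label a shot as a goal (1) when its scoring probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def expected_goals(probs: Sequence[float]) -> float:
    """Expected goals for a set of shots: the sum of their scoring probabilities."""
    return sum(probs)


# Hypothetical data: 10 shots, 1 goal. The probabilities are made up and
# stand in for the scoring probabilities a fitted random forest would return.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
probs = [0.02, 0.05, 0.03, 0.08, 0.35, 0.12, 0.04, 0.06, 0.01, 0.09]

# Default rule: with rare goals, almost no shot reaches 0.5.
default_pred = classify(probs, 0.5)

# Quantile-style rule (assumed formulation): threshold at the observed goal rate.
quantile_pred = classify(probs, goal_rate(labels))

print("goal rate:      ", goal_rate(labels))
print("default labels: ", default_pred)
print("quantile labels:", quantile_pred)
print("expected goals: ", round(expected_goals(probs), 2))
```

With these made-up numbers the default rule never predicts a goal, while the prevalence threshold flags the two highest-probability shots. The expected goals total does not depend on either threshold, since it uses the probabilities directly.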