Bayesian analysis of Premier League football

In this post we are going to look at some football statistics. In particular, we will examine English football, the Premier League, using Bayesian statistics with Stan. If you have no idea what Bayesian statistics is, you can read my introductory post on it. Otherwise this post shouldn’t be a difficult read. All right, let’s get to it. First, we need some data. I will use all the matches from the Premier League seasons 16/17 and 17/18 (which is still ongoing at the time of the writing). [Read More]

Summer Olympics: the countries that beat the expectations

In this post we take a look at the summer Olympics and try to see which countries performed substantially differently than was expected of them. We will look at the Olympics from 1964 through to 2008. For each year, we will run a predictive model, trying to predict the number of medals a country wins, using selected datasets that are available before each of the Olympics. We will see that this model performs well out of sample and this model will be what we expect. [Read More]

Causal impact and Bayesian structural time series

Causal impact is a tool for estimating the impact of a one time action. As an example (which we will actually look at the data) consider the BP oil spill in 2010. Let’s say you want to evaluate the impact that this had on BP stocks. Typically with questions like this, we would like to be able to collect multiple samples from a control group and a test group. As this is not possible we would have to try something else. [Read More]

Bayes of our lives: a gentle introduction to Bayesian statistics

Bayesian statistics is an interpretation of statistics. It is used to help explain the frequentist methods and can give much more information. Even if you have never really learnt about Bayesian statistics, I guarantee you have encountered it in some way. Bayes, it’s everywhere In this post, we will only consider a linear model: \(y = \beta x + \epsilon\) where \(\epsilon\) is a standard normal. Suppose we have gathered some data \((Y=\{y_i\}_{i=1}^n, X=\{\{x_{k,i}\}_{k=1}^p\}_{i=1}^n)\), which consist of \(p\) predictors and \(n\) observations, and we wish to fit a linear model. [Read More]

Analysis of calving of JH Dorrington Farm Part III

Drum roll please. This is the long awaited third and final part of the analysis from JH Dorrington Farm. If you have not already, read the first part and second part. Leaving where I left off, almost all of our models fit pretty well except for CART, so in what follows, I will ignore the CART model. That leaves us with linear regression models and MARS. MARS essentially builds a piecewise linear model using hinges. [Read More]