R - Batı Şengül

Variational inference, the art of approximate sampling

Posted on 2018, July 21 | Batı Şengül

In the spirit of looking at fancy word topics, this post is about variational inference. Suppose you granted me one superpower and I chose the ability to sample from any distribution in a fast and accurate way. Now, you might think that’s a crappy superpower, but that basically enables me to fit any model I want and provide uncertainty estimates. To make the problem concrete, let’s suppose you are trying to sample from a distribution \(p(x)\). [Read More]

Python R Edward Bayesian Kullback-Leibler distance Statistics

Spike and slab: Bayesian linear regression with variable selection

Posted on 2018, June 20 | Batı Şengül

Spike and slab is a Bayesian model for simultaneously picking features and doing linear regression. Spike and slab is a shrinkage method, much like ridge and lasso regression, in the sense that it shrinks the “weak” beta values from the regression towards zero. Don’t worry if you have never heard of any of those terms; we will explore all of them using Stan. If you don’t know anything about Bayesian statistics, you can read my introductory post before reading this one. [Read More]

Bayesian Linear regression PyMC3 Stan R Statistics

Summer Olympics: the countries that beat the expectations

Posted on 2018, March 19 | Batı Şengül

In this post we take a look at the Summer Olympics and try to see which countries performed substantially differently from what was expected of them. We will look at the Olympics from 1964 through 2008. For each year, we will run a predictive model that tries to predict the number of medals a country wins, using selected datasets that are available before each Olympics. We will see that this model performs well out of sample, and its predictions will serve as our expectation. [Read More]

Analysis Modelling R Statistics

Causal impact and Bayesian structural time series

Posted on 2018, February 3 | Batı Şengül

Causal impact is a tool for estimating the impact of a one-time action. As an example (for which we will actually look at the data), consider the BP oil spill in 2010. Let’s say you want to evaluate the impact that this had on BP stocks. Typically, with questions like this, we would like to be able to collect multiple samples from a control group and a test group. As this is not possible, we have to try something else. [Read More]

Bayesian Modelling Statistics R

Analysis of calving of JH Dorrington Farm Part III

Posted on 2017, October 10 | Batı Şengül

Drum roll, please. This is the long-awaited third and final part of the analysis from JH Dorrington Farm. If you have not already, read the first part and second part. Picking up where I left off, almost all of our models fit pretty well except for CART, so in what follows, I will ignore the CART model. That leaves us with linear regression models and MARS. MARS essentially builds a piecewise linear model using hinges. [Read More]

Analysis Modelling R Statistics Dorrington