FizzBuzz with neural networks and NALU

FizzBuzz is one of the most well-known interview questions. The problem is stated as: "Write the numbers from 1 to n, replacing any number divisible by 3 with Fizz, any number divisible by 5 with Buzz, and any number divisible by both 3 and 5 with FizzBuzz." For n = 10, the program should output 1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz. A while back, there was an infamous post where the author claimed to have solved this problem in an interview using TensorFlow. [Read More]
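For reference, here is a minimal sketch of the conventional (non-neural) solution to the problem as stated above. The function name `fizzbuzz` is just illustrative, and the count is assumed to start at 1 so that it matches the example output.

```python
def fizzbuzz(n):
    """Return the FizzBuzz sequence for 1..n as a list of strings."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:        # divisible by both 3 and 5
            out.append("FizzBuzz")
        elif i % 3 == 0:       # divisible by 3 only
            out.append("Fizz")
        elif i % 5 == 0:       # divisible by 5 only
            out.append("Buzz")
        else:
            out.append(str(i))
    return out


if __name__ == "__main__":
    print(", ".join(fizzbuzz(10)))
```

The check for 15 comes first because a number divisible by both 3 and 5 would otherwise be caught by the Fizz branch.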

From Zero To State Of The Art NLP Part II - Transformers

Welcome to part two of the two-part crash course in state-of-the-art natural language processing. This part goes through the transformer architecture from Attention Is All You Need. If you haven’t done so already, read the first part, which introduces attention mechanisms. This post is all about transformers and assumes you know attention mechanisms.

[Read More]

From Zero To State Of The Art NLP Part I - Attention mechanism

There have been some really amazing advances in natural language processing (NLP) in the last couple of years. Back in November 2018, Google released BERT (https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html), which is based on the attention mechanisms in Attention Is All You Need. In this two-part series, I will assume you know nothing about NLP but have some understanding of neural networks, and take you from start to finish in understanding how transformers work. Natural language processing is the art of using machine learning techniques to process language. [Read More]

Beating the odds: arbitrage in sports betting

In this post we are going to look at sports betting and how to make guaranteed money. Let’s take the example of football. For each match, there is a home team that hosts the match in its stadium and an away team. There are three outcomes to each match: a home win (away loss), a home loss (away win) and a draw. The bookie provides odds on each outcome. [Read More]
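The excerpt stops before the method, so the following is only a sketch of the standard arbitrage check, not necessarily the approach taken in the full post. It assumes decimal odds (a winning stake of 1 returns `odds`), assumes you can pick the best available odds for each of the three outcomes, and uses the hypothetical function name `arbitrage_stakes`. An arbitrage exists when the inverse odds sum to less than 1.

```python
def arbitrage_stakes(odds, total_stake=100.0):
    """Split total_stake across outcomes so every outcome pays the same.

    odds: decimal odds for each outcome, e.g. [home win, draw, away win].
    Returns (stakes, payout), or None if there is no arbitrage.
    """
    inverse = [1.0 / o for o in odds]
    margin = sum(inverse)            # < 1 means guaranteed profit
    if margin >= 1.0:
        return None
    # Stake in proportion to the inverse odds so that stake * odds is
    # the same for every outcome.
    stakes = [total_stake * inv / margin for inv in inverse]
    payout = total_stake / margin    # identical for whichever outcome occurs
    return stakes, payout


if __name__ == "__main__":
    # Illustrative odds only, not real market data.
    result = arbitrage_stakes([3.1, 3.6, 3.2])
    if result is not None:
        stakes, payout = result
        print([round(s, 2) for s in stakes], round(payout, 2))
```

In practice the odds from a single bookie almost always have inverse odds summing to more than 1, which is why the sketch assumes the best price for each outcome may come from a different bookie.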

Variational inference, the art of approximate sampling

In the spirit of looking at fancy word topics, this post is about variational inference. Suppose you granted me one superpower and I chose the ability to sample from any distribution quickly and accurately. Now, you might think that’s a crappy superpower, but it basically enables me to fit any model I want and provide uncertainty estimates. To make the problem concrete, let’s suppose you are trying to sample from a distribution \(p(x)\). [Read More]