If you like coding, statistics, and non-trivial problems, you have found the right place. In this post, I will show one way to solve this type of problem. If you don’t remember what a Monte Carlo simulation is (don’t be shy), you might want to check a previous post that I wrote last summer, which gives an intro to the world of Monte Carlo on Apps Script and Google Compute Engine.
Monte Carlo experiments (simulations) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. In most cases, we run our simulations many times over in order to obtain the distribution of an unknown probabilistic entity. This tool is often used in physical and mathematical problems and is most useful when it is difficult or impossible to obtain a closed-form expression, or infeasible to apply a deterministic algorithm.
In the real world, we see it being mainly used in three distinct problem classes:
* Optimization.
* Numerical integration.
* Generation of draws from a probability distribution.
OK, there is no time. The game starts in two days.
Monte Carlo simulations tend to follow a particular pattern. Here is how we will use this tool:
1. Define a domain of possible inputs. In our case, the odds between the states (=games) and their results. This part is currently hard-coded; later we should move it so that it is driven from the sheet or from a form that users can change.
2. Generate inputs randomly from a probability distribution over the domain. It is easy to use Math.random and compare each draw against the odds we decided on for each step in the game. If the score is 1:0 for Brazil, we can say that they have a 60% chance to finish the game at 2:0 or 2:1 because they have the momentum. Of course, you can play with these numbers based on your own assumptions.
3. Perform a deterministic computation on the inputs. Put simply, we run over the data (=inputs) and, based on our state machine, decide which team won in each case. We can run it 100, 1,000, 10,000 times, and so on. If you wish to run it for ‘real’ with many more iterations, please use Node.js, because the performance will be much better and you won’t be limited by the length of the sheet.
4. Aggregate the results. Count how many times each team won and derive the odds based on our inputs.
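The four steps above can be sketched in a few lines of plain JavaScript. This is only a minimal illustration, not the code from the sheet: the `odds` table, the 50/50 probabilities, and the function names `pickWinner` and `simulate` are all assumptions made for the example. The `rng` parameter defaults to `Math.random` and exists only so the draw can be swapped out for testing.

```javascript
// Step 1: domain of inputs. Hypothetical win probabilities (they should sum to 1).
const odds = [
  { team: 'BRA', p: 0.5 },
  { team: 'GER', p: 0.5 },
];

// Steps 2 and 3: draw a random number in [0, 1) and map it to a winner.
// Given the same random number, the mapping is fully deterministic.
function pickWinner(odds, rng = Math.random) {
  const r = rng();
  let cumulative = 0;
  for (const { team, p } of odds) {
    cumulative += p;
    if (r < cumulative) return team;
  }
  // Guard against floating-point rounding when the probabilities sum to ~1.
  return odds[odds.length - 1].team;
}

// Step 4: repeat many times and aggregate the results.
function simulate(odds, runs, rng = Math.random) {
  const wins = {};
  for (const { team } of odds) wins[team] = 0;
  for (let i = 0; i < runs; i++) {
    wins[pickWinner(odds, rng)] += 1;
  }
  const share = {};
  for (const team of Object.keys(wins)) share[team] = wins[team] / runs;
  return { wins, share };
}

// Usage: the estimated shares get closer to the input odds as `runs` grows.
console.log(simulate(odds, 10000).share);
```

With more iterations, the aggregated shares settle closer to the probabilities you fed in, which is a handy sanity check before moving to a richer model.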
This is the heart of our simulation. You can play with the numbers here to match whatever assumptions you think are best.
I got some of the odds by looking at what the smart guys over at fivethirtyeight.com did with their SPI and calculations. However, in order to keep it simple, I’ve just picked nice, round numbers to show the concept. This diagram shows the states of the BRA-GER semi-final with the odds per stage.
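To make the idea of states and odds per stage concrete, here is a small sketch of such a state machine in JavaScript. It is not the table from the sheet: the state names, the `TRANSITIONS` object, and the `playGame` function are hypothetical, and the only number taken from this post is the 60% chance for the team that scores first. Everything else (the 50/50 first goal, the mirrored 60/40 for Germany) is an assumption for illustration.

```javascript
// Each state maps to its possible next states and their probabilities.
// States without an entry ('BRA', 'GER') are terminal: that team won.
const TRANSITIONS = {
  '0:0': [{ next: '1:0', p: 0.5 }, { next: '0:1', p: 0.5 }], // who scores first (assumed 50/50)
  '1:0': [{ next: 'BRA', p: 0.6 }, { next: 'GER', p: 0.4 }], // Brazil leads: 60% to finish it
  '0:1': [{ next: 'GER', p: 0.6 }, { next: 'BRA', p: 0.4 }], // Germany leads (assumed mirror)
};

// Walk the state machine from `start` until a terminal state is reached.
// Returns the winner and the path of states that led to it.
function playGame(transitions, start = '0:0', rng = Math.random) {
  const path = [start];
  let state = start;
  while (transitions[state]) {
    const options = transitions[state];
    const r = rng();
    let cumulative = 0;
    // Default to the last option to guard against floating-point rounding.
    let next = options[options.length - 1].next;
    for (const option of options) {
      cumulative += option.p;
      if (r < cumulative) {
        next = option.next;
        break;
      }
    }
    state = next;
    path.push(state);
  }
  return { winner: state, path };
}

// Usage: one simulated game, with the path that produced the result.
console.log(playGame(TRANSITIONS));
```

Keeping the odds in one table like this makes it easy to change a single number and rerun, and the returned `path` plays the same role as the ‘path’ column logged in the sheet.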
This Google Sheet contains the simulator and the code.
You can set the number of times the simulation runs by changing cell E11. After that, a new menu, ‘WorldCup Simulator’, lets you run it. You can see the results on the sheet, as I’m using it as our logger (for now). For each game, you can see the result and the ‘path’ or the special game that led to it. At the top of this table you can see the aggregate results, which give us a nice summary.
Please feel free to copy it and open the code with ‘Tools’ -> ‘Script editor’. You will be able to change the sheets and the odds and run it with your own ideas.
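If you want a feel for how the sheet and the script talk to each other before opening the real code, here is a rough Apps Script sketch. It is not the script that ships with the sheet: the menu item label, the function names `runSimulation` and `simulateOneGame`, and the 50/50 odds are hypothetical. Only the menu name and the E11 cell come from the description above. Unlike the real sheet, this sketch appends its summary row at the bottom.

```javascript
// Apps Script calls onOpen automatically when the spreadsheet is opened.
function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu('WorldCup Simulator')
    .addItem('Run simulation', 'runSimulation') // hypothetical label and handler name
    .addToUi();
}

// Hypothetical stand-in for one simulated game with 50/50 odds.
function simulateOneGame() {
  return Math.random() < 0.5 ? 'BRA' : 'GER';
}

function runSimulation() {
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  // The number of iterations is read from cell E11.
  const runs = Number(sheet.getRange('E11').getValue());
  const wins = { BRA: 0, GER: 0 };
  for (let i = 0; i < runs; i++) {
    const winner = simulateOneGame();
    wins[winner] += 1;
    sheet.appendRow([i + 1, winner]); // the sheet doubles as a simple logger
  }
  sheet.appendRow(['Total', 'BRA: ' + wins.BRA, 'GER: ' + wins.GER]);
  return wins;
}
```

Appending one row per game is simple but slow for large runs; collecting the rows in an array and writing them in a single call would likely scale better.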
- Wikipedia – Monte_Carlo_method
- Monte Carlo Simulation On Google Compute-Engine
- If you want to have all the results (so far) from the #worldcup – This post will give you the ability to fetch and work with them.