Banks, financial institutions, governments and public companies all want to see the future. Not in a crystal ball and tarot cards kind of way, but in a way that allows them to forecast economic events, share prices, voting patterns, disease spread and so on. The ways they do this vary, and usually involve a lot of expensive hardware and even more expensive people, but many of the underlying methods used are reasonably understandable. They can get a little involved though, so this is the first in a series of articles about them.
I’ll start with the basics, and then get into the more interesting stuff...
When you toss a coin, the result of you tossing it isn't affected by whether you started with heads or tails. Either way you have a 50% chance of the toss resulting in heads, and a 50% chance of tails. Let's draw that as a probability tree:
The tree gets quite unwieldy, and big, quite quickly, so let's steal from software development and draw a diagram based on a state diagram of what can happen. Here the states are represented by circles, and instead of saying what causes the transition between the states, let's just put the probability of each transition between states:
This diagram is smaller, much less complex and shows even more information, because there's no limit on the number of coin tosses you can simulate with it, and we just see the possible transitions between outcomes, or “states”.
This diagram actually represents a very basic Markov Chain: a model containing a set of transitions or events, in this case coin tosses, which are determined by some probability distribution, and a set of outcomes, or states, which result from those transitions. Tossing the coin is an event, and the outcome of that coin toss is the new state. Markov chains aren’t necessarily complex, although like anything else in mathematics or engineering, they can get complex quite quickly.
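If you prefer code to circles and arrows, here is a minimal Python sketch of the coin chain. The names `COIN_CHAIN`, `next_state` and `simulate` are my own for illustration, and a dictionary of dictionaries is just one convenient way to store the arrows and their probabilities.

```python
import random

# Each state maps to the probabilities of the states that can follow it.
# For a fair coin, both arrows out of each state carry 0.5.
COIN_CHAIN = {
    "heads": {"heads": 0.5, "tails": 0.5},
    "tails": {"heads": 0.5, "tails": 0.5},
}


def next_state(chain, current, rng=random):
    """Pick the next state using the probabilities leaving the current state."""
    states = list(chain[current])
    weights = [chain[current][state] for state in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(chain, start, steps, rng=random):
    """Return the starting state followed by `steps` simulated transitions."""
    history = [start]
    for _ in range(steps):
        history.append(next_state(chain, history[-1], rng))
    return history


if __name__ == "__main__":
    print(simulate(COIN_CHAIN, "heads", 10))
```

Because the diagram has no end state, `steps` can be as large as you like, which matches the point that there is no limit on the number of tosses.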
If you want something a little more complex, here's a diagram showing the probability of the result of each throw of a single six-sided die:
Again, the probabilities of any given result don't change depending on the last value we threw.
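One way to see that independence in code is to build the die's table of transition probabilities and check that every row is the same. This is only a sketch, and `DIE_CHAIN` and `rows_are_identical` are illustrative names of my own.

```python
from fractions import Fraction

FACES = range(1, 7)

# Whatever the last throw was, each face is equally likely on the next throw.
DIE_CHAIN = {
    last: {nxt: Fraction(1, 6) for nxt in FACES}
    for last in FACES
}


def rows_are_identical(chain):
    """True when the next-state probabilities do not depend on the current state."""
    rows = list(chain.values())
    return all(row == rows[0] for row in rows)


if __name__ == "__main__":
    print(rows_are_identical(DIE_CHAIN))
    # Each row should also be a complete probability distribution.
    print(all(sum(row.values()) == 1 for row in DIE_CHAIN.values()))
```

Using `Fraction` keeps the sixths exact, so the rows add up to exactly 1 rather than something very close to it.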
But what if they did?
So far we’ve encountered situations where each event is independent, so let’s take a look at a situation where the probability of the next result depends on the latest result. I’m originally from the UK, and now I live in Canada, so when I suggest we take a look at the weather you probably shouldn’t be surprised.
This totally-made-up set of probabilities shows, among other things, that:
- The chance of one sunny day following another is 0.75
- The chance of a rainy day following a sunny day is 0.2
- The chance of one rainy day following another is 0.6
- The chance of a sunny day following a rainy day is 0.3
- and so on.
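The four probabilities listed above can be written down as a small lookup table. The sketch below stores only those four values; the rest of the diagram's arrows are left out rather than guessed, and the names `KNOWN_TRANSITIONS`, `transition_probability` and `unlisted_probability` are my own.

```python
# Keys are (today, tomorrow) pairs; values are the probabilities from the list.
KNOWN_TRANSITIONS = {
    ("sunny", "sunny"): 0.75,
    ("sunny", "rainy"): 0.2,
    ("rainy", "rainy"): 0.6,
    ("rainy", "sunny"): 0.3,
}


def transition_probability(today, tomorrow):
    """Return P(tomorrow | today), or None if the list does not quote it."""
    return KNOWN_TRANSITIONS.get((today, tomorrow))


def unlisted_probability(today):
    """Probability of moving from `today` to weather the list does not mention.

    The probabilities on the arrows leaving a state must add up to 1, so
    whatever is not listed belongs to the other kinds of weather.
    """
    listed = sum(p for (src, _), p in KNOWN_TRANSITIONS.items() if src == today)
    return 1.0 - listed


if __name__ == "__main__":
    print(transition_probability("rainy", "sunny"))
    print(round(unlisted_probability("sunny"), 2))
    print(round(unlisted_probability("rainy"), 2))
```

The leftover amounts (about 0.05 after a sunny day and 0.1 after a rainy one) are the "and so on": the chance of the weather turning into something other than sunny or rainy.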
Of course, as well as the next day’s weather, we can use this diagram to work out the chances of what the weather will be on a day in the future, given the current weather. For example, if you know today is rainy, what are the chances that it will be sunny tomorrow and the next day?
To start with, let’s colour today’s weather red, so that we can see where we started:
From the diagram, we can see that the chance of it being sunny tomorrow is 0.3 or 30%:
And then, the chance of the day after that being sunny is 0.75 or 75%:
Then we can calculate the overall probability by multiplying the individual values:
P = 0.3 × 0.75
P = 0.225 or 22.5%
So, the probability of two sunny days following a rainy day is quite low, at 22.5%.
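The same multiplication works for any specific sequence of days: multiply the probability on each arrow you follow. Here is a rough sketch, assuming Python 3.8 or later for `math.prod`; `path_probability` is an illustrative name, and the table contains only the transitions quoted earlier.

```python
from math import prod

# Only the transitions quoted in the text, keyed by (today, tomorrow).
KNOWN_TRANSITIONS = {
    ("sunny", "sunny"): 0.75,
    ("sunny", "rainy"): 0.2,
    ("rainy", "rainy"): 0.6,
    ("rainy", "sunny"): 0.3,
}


def path_probability(path):
    """Multiply the transition probabilities along a sequence of daily states.

    The first entry is today's (known) weather, so it contributes no factor.
    """
    return prod(
        KNOWN_TRANSITIONS[(today, tomorrow)]
        for today, tomorrow in zip(path, path[1:])
    )


if __name__ == "__main__":
    # Rainy today, then sunny tomorrow, then sunny the day after.
    print(f"{path_probability(['rainy', 'sunny', 'sunny']):.3f}")  # 0.225
```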
You can also work out the overall probability of the day after tomorrow being sunny, given that today is raining and that you don’t care what the weather is tomorrow. This is a little more complicated, because you need to account for every possible type of weather tomorrow, and then the chances of it being sunny the following day.
Again, starting from today being rainy:
Tomorrow’s weather could be anything; we’ve already said we don’t care but we still have to account for it:
It’s probably worth making a simple table to show the probabilities so far:
Now we need to work out the probabilities of the day after tomorrow being sunny in each case:
Again it’s probably easier to add these probabilities as a new column to the table:
[Table: new column "Probability of Sunny Weather The Day After"]
Then, like we did above, we need to multiply the probabilities to get the probability of each outcome.
[Table: columns "Probability of Sunny Weather The Day After" and "Probability of Sunny Weather"]
And then we add the probabilities together to get the overall probability:
P = 0.18 + 0.225 + 0.036 + 0.006
P = 0.447 or 44.7%
So, if today is rainy, the probability of it being sunny the day after tomorrow is 44.7%.
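In code, this is just "multiply along each route, then add the routes together". The sketch below spells out the first two routes using probabilities quoted earlier in the article; the last two products are copied from the sum above as they stand, because their individual factors are not restated in the text.

```python
# One entry per possible kind of weather tomorrow, given that today is rainy:
# P(tomorrow's weather | rainy today) * P(sunny the day after | tomorrow's weather)
route_probabilities = [
    0.6 * 0.3,   # rainy -> rainy -> sunny
    0.3 * 0.75,  # rainy -> sunny -> sunny
    0.036,       # a third kind of weather tomorrow (product taken from the sum)
    0.006,       # a fourth kind of weather tomorrow (product taken from the sum)
]

# Tomorrow can only be one kind of weather, so the routes are mutually
# exclusive and their probabilities can simply be added.
total = sum(route_probabilities)

if __name__ == "__main__":
    print(f"{total:.3f}")  # 0.447
```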
You’ve probably noticed that this got quite complex and laborious quite quickly. Imagine if you were asked to calculate the probability of it snowing in 7 days’ time, or just what the weather would be like then; there are a huge number of possible options for each of those. There are ways to represent these odds using matrices in mathematics, and that can simplify things, but then you have to learn about matrix multiplication, and it’s been a long time since I did that. From my totally biased point of view this seems like the sort of thing that could be done using a computer, and it is. I’ll dig into it a little more later.
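As a small taste of what that might look like, here is a minimal pure-Python sketch that pushes a set of probabilities forward one day at a time. To keep it short it uses a HYPOTHETICAL two-state chain, `EXAMPLE_CHAIN`, whose numbers are invented for this snippet and are not the ones from the weather diagram; `step` and `forecast` are also names of my own.

```python
def step(distribution, chain):
    """Advance a probability distribution over states by one day."""
    result = {state: 0.0 for state in chain}
    for today, p_today in distribution.items():
        for tomorrow, p_move in chain[today].items():
            # Add the contribution of the route today -> tomorrow.
            result[tomorrow] += p_today * p_move
    return result


def forecast(start, chain, days):
    """Return the probability of each state `days` days after a known start."""
    distribution = {state: (1.0 if state == start else 0.0) for state in chain}
    for _ in range(days):
        distribution = step(distribution, chain)
    return distribution


# HYPOTHETICAL numbers for illustration only; each row adds up to 1.
EXAMPLE_CHAIN = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

if __name__ == "__main__":
    for state, p in forecast("rainy", EXAMPLE_CHAIN, 7).items():
        print(f"{state}: {p:.3f}")
```

Each call to `step` does the same multiply-and-add bookkeeping as the tables above, which is also what a matrix multiplication would do; repeating it seven times gives the seven-day outlook without listing every possible route by hand.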
So far I’ve introduced Markov Chains, and shown you some examples of how you can use them to calculate the probabilities of future events, or at least coin tosses, die throws and the weather. Hopefully they don’t appear as complicated as you might have thought.
There’s a lot more to cover, including the answers to questions like:
- Those weather probabilities looked totally made-up. How can I get some REAL data and make a more realistic model?
- These examples are pretty simple. Do you have anything more interesting?
- How can you do this sort of thing in Python, C# or maybe some other language?
- I’m convinced that this is what I want to do with my life; help me get a job doing it!
I can’t help you with the last question, but I’ll do what I can with the rest.