Day #38 : The Poisson Distribution – 365 DoA
So Poisson, a famous French smart guy born in the late 1700s, may be a name you are familiar with already. For those who are not, the name is NOT pronounced like the word poison; it’s French (roughly "pwah-SOHN"), so you need to say it with a certain amount of flair to get it right. In any case, as with many of these things, he came up with the distribution, so it is named after him.
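We can get into why this particular distribution is interesting later, but first let’s look at the formula and see what we get to work with today. The Poisson distribution formula (strictly a probability mass function, since it deals with counts, but we’ll keep calling it a pdf here) looks like this guy here:

$$P(x, \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$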
This may seem vaguely familiar; we dealt with something similar when we looked at the exponential distribution. That pdf was also built around λ, the average number of events in a time interval, although it described the waiting time between events rather than a count of them. Furthermore, it looks somewhat similar (if you squint really hard) when we rewrite the Poisson pdf like this:
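The original image of the rewritten formula is not available here, so what follows is a guess at the intended form rather than the author's exact expression. Pulling the exponential term out to the side makes the resemblance easier to spot. The exponential pdf is shown next to it for comparison, with t as an assumed symbol for the waiting time:

$$P(x, \lambda) = \frac{\lambda^{x}}{x!}\, e^{-\lambda} \qquad \text{compared with} \qquad f(t; \lambda) = \lambda\, e^{-\lambda t}$$

Both are a λ-dependent factor multiplied by a decaying exponential in λ, which is about as far as the squinting goes.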
So we know what λ is, but what is x in this case? Well, x is the number of events we are interested in within a particular interval, so it can only be a whole number: 0, 1, 2, and so on. Namely, when we say P(x, λ), you should read it as: what is the probability of exactly x events occurring in an interval when the mean number of times that event occurs per interval is λ?
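To make that reading of P(x, λ) concrete, here is a minimal sketch of the formula in Python. The function name `poisson_pmf` and the choice λ = 2 are just for illustration; nothing here comes from a particular library beyond the standard `math` module.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the mean count per interval is lam."""
    if x < 0:
        return 0.0  # a negative number of events is impossible
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Sanity check: summed over (enough of) the possible counts, the
# probabilities should add up to 1. Fifty terms is plenty for lam = 2.
total = sum(poisson_pmf(x, 2.0) for x in range(50))
print(f"sum of P(x, 2) for x = 0..49: {total:.6f}")
```

Because x is a count, we sum over whole numbers rather than integrate, which is the practical difference between this and the continuous pdfs from earlier posts.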
What makes this distribution useful is something that we will talk about next time. Not to worry though, we will give you a hint: look at what happens as λ gets larger, because the shape it takes on should look very familiar to you. Now, enough with the hints; let’s look at what this is used for and (maybe) an example. If we don’t get to an example this post, we will later. Also, if you recall, we talked about Bayes’ theorem in the last post. There is a reason for that, I promise, and it will all come together in the end. Just keep that information fresh(ish), because you’ll need it later.
Okay, so what is Poisson good for? It turns out, a lot of things. Plenty of practical problems can be modeled with this pdf. Let’s take a look at a few examples.
We can use the Poisson pdf to model the number of calls coming into a call center in a given period. In biology, we can use the pdf to determine the probability that we will find a certain number of mutations per unit length in a strand of DNA. It also has applications in earthquake modeling, finance, insurance, and chemistry. Then there is one interesting and odd example (for our day and age): in a book by Ladislaus Bortkiewicz, he used the Poisson pdf to model the number of soldiers killed by horse-kicks each year in each corps of the Prussian army. For our last example, if you’re a beer drinker, William Gosset used this pdf to model yeast cell counts when brewing Guinness beer.
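Okay, let’s have some fun with it. Let’s say you are a real estate agent and you want to determine the probability that you will sell exactly three homes today, when you sell on average two homes per day. Well, in this case λ = 2 and our x = 3. All we need to do now is plug in the values!

$$P(3, 2) = \frac{2^{3} e^{-2}}{3!} = \frac{(2 \cdot 2 \cdot 2)\, e^{-2}}{3 \cdot 2 \cdot 1} = \frac{8\, e^{-2}}{6} \approx 0.18$$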
You’ll notice that I expanded out the power of λ and the factorial. I just wanted to show the step, but the result is still the same. Unfortunately, the result is that the probability you will sell exactly three homes is about 18%, so not great. Then again, this makes sense: you aren’t asking for the probability that you will sell at least three homes, which is a whole different question. You just want to know the chance of selling exactly three, and that chance doesn’t look good.
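If you want to check the arithmetic, or see how different the "at least three" question is, here is a short sketch using the same numbers (λ = 2, x = 3). The helper name `poisson_pmf` is just for illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the mean count per interval is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 2.0  # average homes sold per day, from the example above

# Exactly three homes: plug straight into the formula.
p_exactly_three = poisson_pmf(3, lam)

# At least three homes: 1 minus the probability of selling 0, 1, or 2.
p_at_least_three = 1.0 - sum(poisson_pmf(k, lam) for k in range(3))

print(f"P(exactly 3)  = {p_exactly_three:.4f}")   # 0.1804
print(f"P(at least 3) = {p_at_least_three:.4f}")  # 0.3233
```

So the "at least three" version comes out to roughly 32%, noticeably higher than the 18% for exactly three, because it also counts the days with four, five, or more sales.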
Okay, so we got to look at a worked example and we talked about some interesting and some (by today’s standards) odd applications. We could look at the CDF too, but just like the pdf, the CDF changes as λ changes, so we have to look at several cases, like this:
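The original plot is not reproduced here, so as a stand-in, this sketch prints the CDF for a few values of λ. The CDF is just the running sum of the pdf from 0 up to x. The particular λ values (1, 4, 10) and the function names are my own picks for illustration, not the ones from the original figure.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the mean count per interval is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """Probability of x or fewer events: the pmf summed from 0 through x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lams = [1.0, 4.0, 10.0]  # illustrative choices, not from the original plot

# One row per x, one column per lambda.
print(" x  " + "  ".join(f"lam={lam:<4g}" for lam in lams))
for x in range(0, 21, 2):
    row = "  ".join(f"{poisson_cdf(x, lam):8.3f}" for lam in lams)
    print(f"{x:>2}  {row}")
```

Reading down each column, the CDF climbs from near 0 to near 1, and the climb happens later and more gradually as λ grows.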
As you can see, as our λ increases the CDF changes, and again, it should take on a form you are familiar with from our other discussions. If you haven’t figured it out already, stick around; we’ll talk about what that form is next. Well, that’s all we’ve got for the day, but it was important to introduce the pdf for our next few posts.
Until next time, don’t stop learning!
*My dear readers, please remember that I make no claim to the accuracy of this information; some of it might be wrong. I’m learning, which is why I’m writing these posts and if you’re reading this then I am assuming you are trying to learn too. My plea to you is this, if you see something that is not correct, or if you want to expand on something, do it. Let’s learn together!!