Day 26: Probability density functions, Part 1
We are well on our way to wrapping up week 4, and what a ride it’s been! It’s been a long day for me, so today might be short. However, I really, really, really want to break into probability density functions. This topic is going to be a bit more advanced than some of the things we’ve covered (i.e., more writing), so it will most definitely be broken up. Let’s look at why and discover the wonderful weirdness of probability density functions!*
What are the chances that we can do a short post on this? Well, the probability doesn’t look good. However, we’re going to try! To do it, I think I’m going to break it up, so this is part one of… well, I’m not sure yet; I haven’t written it all out.
Probability density functions are functions that let us determine the relative likelihood that a random variable takes a value near the one we selected. It’s tempting to read that as a fancy way of saying what the odds are that we will find a value at exactly a certain point, but for that question, spoiler, the answer is zero!
Yes, you read that correctly, the chance of finding a specific value is zero… for any continuous random variable. That’s easy to explain though: the absolute likelihood for any continuous variable to have one particular value is zero because the set you are picking your number from is not just infinite, it is a continuum, with infinitely many values packed between any two points. So when we try to guess exactly what a random value will be, the odds that we will be correct are not merely infinitesimally small, they are exactly zero.
That isn’t useful!
That doesn’t mean probability density functions are useless though; it just means we need to reframe our question a little. Instead of asking, for a value somewhere between zero and infinity, what the odds are that it is exactly 3.348932453425 (see the problem?), we can ask how likely it is that our value falls within a certain range.
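To make the reframing concrete, here is a minimal sketch that draws a lot of random values and counts two things: how many land exactly on our oddly specific number, and how many land in a range. The exponential distribution with rate 0.25, the seed, the number of draws and the range from 3 to 4 are all my own assumptions for illustration; nothing earlier in this series fixes them. Also keep in mind that a computer stores floats with finite precision, so this only illustrates the idea, it doesn’t prove it.

```python
import random

random.seed(26)  # fixed seed so the run is repeatable

target = 3.348932453425  # the oddly specific value from the text
n_draws = 100_000        # assumed sample size

# Assumption: values come from an exponential distribution with rate 0.25,
# which produces continuous values between zero and infinity.
draws = [random.expovariate(0.25) for _ in range(n_draws)]

# How many draws hit the target exactly?
exact_hits = sum(1 for x in draws if x == target)

# How many draws fall inside the range from 3 to 4?
in_range = sum(1 for x in draws if 3.0 <= x <= 4.0)
fraction_in_range = in_range / n_draws

print("exact hits:", exact_hits)
print("fraction between 3 and 4:", fraction_in_range)
```

The exact-match count should come out as zero, while roughly a tenth of the draws should fall between 3 and 4. Asking about a range gives us something we can actually work with.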
If you remember all the work we’ve been doing up to this point, you know that we’ve been dealing with that exact question: we have been asking what the odds are that our value is outside a certain range. Then we use that information to determine if the value we recorded (our random variable) is significant or not. If you forget what we mean when we say our value is significant, you can read about that here at part one of three on significance.
Okay, so we’ve determined that our probability density function (p.d.f.) can be useful, so… wait, why did I say (p.d.f.) and not (P.D.F.)? Well, not to add to the confusion, but I should point out that there is another related concept called the probability distribution function, or P.D.F. (capitalized). We can (and do) also refer to the P.D.F. as the C.D.F., or cumulative distribution function, which helps with the confusion, but only slightly. See, to get our P.D.F. from our p.d.f., we integrate, so the two are very much linked. I mean, reread that sentence and tell me the notation isn’t odd. For our purposes, we will refer to the P.D.F. as the cumulative distribution function (C.D.F. or just CDF) and to our probability density function as p.d.f. or pdf. To avoid any more confusion, we will also avoid the capitalized PDF when we mean our pdf, to help keep the two topics separate.
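Since the link between the two is "we integrate," here is a rough sketch of that idea in code. It adds up the area under a pdf with a simple midpoint rule and compares the result with the CDF that Python’s standard library already provides. The standard normal distribution, the helper name cdf_by_integration, the cutoff of -10 and the step count are all assumptions I picked for the example, not anything defined earlier in this series.

```python
from statistics import NormalDist

# Assumption: a standard normal distribution stands in for our random variable.
dist = NormalDist(mu=0.0, sigma=1.0)


def cdf_by_integration(x, lower=-10.0, steps=20_000):
    """Approximate the CDF at x by summing the area under the pdf.

    The true lower limit is minus infinity. Here -10 is used instead,
    because the standard normal pdf is negligibly small below that point.
    """
    width = (x - lower) / steps
    # Midpoint rule: evaluate the pdf at the middle of each thin slice.
    area = sum(dist.pdf(lower + (i + 0.5) * width) for i in range(steps))
    return area * width


x = 1.0
approx = cdf_by_integration(x)  # area under the pdf up to x
exact = dist.cdf(x)             # the library's own CDF at x
print(approx, exact)

# The probability of landing in a range is a difference of two CDF values.
p_range = dist.cdf(1.0) - dist.cdf(-1.0)
print(p_range)
```

The two numbers on the first printed line should agree to several decimal places, which is the whole point: the CDF is the accumulated area under the pdf. The last value, about 0.68, is the chance of landing within one standard deviation of the mean.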
That seems as good a place to stop as any. Tomorrow we can go into what the pdf is, how to use it, and start to broach the subject of the CDF. In other words, it looks like we have our topics for the next few days. What are the odds?
Until next time, don’t stop learning!
*I wish I were perfect, but I’m sadly not. Because of that, I make no claim to the accuracy of this information; some of it might be wrong. I’m learning, which is why I’m writing these posts, and if you’re reading this then I am assuming you are trying to learn too. My plea to you is this: if you see something that is not correct, or if you want to expand on something, do it. Let’s learn together!!