probability


This is a very basic introduction to odds, followed by a short introduction to log-odds and a few recommendations on where to go from here.

Odds

Imagine you live in a universe with 3 gods: one benefactor and two nasty gods. Each time you want or need something, one of the gods is chosen at random to serve you.

So if you want to eat strawberries, and you get lucky, the benefactor is chosen to serve you some tasty strawberries. But that’s only going to happen, on average, 1/3 of the time. If one of the two nasty gods is chosen instead, you might or might not get something, but what’s for sure is that it won’t be what you want.

In other words, after a very long lifespan you can expect to have had 1/3 of your wants and needs satisfied. In 2 out of 3 cases you will have received either nothing, if you were lucky, or something nasty.

That means that if you ask for strawberries, it is 2 times as likely that you will get nothing or something nasty as it is that you will get what you want, because (2/3) / (1/3) = 2/1 = 2. In other words, in the long term you will be left with 2 unfulfilled wishes for each fulfilled one, or 2:1.

In this undesirable universe that we imagine, the odds in favor of getting what you want are 1:2. Which means that the odds against receiving some tasty strawberries when you want them are 2:1.

Here is another more abstract example. Imagine that the probability of event A happening is

P(A) = 80% = 0.80 = 80/100 = 4/5.

Then the probability of it not happening, ~A, is

P(~A) = 1-P(A) = 20% = 0.20 = 20/100 = 1/5.

Now the odds are simply the ratio of these two probabilities. ‘Odds’ are an expression of relative probabilities.

The odds in favor are the ratio of the probability that an event will happen to the probability that it will not happen. In our example this equals

P(A)/P(~A) = P(A) / (1-P(A)) = 80%/20% = 0.80/0.20 = (80/100) / (20/100) = 80/20 = (4/5) / (1/5) = 4:1.

The other way round, the odds against event A are 1:4.
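
The two conversions above can be checked with a few lines of Python. This is only a minimal sketch: the function names `odds_in_favor` and `odds_against` are made up for this illustration, and exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction


def odds_in_favor(p):
    """Odds in favor of an event: P(A) / (1 - P(A)), as an exact fraction."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_against(p):
    """Odds against an event: (1 - P(A)) / P(A), as an exact fraction."""
    p = Fraction(p)
    if not 0 < p <= 1:
        raise ValueError("probability must satisfy 0 < p <= 1")
    return (1 - p) / p


# The universe with 3 gods: the benefactor is chosen 1 time in 3.
gods = odds_in_favor(Fraction(1, 3))
print(f"{gods.numerator}:{gods.denominator}")        # 1:2

# The abstract example: P(A) = 4/5.
event_a = odds_in_favor(Fraction(4, 5))
print(f"{event_a.numerator}:{event_a.denominator}")  # 4:1
```

A `Fraction` keeps numerator and denominator separate, which maps directly onto the a:b notation used for odds.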

Why use odds?

Using odds can help to illustrate the actual confidence accompanying various probabilities.

Take for example the difference between a probability of 99.98% and 99.99% of an event occurring. In odds we have,

99.98%/0.02% = 0.9998/0.0002 = (9998/10000) / (2/10000) = 9998/2 = 4999:1

and

99.99%/0.01% = 0.9999/0.0001 = (9999/10000) / (1/10000) = 9999/1 = 9999:1.

Which means that increasing your confidence from 99.98% to 99.99% is equivalent to saying that you believe the event is 9999 times as likely to occur as not, instead of “just” 4999 times as likely.

As you can see, using odds not only reveals that an increase from 99.98% to 99.99% means you are actually about twice as confident as before, but also shows how incredibly confident you must be to say that something is even 4999 times more likely to occur than not.

In the same way, converting probabilities to odds shows that the difference between 50.01% and 50.02% is negligible (under many circumstances). In odds, 50.01% is

50.01%/49.99% = 0.5001/0.4999 = (5001/10000) / (4999/10000) = 5001/4999 ≈ 1.0004:1

and

50.02%/49.98% = 0.5002/0.4998 = (5002/10000) / (4998/10000) = 5002/4998 ≈ 1.0008:1.

Which is almost the same, since 1.0008/1.0004 ≈ 1.
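
The four conversions in this section can be reproduced with a small sketch. The helper names `odds_in_favor` and `odds_multiplier` are assumptions made for this illustration; probabilities are passed as decimal strings so that `Fraction` can represent them exactly.

```python
from fractions import Fraction


def odds_in_favor(p):
    """Odds in favor of an event: p / (1 - p), as an exact fraction."""
    p = Fraction(p)
    return p / (1 - p)


def odds_multiplier(p_old, p_new):
    """By what factor do the odds grow when the probability moves from p_old to p_new?"""
    return odds_in_favor(p_new) / odds_in_favor(p_old)


# Near certainty: 99.98% -> 99.99%
print(odds_in_favor("0.9998"))   # 4999
print(odds_in_favor("0.9999"))   # 9999
high = odds_multiplier("0.9998", "0.9999")
print(float(high))               # roughly 2.0002, the odds about double

# Near a coin flip: 50.01% -> 50.02%
mid = odds_multiplier("0.5001", "0.5002")
print(float(mid))                # roughly 1.0004, the odds barely move
```

The same 0.01 percentage point step about doubles the odds in one case and changes them by about 0.04% in the other.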

What odds reveal

To see why, you have to realize that odds reveal something that might not be intuitively obvious, namely that changing the smallest factor has the largest effect.

In the first example we had 9998/2 and 9999/1, which is the same as 9998 * 1/2 and 9999 * 1. The larger factor only increased by about 0.01%, while the smaller factor increased by 100% (from 1/2 to 1), that is, 9999/1 ≈ 9998*1.0001*(1/2)*2. Notice the final factor of 2? That’s twice as much.

In the second example, by contrast, the numerator and denominator are approximately equal and change by a similar percentage. That is, 5002/4998 is approximately equal to (5001*1.0002) * ((1/4999)*1.0002). The odds are therefore multiplied by only about 1.0004, which is almost a factor of 1, or in other words almost no increase at all.

What odds reveal is that the relative increase or decrease of a factor by one unit becomes more pronounced as the factors’ absolute difference increases.
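
To make this decomposition concrete, here is a rough sketch that splits a change in odds into the two factors discussed above. The function `change_factors` and its parameter names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def change_factors(old_for, old_against, new_for, new_against):
    """Split a change in odds (old_for:old_against -> new_for:new_against)
    into the factor contributed by the numerator and the factor
    contributed by the reciprocal of the denominator."""
    numerator_factor = Fraction(new_for, old_for)
    reciprocal_factor = Fraction(old_against, new_against)
    return numerator_factor, reciprocal_factor


# 9998:2 -> 9999:1 (99.98% -> 99.99%)
num, rec = change_factors(9998, 2, 9999, 1)
print(float(num), float(rec))    # about 1.0001 and exactly 2.0

# 5001:4999 -> 5002:4998 (50.01% -> 50.02%)
num2, rec2 = change_factors(5001, 4999, 5002, 4998)
print(float(num2), float(rec2))  # both about 1.0002

# The product of the two factors is the overall change in the odds.
print(float(num * rec), float(num2 * rec2))
```

In the first case nearly all of the change comes from the small side of the ratio, while in the second case both sides contribute a similarly tiny amount.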

Log-odds

Another way of representing probabilities is in terms of log-odds, measured in decibels (dB).

To convert probabilities into log-odds, first convert percentages into odds. We have already talked about how to do that above.

Once you have the odds, all you have to do is take the base-10 logarithm of the odds and multiply it by 10.

For example, 4:1 odds would translate to about 6 dB because 10*log10(4) ≈ 6.02.
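
The recipe above, probability to odds to decibels, can be sketched as follows. The function names are assumptions made for this example, and the inverse function is included only as a sanity check on the round trip.

```python
import math


def probability_to_decibels(p):
    """Convert a probability to log-odds in decibels: 10 * log10(p / (1 - p))."""
    if not 0 < p < 1:
        raise ValueError("probability must satisfy 0 < p < 1")
    return 10 * math.log10(p / (1 - p))


def decibels_to_probability(db):
    """Invert the conversion: decibels -> odds -> probability."""
    odds = 10 ** (db / 10)
    return odds / (1 + odds)


print(round(probability_to_decibels(0.8), 2))  # 6.02, i.e. 4:1 odds
print(probability_to_decibels(0.5))            # 0.0, i.e. even odds
print(round(decibels_to_probability(probability_to_decibels(0.8)), 6))  # 0.8
```

Probabilities of exactly 0 or 1 are rejected because they would correspond to infinitely negative or infinitely positive log-odds.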

Further Reading and Resources

I could go on to explain the advantages of using log-odds. But there are others who have already done so, probably better than I could. And you should now be able to follow and understand what they have written.

I recommend that you start with this short primer, maybe followed by this blog post.

For a precomputed lookup table of important probabilities and their approximate odds and log-odds (decibel) values see this PDF made by muflax.


Beginner

Bayes' theorem

Intermediate

Advanced

Philosophical foundations

Other guides

Miscellaneous

A law of probability that describes the proper way to incorporate new evidence into prior probabilities to form an updated probability estimate. Bayesian rationality takes its name from this theorem, as it is regarded as the foundation of consistent rational reasoning under uncertainty. A.k.a. “Bayes’s Theorem” or “Bayes’s Rule”.

Eliezer Yudkowsky is on bloggingheads.tv with the statistician Andrew Gelman.

Several different points of fascination about Bayes…

When looking further, there is however a whole crowd on the blogs that seems to see more in Bayes’s theorem than a mere probability inversion…

Bayesian statistics is a system for describing epistemological uncertainty using the mathematical language of probability.
Bayesian probability is one of the most popular interpretations of the concept of probability.

Edwin T. Jaynes was one of the first people to realize that probability theory, as originated by Laplace, is a generalization of Aristotelian logic that reduces to deductive logic in the special case that our hypotheses are either true or false. This web site has been established to help promote this interpretation of probability theory by distributing articles, books and related material. As Ed Jaynes originated this interpretation of probability theory we have a large selection of his articles, as well as articles by a number of other people who use probability theory in this way…

Bayesian statistics is so closely linked with induction that one often hears it called “Bayesian induction.” What could be more inductive than taking a prior, gathering data, updating the prior with Bayes Law, and limiting to the true distribution of some parameter?

Gelman (of the popular statistics blog) and Shalizi point out that, in practice, Bayesian statistics should actually be seen as Popper-style hypothesis-based deduction. The problem is intricately linked to the “taking a prior” above.

Or, how to recognize Bayes’ theorem when you meet one making small talk at a cocktail party.

Still, I’m sure Blogger won’t mind me using their resources instead. The basic idea is that there’s a distinction between true values x and measured values y. You start off with a prior probability distribution over the true values. You then have a likelihood function, which gives you the probability P(y|x) of measuring any value y given a hypothetical true value x.

In other words, What is so special about starting with a human-generated hypothesis? Bayesian methods suggest what I think is the right answer: To get from probabilistic evidence to the probability of something requires combining the evidence with a prior expectation, a “prior probability”, and human hypothesis generation enables this requirement to be ignored with considerable practical success.

Andrew Gelman recently responded to a commenter on the Yudkowsky/Gelman diavlog; the commenter complained that Bayesian statistics were too subjective and lacked rigor.  I shall explain why this is unbelievably ironic…

Maybe this kind of Bayesian method for “proving the null” could be used to achieve a better balance.

Bayesian brain is a term that is used to refer to the ability of the nervous system to operate in situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics.


—————————————

P.S.

Expect this link collection to be continually updated.

Please post a comment if you have something to add.
