The one-line statement

P(A | B) = P(B | A) × P(A) / P(B)

In words: the probability of A given B equals the probability of B given A, times the probability of A, divided by the probability of B.

In plain terms: how to update what you believe about A when you observe B.

That's Bayes' theorem. The math is trivial; the implication is profound. Applied properly, it tells you how to weigh any evidence rigorously. Ignored, it's behind most everyday probability mistakes.
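
If it helps to see the statement as code, here is a minimal Python sketch of the same formula (the names are only illustrative):

  def posterior(prior, p_b_given_a, p_b):
      # P(A | B) = P(B | A) * P(A) / P(B)
      return p_b_given_a * prior / p_b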

The cancer test example

This is the classic illustration.

A rare disease affects 1 in 10,000 people. There's a test for it that's 99% accurate — meaning it catches 99% of sick people, and correctly clears 99% of healthy people. You test positive. What's the chance you have the disease?

Almost everyone — including most doctors, when asked cold — says something close to 99%. Let's see what Bayes says.

Let A = "you have the disease," B = "you tested positive."

  • P(A) = 1/10,000 = 0.0001. This is the prior — your belief before the test.
  • P(B | A) = 0.99. If you have it, you'll test positive 99% of the time.
  • P(B | not A) = 0.01. If you don't have it, you'll test positive 1% of the time (false positive rate).
  • P(B) = P(B | A) × P(A) + P(B | not A) × P(not A) = 0.99 × 0.0001 + 0.01 × 0.9999 ≈ 0.0001 + 0.01 ≈ 0.0101

Plug in:

P(A | B) = 0.99 × 0.0001 / 0.0101 ≈ 0.0098 ≈ 1%

You have about a 1% chance of having the disease, not 99%. The test is "99% accurate" in two specific senses, but the rarity of the disease drowns out the test's accuracy. Out of 10,000 people screened, about 101 will test positive, yet only 1 of them actually has the disease; the other 100 or so are false positives.
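
The arithmetic is easy to check in a few lines of Python, plugging in the numbers from the example:

  prior = 1 / 10_000           # P(A): base rate of the disease
  sensitivity = 0.99           # P(B | A): positive test given disease
  false_positive = 0.01        # P(B | not A): positive test given no disease

  # P(B): overall probability of a positive test
  p_positive = sensitivity * prior + false_positive * (1 - prior)

  print(round(sensitivity * prior / p_positive, 4))   # 0.0098, roughly 1%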

This is the most consequential everyday Bayesian calculation. It's why screening for rare conditions in the general population can cause more anxiety than benefit, why a positive screen for a rare condition needs follow-up testing, and why doctors should — but often don't — talk you through the math when delivering a positive result.

The three ingredients

Bayes asks for three things:

  1. The prior — how likely the hypothesis is before you see the evidence. (Base rate of the disease.)
  2. The likelihood — how likely the evidence is, given the hypothesis. (How often the test is positive in sick people.)
  3. The marginal — how likely the evidence is overall. (How often the test comes up positive in everyone — both sick and healthy.)

If you keep these straight, you can plug into Bayes correctly. The most common mistake is forgetting the prior. People focus on the test's reported accuracy and forget how rare the underlying condition is.
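
One way to see how much the prior matters is to hold the test fixed and vary the base rate. A quick Python sketch (the alternative base rates are made up for illustration):

  def posterior(prior, sensitivity=0.99, false_positive=0.01):
      p_positive = sensitivity * prior + false_positive * (1 - prior)
      return sensitivity * prior / p_positive

  for prior in (1/10_000, 1/1_000, 1/100, 1/10):
      print(f"base rate {prior:.4f} -> P(disease | positive) = {posterior(prior):.2f}")
  # 0.01, 0.09, 0.50, 0.92 - same test, very different answers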

A simpler way to think about it

If equations feel awkward, try this natural-frequencies version:

Imagine 10,000 people, of whom 1 has the disease.

  • The 1 sick person tests positive (99% × 1 ≈ 1 positive).
  • The 9,999 healthy people test positive 1% of the time (about 100 positives).

Total positives: 101. Of those, 1 is genuinely sick. 1/101 ≈ 1%.

This is the same answer, derived without algebra. Most people who think Bayes is hard actually just think the notation is hard. The natural-frequencies version is much friendlier and gives identical answers.
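
The counting version translates almost directly into code, rounding to whole people as the text does:

  population = 10_000
  sick = 1                                   # 1 in 10,000
  healthy = population - sick

  true_positives = round(0.99 * sick)        # 1
  false_positives = round(0.01 * healthy)    # 100

  print(true_positives / (true_positives + false_positives))   # 1/101, about 1%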

Why it matters beyond medicine

The same logic applies anywhere you're updating beliefs from evidence.

Crime forensics. "DNA matches with 1-in-a-million probability." If the suspect was caught for an unrelated reason and the DNA then matches, that's strong evidence. If the suspect was identified by trawling through a database of a million people looking for any match, the apparent strength evaporates — you're quite likely to find at least one match by chance.
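
To put a rough number on the trawl, here is a simplified Python sketch that treats each comparison as independent:

  p_chance_match = 1e-6        # the quoted 1-in-a-million match probability
  database_size = 1_000_000

  # Chance of at least one coincidental match when searching the whole database
  print(round(1 - (1 - p_chance_match) ** database_size, 2))   # 0.63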

Job interviews. Your prior on "this candidate is right for the job" should be roughly the base rate of qualified applicants. A great interview updates that — but how much depends on how often great interviews come from unqualified candidates and from qualified ones.

News stories. "X causes Y, says new study." Your prior on "X causes Y" matters. If the prior is low (extraordinary claim), the study needs to be strong to move you much. If the prior is high (well-established claim), the study mostly confirms.

Conspiracy theories. Updating in a Bayesian way on conspiracies is hard because supporting evidence is easy to come by (selectively quote anyone). The corrective is the prior: how often have claims this specific turned out to be coordinated by a cabal, rather than the result of ordinary competing incentives? Pretty rarely. So you need very strong evidence to move that prior.

Where Bayes can fool you

Bayes is correct given the inputs. But the inputs depend on you:

  • Bad priors. If your prior is way off, the posterior will be too. Bayes doesn't fix that.
  • Wrong likelihoods. If you misestimate how likely the evidence is under each hypothesis, the update is wrong.
  • Updating on weak evidence. If the likelihood ratio is close to 1, evidence barely changes your belief — but people often treat it as strong (see the sketch at the end of this section).
  • Cascading errors. Updating multiple times on related evidence (without accounting for the correlation) double-counts.

These aren't reasons to abandon Bayes. They're reasons to do it carefully and to make your inputs explicit so others can question them.
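
The weak-evidence point is easiest to see in the odds form of Bayes' theorem: posterior odds = prior odds × likelihood ratio. A small Python sketch with made-up numbers:

  def update(prior_odds, likelihood_ratio):
      # Posterior odds = prior odds * likelihood ratio
      return prior_odds * likelihood_ratio

  def to_prob(odds):
      return odds / (1 + odds)

  prior_odds = 1 / 99                                  # a 1% prior
  print(round(to_prob(update(prior_odds, 99)), 2))     # 0.5   - strong evidence moves you a lot
  print(round(to_prob(update(prior_odds, 1.2)), 3))    # 0.012 - weak evidence barely moves you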

What "Bayesian thinking" really means

The deep payoff isn't the formula. It's the habit of mind:

  • Hold a numerical (or quasi-numerical) prior on most claims.
  • When evidence arrives, ask "how much should this update me?"
  • Update proportionally to the likelihood ratio, not by leaping to a conclusion.
  • Be willing to say "this is suggestive but not strong enough to move me much."

People who internalize this stop swinging between extremes based on the last argument they heard. They stay roughly calibrated. They expect to be wrong sometimes — and aren't shocked when they are.

If you'd like a guided 5-minute course on Bayesian reasoning with worked examples, NerdSip can generate one starting from your current level.

The takeaway

Bayes' theorem is a short equation for updating beliefs with evidence. The math is straightforward; the practice is hard, because human intuition routinely ignores the prior. The rare-disease example is the cleanest demonstration: a positive result on a 99% accurate test can still mean only about a 1% chance of disease when the condition is rare enough. Beyond medicine, the same logic applies to forensics, news evaluation, hiring, and any time you're weighing evidence. The habit it builds — explicit priors, proportional updates, comfort with uncertainty — is more valuable than the formula.