Would you like me to show you the path to enlightenment? Feast your eyes:
P(B|A) = P(A|B)*P(B) / P(A)
Is the path not yet clear to you? I’ll give you all a bit more of a push. Some say Bayes’ rule is the ideal decision-making procedure. I say it’s also the perfect way to smuggle your personal biases and hobby-horses into any argument or discussion, all under the cover of mathematics (insert sparkle effect).
A simpleton sees in Bayes’ rule a way to derive the probability of a hypothesis (or belief) B, having learned the fact A, by the use of only three numbers: P(A|B) (the likelihood of A given B), P(B) (the prior probability of B), and P(A) (the probability of the evidence we’ve learned). A humble formula which, when watered from the garden hose of truth, bears delicious fruit. But I can teach you how to lie and deceive with each of these three values, plus a bonus method I’m throwing in as a special favor to each of you - four ways to baffle and bamboozle your friends under the guise of pure mathematical reasoning. Here they are, in order from least to most useful.
The first way to lie makes use of the evidence,
P(A). Now admittedly, it’s often not useful to lie
about P(A). Typically when we want to deceive people via Bayes’ rule,
it’s because we’re trying to inflate or deflate the perceived
probability of the hypothesis, B. A is the evidence, which (hopefully)
by virtue of having already happened, is uncontroversial and hard to lie
about. And P(A) has nothing to do with B - it’s a scaling parameter
which applies to any old hypothesis equally. Some people simply ignore
it in their calculations. So can we use it at all?
Of course! A feature of probability is that the sum of P(B|A) across a
set of mutually exclusive, exhaustive hypotheses B must equal 1. The scaling parameter is
needed precisely because some hypotheses are not compatible with A and
their posterior probability goes to zero. The probability of the
remaining hypotheses must be raised in turn. And if the evidence was
very unlikely (say P(A) = .1), the posterior of a compatible hypothesis
scales up in turn (in this case by a full order of magnitude!). So if
you’re motivated to say that a hypothesis B is still unlikely
even after some surprising evidence which makes competing hypotheses
untenable, leave P(A) out of your calculation!
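The trick can be sketched numerically (all values below are hypothetical, picked only to make the distortion vivid):

```python
# Hypothetical numbers: surprising evidence A with P(A) = 0.1,
# a fringe hypothesis B with prior P(B) = 0.05 and likelihood P(A|B) = 0.8.
p_b = 0.05          # prior P(B)
p_a_given_b = 0.8   # likelihood P(A|B)
p_a = 0.1           # evidence probability P(A) -- a surprising event

# Honest Bayes: divide by P(A), so the surprising evidence scales B up.
honest_posterior = p_a_given_b * p_b / p_a   # 0.4

# Motivated Bayes: "forget" the scaling parameter entirely.
dishonest = p_a_given_b * p_b                # 0.04 -- B still looks safely fringe

print(honest_posterior, dishonest)
```

Dropping the division by 0.1 quietly shrinks the posterior by a full order of magnitude, exactly the move described above.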
P(A) is also an (in)sanity check of the reasoner. For example, I recall
a case where an interlocutor told me they had a prior P(B) = .3 and a
likelihood P(A|B) of .5. Fine, but then he assigned a value P(A) = .9!
Check quickly to convince yourself that these numbers are impossible! Of
course, if you can slip something like this by, an over-inflated P(A)
can be used to decrease the posterior probability of whatever you’re
interested in (or vice-versa for under-inflation).
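The check is just the law of total probability: P(A) = P(A|B)P(B) + P(A|not B)(1 - P(B)), which caps how large P(A) can possibly be. Running the story's numbers through it:

```python
p_b = 0.3          # the interlocutor's prior P(B)
p_a_given_b = 0.5  # their likelihood P(A|B)
claimed_p_a = 0.9  # their P(A)

# Law of total probability: P(A) = P(A|B)P(B) + P(A|not B)(1 - P(B)).
# Even if P(A|not B) were a full 1.0, P(A) could be at most:
max_p_a = p_a_given_b * p_b + 1.0 * (1 - p_b)  # 0.15 + 0.70 = 0.85
min_p_a = p_a_given_b * p_b + 0.0 * (1 - p_b)  # 0.15

print(max_p_a)                # 0.85
print(claimed_p_a > max_p_a)  # True -- the three numbers are inconsistent
```

No choice of P(A|not B) can push P(A) past 0.85, so a claimed 0.9 is flatly impossible given the other two numbers.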
The second way to lie relies on the likelihood,
P(A|B). Now, conditional probabilities are difficult to
come to any agreement on for complex, real-world issues. So why does our
formula require us to use one? Well, Bayes’ rule is fantastic
for solving things like STATS
101 word problems, which will give you P(A|B) explicitly. In the
real world things are much less obvious - but that’s good, because it
means more ways to lie!
Assigning a value to P(A|B) asks you to generate a story about how
plausible it is for the evidence A to come about in a world where B is
true. If you’re a truly motivated reasoner (and I’m sure you are), you
should be very good at coming up with such stories! I recommend
practicing at coming up with all kinds of stories about various
sorts of worlds, so that you can make a convincing case for any value of
P(A|B) for any A and any B (bonus points if you can get a value that’s
less than 0 or more than 1). Here, watch:
A = Trump says no collusion, B = Trump didn’t collude
P(A|B) is obviously as high as 1, he’s an innocent man defending himself
against unfair treatment!
P(A|B) is obviously as low as 0, an innocent man wouldn’t rail about his
innocence, he’d let the investigation bear out!
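Either story plugs into the same formula, and the chosen likelihood does all the work. A sketch with hypothetical values (the neutral prior of 0.5 and the P(A|not B) of 0.9 are my inventions, not anything the stories pin down):

```python
def posterior(p_b, p_a_given_b, p_a_given_not_b):
    """Bayes' rule, with P(A) derived from the law of total probability."""
    p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
    return p_a_given_b * p_b / p_a

p_b = 0.5  # hypothetical neutral prior on "didn't collude"

# Story 1: an innocent man of course defends himself.
print(posterior(p_b, p_a_given_b=1.0, p_a_given_not_b=0.9))

# Story 2: an innocent man would let the investigation speak for him.
print(posterior(p_b, p_a_given_b=0.05, p_a_given_not_b=0.9))
```

Same evidence, same prior; one story leaves innocence more likely than not, the other collapses it to around five percent.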
What’s extra nice about lying with P(A|B) is that assigning a value here
asks you to reason “back in time” about a probability of A occurring.
But A has already occurred! This naturally means people will accept
over-approximations of the likelihood of A occurring, even if in fact it
was a very unlikely event. If you want to get a higher posterior, pump
up that P(A|B) and say it was always obvious that it was going to
happen! Extra points if you can simultaneously inflate P(A|B)
while deflating P(A) to really get that posterior up there!
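The combined move, again with hypothetical numbers:

```python
p_b = 0.1  # hypothetical prior on B

# Honest assessment: A wasn't that likely under B, and not that rare overall.
honest = 0.3 * p_b / 0.5    # P(A|B) = 0.3, P(A) = 0.5  ->  posterior 0.06

# Motivated assessment: "it was always obvious B would produce this"
# (P(A|B) inflated) plus "what shocking evidence!" (P(A) deflated).
motivated = 0.95 * p_b / 0.1  # P(A|B) = 0.95, P(A) = 0.1  ->  posterior 0.95

print(honest, motivated)
```

Working both ends of the fraction at once turns a 6% posterior into a 95% one without ever touching the prior.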
The third and omnipresent method of deception uses the prior, P(B). An ideal Bayesian reasoner attempts to discard personal biases in favor of dispassionately weighing the evidence observed and the relevant probabilities. But we ain’t ideal here, people! A motivated Bayesian reasoner notices that while an ideal prior is built on a sequence of Bayesian updates starting from a uniform distribution and updating on every relevant piece of evidence, their prior for a particular update can simply be - well, whatever the hell you want! Your priors are your darlings, and among friends, you can adjust them as much as you like to achieve the results you desire. Tougher crowd? Offer a slight justification for the number you pulled out of your ass, and then handwave any disagreements away by saying that the process is what matters and good Bayesians will converge to identical priors after enough updates. Make sure to imply that anyone with a different prior from you is an inferior Bayesian!
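To see how much work the prior does, here is a sketch with everything except P(B) held fixed (all numbers hypothetical):

```python
p_a_given_b = 0.7  # hypothetical likelihood, held fixed
p_a = 0.5          # hypothetical evidence probability, held fixed

# Same evidence, same likelihood -- only the prior changes.
posteriors = {p_b: p_a_given_b * p_b / p_a for p_b in (0.01, 0.1, 0.5, 0.9)}
for p_b, post in posteriors.items():
    print(f"prior {p_b}: posterior {post:.3f}")

# Note the prior of 0.9 yields a "posterior" of 1.26 -- above 1, because a
# prior that high is incompatible with P(A) = 0.5 in the first place.
```

The posterior scales linearly with whatever prior you pulled out of your ass, and a sufficiently shameless prior even earns the bonus points from earlier by blowing past 1.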
And finally, the fourth way to lie is to simply never perform a Bayesian update. Look, the numbers are nice, but sometimes they give you new numbers that you don’t really like. Plus, making up numbers can be a lot of work, and sometimes people take issue with the numbers you’ve gone to the trouble of making up. Why bother, when instead of being used as an actual mathematical reasoning tool, Bayes’ rule can simply be a piece of scientific jargon that you sprinkle over all your opinions to make them seem more legitimate? Really, all this requires is a vocab change. For example, instead of saying “this cherry-picked article appeals to my prejudices”, say “this evidence updates me toward (pet nonsense theory) being true”. Instead of “you’re a moron for disagreeing with me”, try “it seems like your prior may be miscalibrated”. By throwing the vocab words you’ve learned today into your arguments, it’s easy to make it seem like you’re a rigorous and objective technocrat whose beliefs are substantiated by cold, hard facts. When in reality, the jargon is just a signalling game used to boost the status of your beliefs while not-so-subtly implying that those who don’t adorn themselves with Bayesian garments are hysterical zealots in service of the Dark Side!
I hope you’ve learned something about how to lie with Bayes’ rule! If you’ve got any other nice methods, share with the class!
This post has updated my priors towards rationalism being dumb.
This post was more informative than anything on Bayes Theorem rationalists have ever written.
Does anybody seriously say that, while pretending that they’re “bayesian” and that “frequentism” is wrong?
The basic criticism of Bayes’ rule is: garbage in, garbage out. And humans are just garbage at probability estimation.
OP this is an interesting post. I might be stupid, but I don’t understand how the example
is impossible as you say it is.
When
P(B|A) = P(A|B) * P(B) / P(A)
then we have
P(B|A) = (0.5 * 0.3) / 0.9 ≈ 0.167
Nothing impossible about that! It does mean that B makes A less likely. That’s only wrong if in your particular situation P(A|B) >= P(A).
In my experience, rationalists just insist their prior on their opinion is 1-epsilon and close their ears to any new evidence.