r/SneerClub archives
a sneer from the handbook of biological statistics (https://i.redd.it/jaxkmj53o1651.png)

I find the rationalist Bayes fetish to be really, really annoying. Bayesian statistics is great, just don’t utterly misuse it, ignore all of its underlying assumptions, and pretend that the entirety of frequentist statistics is wrong.

rationalists aren't really interested in being right, they're interested in being smarter than you.
[deleted]
You can cook crack in your kitchen, but it's not edible.
Well, to be fair, crack cocaine - like other cocaine - is, as far as I know, perfectly edible within certain limits. As with your standard marching powder, you're not gonna get much use from it if it's in your stomach, but I'm not aware of any especial toxicity related to crack. Fondly reminded myself of a fun John Doran essay here (yeah, I know it's in VICE, but he's good and not on their staff or anything like that): https://www.vice.com/da/article/mvwmw4/john-doran-menk-crack-cocaine
I like to sprinkle a little bit on my popcorn, like nutritional yeast!
You can have a little cocaine, as a treat
Not always, but in most of the situations where rationalists are involved, pretty much.

I bet the book was a few years old, because these days lots of real, practicing scientists (NOT bloviating rationalist frauds) are using Bayesian methods. I have used and loved the ‘brms’ R package, which makes fitting a Bayesian model (with Stan on the back end) as easy as fitting an LME4 model: https://cran.r-project.org/web/packages/brms/index.html

[Not old enough for that excuse.](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C1&q=Handbook+of+biological+statistics+john+h+mcdonald&btnG=&oq=Handbook+of+Biological+Statistics) He was just wrong.
I disagree. In 2009 he would have been absolutely correct. Most of the technological innovations (like 'brms' and other similar packages and libraries) that make applied Bayesian stats accessible to scientists who have "day jobs" in the fields of their primary interest (like biology, or econ, or genetics, or whatever) today did not exist in 2009. Stan was released in 2012. PyMC didn't become anything resembling user-friendly until version 3, which was released in 2013. JAGS and BUGS were around in 2009, but they were really unusable for the vast majority of people who didn't specifically have PhDs in statistics/biostatistics--I certainly tried and failed to use them in real-world research projects. Furthermore, at least this dude does actual science, unlike the rationalist shitheads who treat anything "Bayesian" like a religious totem and somehow manage to routinely write 12,000 word "brain" dumps peppered with words like "prior" and "posterior" while clearly having zero clue what they're talking about.
I'm with you on LARPing rationalist "Bayesians". They're super annoying, and almost always just using it as a fig leaf for intellectual laziness. I was using Bayesian methods for protein homology search in 2005-2008, and for identifying rare variants from DNA resequencing in 2008-2010. It wasn't hard to explain the method to people on a high level. They didn't understand Gibbs sampling, but they understood what the posterior sample distributions implied. Also, before there was Stan there was BUGS. Not as flexible, but people could use it. Google Scholar returns 43,100 matches for ["bayesian biology"](https://scholar.google.com/scholar?as_ylo=2008&as_yhi=2009&q=bayesian+biology) in 2008-2009.
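A toy sketch of the kind of thing that's easy to explain at a high level (made-up numbers, plain Python rather than any of the tools named above): a conjugate Beta-Binomial model, where you can draw posterior samples directly and just show people the spread, no sampler internals required.

```python
import random

random.seed(0)

# Toy Beta-Binomial model: observe 7 successes in 20 trials,
# starting from a flat Beta(1, 1) prior. By conjugacy the
# posterior is Beta(1 + 7, 1 + 13) -- no MCMC needed here.
successes, trials = 7, 20
a_post, b_post = 1 + successes, 1 + (trials - successes)

# Draw posterior samples; their spread is what you'd show a
# collaborator who doesn't care how the sampling works.
samples = [random.betavariate(a_post, b_post) for _ in range(100_000)]
post_mean = sum(samples) / len(samples)
prob_above_half = sum(s > 0.5 for s in samples) / len(samples)

print(round(post_mean, 3))        # close to (1+7)/(2+20), about 0.364
print(round(prob_above_half, 3))  # posterior P(rate > 0.5), small
```

In a real problem the posterior isn't conjugate and you'd reach for Gibbs or HMC, but the thing you report - a pile of posterior samples - looks exactly the same.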

Bayesian stats are used in radiometric dating, which is pretty important for paleontology.

Yeah. That being said, with rationalists, basically they don't understand that there's a difference between having very well justified priors (coming from somewhere else) and pulling a number out of your ass, such that any outcome can be produced depending on what number came out of the ass. They also don't understand how "frequentist" probabilities allow one to objectively establish a bound on a risk. E.g. if I want a process after which there's a <0.1% chance of giving people a useless or harmful drug due to chance effects alone, I'd conduct the study such that if the drug is useless or harmful, the probability of me finding it to be useful would be <0.1%. That can be done objectively, without relying on anyone to pull any number out of their ass. Pulling numbers out of the ass is of little interest to science, other than maybe to an anthropologist studying rationalist communities. Meanwhile the rationalists' idea of priors is invariably a number pulled out of their ass, not, I dunno, some reasonably chosen probability distribution, like a Poisson distribution for clicks of a particle counter.
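A minimal simulation of that kind of bound, with made-up numbers (100 patients, a "useless" drug that works exactly as often as placebo): pick the decision threshold from the null binomial distribution so the false-positive rate is under 0.1%, then check it by brute force. No prior appears anywhere.

```python
import math
import random

random.seed(1)

n, p_null = 100, 0.5  # 100 patients; a useless drug "works" like a coin flip

def tail(k):
    # P(X >= k) for X ~ Binomial(n, 0.5); each outcome has mass C(n, i) * 0.5**n
    return sum(math.comb(n, i) * p_null**n for i in range(k, n + 1))

# Smallest "declare it useful" threshold with false-positive rate < 0.1%
k = next(k for k in range(n + 1) if tail(k) < 0.001)

# Monte Carlo check: run the trial many times on a guaranteed-useless drug
runs = 20_000
false_pos = sum(
    sum(random.random() < p_null for _ in range(n)) >= k
    for _ in range(runs)
) / runs

print(k)          # threshold, around 66 successes out of 100
print(false_pos)  # observed false-positive rate, below 0.001
```

The guarantee is conditional - *if* the drug is useless, it passes with probability under 0.1% - which is exactly the per-drug bound described above.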
[deleted]
Yeah, it would make sense to have several prior sources and blend them into the "optimal" one based on accuracy. I think one big issue is that a lot of hypotheses are empirically indistinguishable, and all the crazy things you can think up need to somehow balance out by construction, or else it simply doesn't work. And the more thoroughly you sum "utility" over hypothesis space, the worse off you are if the sums don't converge. As far as practical reasoning goes, it's considerably more idiotic than that, too. Basically they believe that it is most rational to add up utilities over the hypotheses you are thinking of, multiplied by probabilities representing belief, and that this is the most rational thing for you to do right now. Also the probabilities can't be 0, or even small, etc. They do notice it wouldn't work, so they came up with something they started calling "instrumental rationality". At no point can they realize that this is a technical topic in which they have no education or work experience *whatsoever* (picking as a guru another such person). It's like someone derives 2+2=5 and then comes up with "instrumental mathematics" where you pretend it is 4 for practical reasons, such as living in a world of idiots convinced it's 4. Obviously, when you consider actual practical reasoning, most hypotheses have a probability of zero, in the sense that you didn't even think of them and aren't summing over them; you effectively start from *exact zeroes*. And when you are trying to estimate a sum of infinitely many elements but you only have a selection of elements, you can't get there by simply summing what you happen to have - especially when what you happen to have is what you read out of someone's writings. Even worse when the sum likely doesn't even converge ("utilities" growing faster than prior improbability).
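The non-convergence point can be shown with toy numbers: if hypothesis n gets prior probability proportional to 2^-n but promises payoff 3^n, each term of the "expected utility" is (3/2)^n, so the partial sums just keep growing and the quantity you were supposed to maximize doesn't exist.

```python
# Toy "utilities growing faster than prior improbability": hypothesis n
# has prior weight 2**-n and promised payoff 3**n, so term n of the
# expected-utility sum is (3/2)**n and the series diverges.
def partial_expected_utility(n_terms):
    return sum((2.0 ** -n) * (3.0 ** n) for n in range(1, n_terms + 1))

for n in (10, 20, 30):
    print(n, partial_expected_utility(n))  # grows without bound
```

Whatever "decision" such a sum recommends depends entirely on where you happened to stop enumerating hypotheses.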
> If I want to do some process after which there's <0.1% chance of giving people a useless or harmful drug due to chance effects alone, I'd conduct the study such that if the drug is useless or harmful, the probability of me finding it to be useful would be <0.1%.

I think you should be careful when you're wording this. Even if you conduct a study and find that the drug is useful with p<0.001, this does not mean that the "chance of giving people a useless or harmful drug" is <0.1%. This would be mixing up P(x|y) and P(y|x). In other words, you are implicitly ass-pulling a prior of ~0.5 for the likelihood of the drug being useful and assuming the statistical power of the study is 0.99 (other combinations of parameters are possible). Indeed, frequentist statistics cannot make any claim ever about the "probability that a certain drug is not useless or harmful" without implicitly ass-pulling some prior, because this inevitably requires some knowledge about how the drugs that were recommended for the study were chosen. If a lab only recommends conducting studies on arsenic blue, arsenic red, arsenic cyan, etc., just by chance 0.1% of them will show significant non-harmful effects with p<0.001, yet still, 100% of them will be useless or harmful.
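The arithmetic behind the arsenic example, with made-up numbers (assume 99% of candidate drugs are useless and tests run at alpha = 0.001 with power 0.8): Bayes' rule gives the probability that a *significant* result came from a useless drug, and it is nowhere near 0.1%.

```python
# Made-up pipeline numbers, chosen only to illustrate P(x|y) vs P(y|x).
p_useless = 0.99  # assumed base rate of useless candidate drugs
alpha = 0.001     # P(significant | useless) -- the frequentist guarantee
power = 0.8       # P(significant | useful)

# Bayes' rule: P(useless | significant)
p_sig = alpha * p_useless + power * (1 - p_useless)
p_useless_given_sig = alpha * p_useless / p_sig

print(round(p_useless_given_sig, 3))  # about 0.11, not 0.001
```

With a worse pipeline (say 100% useless candidates, as in the arsenic lab) the same alpha yields a 100% chance that any significant result is a dud - the conditional error rate is controlled, the unconditional one is not.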
No, I mean if I have a useless drug - guaranteed useless, 100% probably useless - I can have a process with <0.1% probability of by chance failing to detect the uselessness and ending up actually giving people that drug (which in that eventuality will be useless, 100% probable). There's a bound on the probability of a fuck-up, per drug. A more cut-and-dried example: there is a piece of food that may be radioactively contaminated. This is tested with a Geiger counter. Normally you wouldn't bother with the prior; you'd just set things up so that the probability of a false negative (on a presumed positive sample) is below some threshold, which lets you compute for how many seconds you need to count with the counter. The threshold would be set on the basis of how bad it is to consume contaminated food and how much food we're testing. And with the process "get a bagel, test it, eat it", the probability of the process ending with you eating a radioactive bagel is below that value. Of course, in the event that you are eating a bagel, and assuming all bagels are bad, you're eating a bad bagel, but that is unlikely to happen. And ultimately you'll want to set the probability lower if you are testing a lot of things, like how we want more safety in airplanes because a lot of people fly on them.
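The counting-time calculation sketched above, with made-up numbers (a contaminated sample adds 2 counts/sec, background ignored for simplicity, flag the sample at 5 or more counts): find the shortest counting time for which a contaminated sample slips past the flag with probability under 0.1%.

```python
import math

def poisson_cdf(k, lam):
    # P(N <= k) for N ~ Poisson(lam)
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

# Assumed numbers: a contaminated bagel produces counts at 2.0/sec,
# and we flag it if we see at least 5 counts while measuring.
rate, flag_at, max_false_neg = 2.0, 5, 0.001

# Count longer until a contaminated sample is missed (< flag_at counts)
# with probability under 0.1%.
t = 1.0
while poisson_cdf(flag_at - 1, rate * t) >= max_false_neg:
    t += 0.5

print(t)  # counting time in seconds; 7.5 with these assumed numbers
```

No prior over "fraction of contaminated bagels" enters the calculation; the bound is on P(miss | contaminated), which is exactly the conditional guarantee being described.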
It's for this reason that I see text like that in the OP and think it sounds an awful lot like when Ken Ham tried to convince the world that paleontology is a pseudoscience. It's not all fake just because some people find stats to be difficult to understand.

Why is there a war between the two? Both can be useful. This might not be a good analogy, but in my industry people always fight about Microsoft vs Linux. I think both are good depending on the situation.

There isn't unless you spend way too much time listening to rationalists. Your view is exactly the one held by anyone who knows anything about practical statistics.