r/SneerClub archives
A LessWrong post destroys Yudkowsky with facts and logic, and Yudkowsky responds (https://www.reddit.com/r/SneerClub/comments/124cncy/a_lesswrong_post_destroys_yudkowsky_with_facts/)

LessWrong post: My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Facts and logic

In response to Yudkowsky’s recent podcast interview about how the robot apocalypse is nigh, one brave soul has barged into LessWrong and done the unthinkable: patiently explain, in detail and citing relevant sources from the established academic literature, why Yudkowsky is wrong about everything.

This is LessWrong, so the post still says some oddball things (e.g. giving a 5% estimated chance of robot apocalypse), but on the whole it is a mostly reasonable and sober take on the current state of affairs in machine learning vis-a-vis the plausibility of the end times. I have never seen something like this before on LessWrong.

Some select quotable moments from the post:

How can we stop AI from being evil? OP has a creative and bold suggestion:

As far as I can tell, the answer is: don’t reward your AIs for taking bad actions

Does preventing AI from being evil require the kind of visionary genius that only Yudkowsky can offer? OP suspects not:

I think such issues largely fall under “ordinary engineering challenges”, not “we made too many capabilities advances, and now all our alignment techniques are totally useless”.

OP commits heresy against the acausal robot god:

The misaligned AI does not reach out from the space of possible failures and turn current alignment research adversarial.

OP doesn’t care for Yudkowsky’s tone:

I find [Yudkowsky’s] implication, that I’m only optimistic because I’m inexperienced, pretty patronizing. Of course, that’s not to say he’s wrong, only that he’s annoying. However, I also think he’s wrong.

A general condemnation of Yudkowsky’s opinion on “aligning” AI:

I’m not enthusiastic about a perspective which is so totally inappropriate for guiding value formation in the one example of powerful, agentic general intelligence we know about [i.e. humans].

I don’t necessarily recommend reading the whole post - it’s a bit overwrought - but unlike the average LW post it is not filled with egregious errors or grossly misinformed nonsense.

Yudkowsky’s response

Recently freed from the shackles of gainful employment, Yudkowsky has time to spare and deigns to share his perspective in some comments.

He begins his defense of himself by noting of the original post that

This is kinda long.

Those, I shit you not, are the very first words that he writes.

He then wonders

why you’d reply to the obviously simplified presentation from an off-the-cuff podcast rather than the more detailed arguments elsewhere.

in the very same comment. Other LWers note the, ahem, apparent discrepancy.

Yudkowsky ultimately responds to exactly one point from the post, saying that the post misunderstood what he said and offering a correction. OP responds in turn; apparently he misunderstood Yudkowsky because Yud’s intended meaning is so brazenly stupid that it never occurred to OP to interpret him in that way. OP then has to update the post with an entire new subsection explaining why Yud is even more wrong than he thought.

Elsewhere, Yudkowsky complains that OP responded to a mere youtube interview, and not to his supposedly better thought out essay “A List of Lethalities”. OP points out that, actually, he already did exactly that in a previous LW post. Eliezer doesn’t respond.

Some other, non-Eliezer comments

At one point in the post OP asks, rhetorically,

have you ever, even once in your life, thought anything remotely like “I really like being able to predict the near-future content of my visual field. I should just sit in a dark room to maximize my visual cortex’s predictive accuracy.”?

This being LessWrong, multiple commenters say that, actually, yes they have. (comment1, comment2)

In response to a complaint that he’s oversimplifying Yudkowsky’s ideas, OP laments that his efforts at a fuller explanation were stymied by the mind-boggling scope of conspiracy theory whackamole:

there are so many arguments to cover, and it probably would have been ~3 or more times longer than this post

A reminder about Yudkowsky’s total failure to predict the importance of artificial neural networks evolves into a debate about the invention of the airplane, in which we are further treated to corrections regarding Yudkowsky’s overconfident bullshit about the history of aviation.

[deleted]

Incredible reply: "I think you should use a manifold market to decide on whether you should read the post, instead of the test this comment is putting forth. There's too much noise here, which isn't present in a prediction market." Even better, the prediction market question isn't "should EY read the post" but "If EY read the post, would he admit he was wrong". Everyone bets no, coming to the same conclusion I did, but probably for different reasons.
In a nearby counterfactual universe where Yud is more self-aware, this would be a god-tier troll.
oh neat, that's the same iceman that wrote the original "Friendship is Optimal" optimizer-horror-story fanfic about a decade ago
Today in: struggles between my desire to know more and the desire to keep what is left of my sanity.
you can find Yud's response to it here: [https://hpmor.com/notes/progress-13-03-01/](https://hpmor.com/notes/progress-13-03-01/) (beneath the wall of text on Michael Vassar's Thiel-funded rational-medical-paper-judging startup that seems to have since failed due to lack of interest and the difficulty of their promised product of "review the entirety of medical literature on a topic pertaining to your medical issue, better than specialist doctors would"). Note: I do not believe the authors of these stories found them anywhere near as horrifying as Yud did.
ow god, they are going to use GPT to bring back that startup... I just know it.
don't give them ideas. Their quoted prices for these literature reviews ranged into the hundreds of thousands of dollars lmao, and the justification for the whole thing is "someone researched solutions for their severed fingertip on their own and eventually found someone who could restore it"
I already saw a news post going 'gpt found cancer in my dog!' so we are in for a horrible new thing capitalism will imagine. Rich people were not ready for 'somebody skimmed some papers' but poor people who want healthcare will get no other choice. (This is assuming the rumors about GPT costing ungodly amounts of energy and supercomputers are a bit overblown).
the amount of compute needed to run a public server hosting a large model is indeed quite expensive, but the cost isn't really significant once you break it down to what it takes to generate a few outputs for a single user
From the articles I’ve read, most models can be trimmed substantially without losing their effectiveness, and if you restrict them to a specific purpose they can be pruned even more, to the point that they can be run on a smartphone. So at least it seems the biggest energy cost is the initial training; once you deploy them they can be incredibly efficient.
Old-style models or the new GPT-like ones? (Still amazed this is all caused by LLMs; I always had the impression (when I was learning about them) that neural networks were a bit of a played-out field due to all the various flaws they have).
GPT-like LLMs specifically. [Like Llama by meta](https://arstechnica.com/information-technology/2023/03/you-can-now-run-a-gpt-3-level-ai-model-on-your-laptop-phone-and-raspberry-pi/) for example, or [this](https://arxiv.org/abs/2301.00774) article where they managed to prune half the parameters in a single pass without affecting accuracy.
Ah that is cool! Thanks.
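For anyone wondering what "pruning half the parameters" actually looks like in practice, here is a minimal sketch using PyTorch's stock pruning utilities. This is plain magnitude pruning, not the one-shot method from the linked arXiv paper, and the layer sizes are invented purely for illustration:

```python
# Minimal sketch of unstructured magnitude pruning with PyTorch's built-in
# torch.nn.utils.prune utilities. This is NOT the one-shot method from the
# linked paper; the toy layer sizes are hypothetical, chosen only to illustrate.
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical toy model standing in for a couple of transformer-style linear layers.
model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))

# Zero out the 50% of weights with the smallest absolute value in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weight tensor

# Check the resulting sparsity: roughly half of all linear-layer weights are now zero.
linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(m.weight.numel() for m in linears)
zeros = sum(int((m.weight == 0).sum()) for m in linears)
print(f"overall sparsity: {zeros / total:.2%}")
```

Note that zeroed-out weights only turn into real memory and speed savings when the model is stored and executed in a sparsity-aware format; the pruning step itself just decides which weights to drop.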
If you're not aware, that startup failed over a decade ago. Michael Vassar, the mastermind behind MetaMed, has been behind even worse cultish conspiracies among rationalists since, including allegations of sexual abuse recently reported in TIME magazine. Like with everyone else sneered at here, they're always worse than you think. You can search this sub for more details, though I can also dish a bit if you'd like. It's almost shocking that nobody has tried to prosecute him yet.
I know, but still, thanks, as a lot of people don't know. And as Silicon Valley VC culture is always looking for the next 'disruption', saying they will bring this back with GPT is an easy bet (only now it might be for poor people who cannot say no).
> (beneath the wall of text on Michael Vassar's Thiel-funded rational-medical-paper-judging startup that seems to have since failed due to lack of interest and the difficulty of their promised product of "review the entirety of medical literature on a topic pertaining to your medical issue, better than specialist doctors would")
It is worse than your summary makes it sound! https://rationalwiki.org/wiki/MetaMed
I had assumed including the words "vassar", "thiel", and "`better than specialist doctors`" was scathing enough, but next time I will include more corpse-flowery language in my sneers, to make my disdain more obvious
Oh, no, you're fine, your post made it sound bad, it's just somehow *even worse*. Actually, the worst bits don't make it into the article, I apologize, but sometimes people in the sub who actually knew Vassar in that era comment on how bad it was (which is partly about how bad Vassar is).
note that the RationalWiki article (which I mostly wrote) is *mild* on the dangerous stupidity of these bozos and the multiple angles of fuckery and mistreatment of their workforce, and frames it just as medical ineptitude
o no the accursed slab of ponyfucking
Oh man, I had so much fun reading that / it's where my reddit username comes from. Just the right amount of fridge horror. Simpler times.
omg

[deleted]

It's always true that if Yudkowsky was [self-aware and intentionally ironic in any instance], it'd be one of the funniest things he ever said.

apparently he misunderstood Yudkowsky because Yud’s intended meaning is so brazenly stupid that it never occurred to OP to interpret him in that way.

Mood.

I find it hilarious how even the slightest bit of pushback shatters his whole facade

Looks like at least one LW'er is becoming aware of Yud's narcissism from this:
> By responding as if Quintin was seeking your personal attention, rather than the attention of your audience, and by explicitly saying you'll give him the minimum possible amount of your attention, it implicitly frames Quintin's goal as "summoning Eliezer to a serious debate on AI" and as chiding him for wasting your time by raising a public clamor regarding ideas you find basic, uninteresting, or unworthy of serious debate - though worthy of spreading to a less-informed mass audience.
>
> Instead, I think Quintin is stepping into the same public communications role that you were doing on the podcast. And that doesn't actually demand a response from you. I think it is common for authors of fiction and nonfiction to allow their audience and critics some space and distance to think through and debate their ideas.
>
> It's this disconnect between what I think Quintin's true goal was in writing this post, and the way your response reframed it, that I think rubs some people the wrong way.
The review on this sub of Yud's recent podcast interview was not bad, though it's worth knowing that even his own rationalist and AI safety communities have in recent months started trying to sideline him for nonstop spouting of quackery even they can't stomach.
It's very simple: Your average LWer is interested in tech, derives value from discussing the development of tech, and EY was a convenient node for organizing audiences to discuss those interests. EY is interested in himself, derives value from organizing conversations around himself, and tech was a convenient node for organizing audiences to discuss his singular interest.
I'm laughing at how Yud's critic is still writing a comment that holds Eliezer's hand to guide him towards how full of crap he is.

This is kinda long

Amazing for a guy who wants to be taken seriously as the AI alignment guy.

This is what actual research often is, Yud: long posts. Not weird tweets. (And I know he likes to play semantics and go ‘I never called myself an AI researcher’, but clearly he wants to be taken seriously).

E: also fun to see how much LW is just weird in-crowd jargon. Wonder how much of that is to hide that they are not seriously talking about AI alignment, or in LW words, how much it is an intentional obfuscation attempt by poisoning the memeplex. (Behaviour which is of course part of the dying wizard sneerclub meme)

Doubly amazing for a guy whose entire career consists of making excessively long internet posts.
**Yudkowsky:** people should respect each other's time and be concise
**Also Yudkowsky:** what the world needs most is a 600,000 word harry potter fanfiction story about how I am history's greatest genius
Yeah, that too. It is all just amazing. Here comes an outsider, clearly not a person who just wants to sneer; they are actually interested in the material and have a lot of remarks (and I looked it up, they have been an active(ish) poster on LW since 2020). A great opportunity to put all those questions to rest and dazzle them with your great intellect. Yud: tl;dr, LOL. At least he put his money where his mouth was, and he has not posted on LW in the 7 days since. (Wonder if he tweeted (not going to check, rhetorical question).)
As an ex-rationalist myself, I wouldn't be surprised if the main source of growth for this sub is ex-rationalists who can no longer accept how ludicrous Yud and his acolytes have become.
I came here via Scott myself.
**But first Yudkowsky:** Here's the million-word first draft of my from-first-principles philosophical edifice answering all things, which you are expected to memorise to join our phyg
I had already forgotten that cult accusations are so common he rot13ed it
I just checked, and typing in "lesswrong" on google still autocompletes to "lesswrong cult".
Perfection.
Same for Bing and DuckDuckGo. No for Yahoo, though
it was actually the phygists who did, then adopted it!
> **Also Yudkowsky:** what the world needs most is a 600,000 word harry potter fanfiction story about how I am history's greatest genius
In Yud's defense, surely his 600,000 word fanfic perfectly respects people's time, since what could be more important than listening to the teachings of a singular genius whose work will both save the world from eternal torment and usher in heaven on Earth.
AI alignment is just LW jargon; it's not the terminology used by actual AI ethics people
Eliezer years ago mostly retreated from LessWrong to his more personally groomed feeds on Facebook and Twitter after the direct feedback from the very rationalist community he founded became too much for him to cope with.

overall a fantastic post, and the response to it definitely gets my sneer going

though,

have you ever, even once in your life, thought anything remotely like “I really like being able to predict the near-future content of my visual field. I should just sit in a dark room to maximize my visual cortex’s predictive accuracy.”?

this part does fall kind of flat for me. OOP doesn’t seem to have, at time of writing, thought much about how this example works in a world where people can get overwhelmed by their senses e.g. from autism

and after further thought, OOP seems to agree, and bring in the needed nuance:

Edit: On reflection, the above discussion overclaims a bit in regards to humans. One complication is that the brain uses internal functions of its own activity as inputs to some of its reward functions, and some of those functions may correspond or correlate with something like “visual environment predictability”. Additionally, humans run an online reinforcement learning process, and human credit assignment isn’t perfect. If periods of low visual predictability correlate with negative reward in the near-future, the human may begin to intrinsically dislike being in unpredictable visual environments.

However, I still think that it’s rare for people’s values to assign much weight to their long-run visual predictive accuracy, and I think this is evidence against the hypothesis that a system trained to make lots of correct predictions will thereby intrinsically value making lots of correct predictions.

“I am overwhelmed so I will make sure I stay in calm environment” is different from “I want to make sure I score as high as possible on my prediction rankings, so I will keep my eyes closed and win the ‘will I predict my visual sensory stimuli correctly?’ score contest”
Yeah, I have autism and ADHD and I tend to listen to an album on repeat all day at work in order to block out the majority of sound for this reason. Being able to "predict" the sounds I will hear certainly helps with focus.
Even if they're put out in an ass-backwards way like this, it's tragic how crucial insights like this will become an afterthought at best among rationalists, while just any hot take du jour from a rationalist Thought Leader™, no matter how bad, will become canonical wisdom.

Top class sneer.

It will be memory-holed; getting scared about sci-fi is more fun than whatever this is.

The failures of AGI are going to be like global warming, not a failure of “alignment” of the AGI itself. That is, the real problem is the purposes humans put them to, which is a non-trivial problem to solve, but one which EY’s alignment completely fails to address.

🍿

Can someone go respond to the “this is kind of long” comment and just ask him if he has read the sequences?

Glorious. Simply glorious.

[*] I’ve just realized that I can’t name a way in which airplanes are like birds in which they aren’t like humans.

lol

The discussion of this is also hilarious, with a lengthy back-and-forth about what exact inspiration the Wright brothers drew from bird wings. At least on a skim, no one seemed to be able to identify the extremely obvious “has parts (called wings) which generate lift when moved through the air”. I’d half expect someone to link to Looney Tunes to show that you can generate lift just by flapping your arms hard enough.

I think the funniest thing about all of this is that AI alignment and safety as it’s currently practiced in the ML community is more like nurturing a child than a branch of hard engineering and so of course rationalists are always gonna fail spectacularly at understanding it because the idea of “Teach the machine empathy” just never occurs to them because they’re terminal math fetishists.

This is kinda long.

How long? Like more than 2 sargons?

Recently freed from the shackles of gainful employment

Huh? I thought he had never held down a conventional job, but also had patrons who would pay him to post whatever whenever…what’s he lost?

One of the things he said in the recent youtube interview is that he is no longer working at MIRI, which is the ostensible research organization that he cofounded. It seems that he is no longer formally employed even as a free-range internet blogger. It's possible that he's still having some rich guys pay his bills on an informal basis, but he's stopped maintaining even the *veneer* of traditional employment or institutional affiliation.
He was replaced as director of MIRI ages ago. He's still listed as full-time research staff. His concept of full time is unusual, of course.
Maybe they didn't update the website? He seemed to be saying that he's no longer with them at all.
He's only presently lost the official/on-paper employment at MIRI as a registered NPO. That doesn't mean he can't net for himself whatever counts as a satisfactory income from his supporters. It just means he doesn't currently have an institutional home supporting whatever posts he puts out and calls "research."

e.g. giving a 5% estimated chance of robot apocalypse

IIRC this is pretty in line with what the average AI researcher thinks.

It isn't. The idea that AI might autonomously decide to cause a robot apocalypse without human intervention is an extremist fringe belief that is based entirely in science fiction. The appropriate estimate for that outcome is 0%. Some researchers might creatively interpret the question to mean "what are the odds that *humans will use AI* to cause a robot apocalypse?", and that would certainly get you an answer above 0%. But that's a different question from what OP is addressing here.
Old post, I know, but thanks for writing that. I thought I was crazy. I saw the interview with Lex Fridman, and they totally skipped any mention of how a text bot that reacts to questions would suddenly have the intention to do anything other than answer the question. It's kind of crazy that nothing even close to that came into play in three hours of bullshitting. But this "intent" problem is mentioned nowhere else, at least not on reddit.
It's just a classic mass hysteria, like the satanic panic of the 1980s or the various witch hunts from long ago. People are freaking out because other people are freaking out about things that none of them understand. There's nothing more frightening than the unknown.

I used to respect El as an autodidact who had built a name for himself in the AI community and who cofounded LessWrong and MIRI, but the latest fiasco has hurt his credibility. I just hope that he refrains from making such over-the-top claims in the future.

It's not just the latest fiasco that has hurt his credibility. He has *never* been known or respected in the AI community.

I was fr rolling in immense delight reading this glorious obliteration of Yud