r/SneerClub archives
New AI x-risk hadith dropped, wherein a polytheistic model is introduced (https://www.reddit.com/r/SneerClub/comments/11r4rv8/new_ai_xrisk_hadith_dropped_wherein_a/)
55

Maybe by the time the world-killer arrives, we’ll have a lot of intermediate AIs sort of on our side that are only a little less intelligent than the world-killer, and the world-killer won’t have an overwhelming advantage against us. For example, maybe in 2050, some AIs will warn us that they can see a promising route to turn virus XYZ into a superpathogen, we will have millions of AIs work on XYZ vaccines, and then the first AI smart enough and malevolent enough to actually use the superpathogen will find that avenue closed to them.

https://astralcodexten.substack.com/p/why-i-am-not-as-much-of-a-doomer

Looking forward to his seminal work on the topic, Evangelion: You Will (Not) Adjust Priors

This goes directly against the Yuddite dogma of smartN+1 beats smartN.

Heretik!

Honestly the basic premise here seems like a path back towards sanity. I assume there are some nutty bits in there, but the idea that we will probably make a lot of AI systems of varying degrees of smartness, and that we will develop them incrementally, seems both A) more plausible than hard takeoff AGI scenarios and B) vastly less apocalyptic. If Scott is willing to believe that instead of "ChatGPT 5 will literally be God", I don't know that I particularly feel the need to mock him for it.
Oh certainly, this isn't a bad development. Not sure if it will work, however, because following the logic of a large number of smartN- AIs beating 1 smartN AI, you can also see that a large number of smartN-- non-AIs can beat a smartN- AI. So the whole project of Rationalism (as a way to stop the evil AGI) will fall apart, and Rationalists might finally start to listen to non-longtermist people. So I'm not really mocking him for this idea, more weirdly pointing out that this idea kinda removes one of the pillars of Yud-style Rationalism. Welcome to sneerclub Scott. (Of course, Scott-style Rationalism (with its pro-NRx shit) is prob worse.)
yes but what if smartN times 3^^3
Yes he's clearly missing that they all engage in acausal trade with each other. Chessmate.
But there are more of them!

This stuff reads more and more like science fiction to me. It’s just so… unserious and unrigorous. It’s great fodder for scaring the shit out of 22-year-olds and converting them to your cause based on really dodgy math. The math of infinitesimals is kinda fucked: take lim n→0 of 1/n and you end up with infinity. So it doesn’t matter how small the probabilities are; once you measure them against the risk you end up with ‘life is gonna end’ – which is always true. Everyone is gonna die.
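To spell out the arithmetic behind that sneer (my own gloss, not anything from the post or the comment): once the stakes are allowed to be unbounded, the expected-value calculation stops depending on the probability at all.

```latex
% A minimal sketch of the "tiny probability times unbounded stakes" move.
% p is whatever doom probability you like; L is the disutility of doom.
\[
  \text{for any fixed } p > 0:\qquad
  \lim_{L \to \infty} \bigl( p \cdot L + (1 - p) \cdot L_{\text{mundane}} \bigr) = \infty,
\]
% so the conclusion "act as if life is gonna end" comes out the same no
% matter how small p is -- the same way 1/n blows up as n goes to 0.
```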

I do believe that rationalism and accelerationism and e/acc and the like is religion for nerds who don’t want to believe in god.

[deleted]
Mind sharing that essay when it's done?
ooh that sounds like it'll be good
> I do believe that rationalism and accelerationism and e/acc and the like is religion for nerds who don't want to believe in god.

All the AGI talk has been this way from the beginning. Go to church, guys. Get a hobby. Deliver food to the elderly. Go do something useful.
There are probably some really shitty thinkpieces to be written about the need to direct and structure the ambiguous desire to do Good, and how quickly and easily this leads towards doing silly or nonsensical or at best suboptimal things.
> Deliver food to the elderly

ah but what does EA tell us - old people or castles? it's just math
Castles and elderly people are *the same thing* you fool
What good is doing something concrete that your instinctual emotions might validate as real, if there's even a chance you might miss the opportunity of indulging in abstract fears that reward you with a fleeting, contrarian sense of smugness?
and they’ll mock “statist regulators” for using linear models without realizing the irony.

But what if one of the good AIs working on the vaccines is actually the world killer in disguise? Of course, enough good AIs will expect this, so there will be lots of suspicion thrown around, until the good AIs fight it out and destroy each other.

…which was the REAL world killer’s plan all along, who was simply waiting for this to happen before destroying us all, unopposed.

This is in fact a hypothetical that is discussed very seriously and at excruciating length:

> 2: How likely are AIs to cooperate with each other, instead of humans? Consider the sugar-crystal maximizer from the Interlude. It might be willing to make cake forever without rocking the boat; it does like cake a little, just not as much as crystals.
>
> Or it could tell humans, “I’ll make cake for you if you pay me in sugar crystals”.
>
> Or it could leave a message for some future world-killer AI “I’m distracting the humans right now by including poisons in their cake that make them worse at fighting future world-killer AIs; in exchange, please give me half of Earth to turn into sugar crystals”.

And so on and so forth.

Goddamnit guys, you’re not supposed to literally reinvent paganism. Fuck it, it’s a slow start today:

The average online debate about AI pits someone who thinks the risk is zero, versus someone who thinks it’s any other number. I agree these are the most important debates to have for now.

If I wanted to be charitable, I would read this as ‘most important debates on AGI to have’ but I’m here, so I’m not interested in being charitable.

Eliezer Yudkowsky seems to think >90%

As written this makes it look like everyone except Eliezer is <=50%, which isn’t true; I’m just having trouble thinking of other doomers who are both famous enough that you would have heard of them, and have publicly given a specific number.

Sweet baby Jesus.

So there’s no definition of what the bad outcome is here, it’s just “we die cause AGI” which is something of a red flag. I have a feeling this is going to be relevant later.

some superintelligence with inhuman values is monomaniacally focused on something far from human values (eg paperclips) becomes smart enough to invent superweapons that kill all humans, then does so.

Not that long, apparently. It’s worth pointing out that “superintelligence” remains pretty unexamined for being so central to the “why we all die” argument; any definition of intelligence that maximizes paperclips doesn’t seem particularly super to me.

The world-killer needs to be very smart - smart enough to invent superweapons entirely on its own under hostile conditions. Even great human geniuses like Einstein or von Neumann were not that smart.

Gotta get our Einstein reference in. I know I sound like a broken record on this, but again, “smart” is not really well defined. The idea that there is only one thing that constitutes intelligence seems like a baseline assumption of these fellahs.

(if you’re imagining specific years, imagine human-genius-level AI in the 2030s and world-killers in the 2040s - not some scenario with many centuries in between)

So the optimists’ question is: will a world-killing AI smart and malevolent enough to use and deploy superweapons on its own (under maximally hostile conditions) come before or after pseudo-aligned AIs smart enough to figure out how to prevent it (under ideal conditions)?

Let me rephrase: will there be sufficient angels on the head of this pin to establish dystopia?

JESUS WHY IS THIS SO FUCKING LONG

Here’s a similar problem you might find easier: suppose you are an American. But due to a bug in your motivational system, you don’t support the United States. You support (let’s say) the Union of Soviet Socialist Republics. In fact, this is the driving force behind your existence; all you want to do with your life is make the USSR more powerful than the US.

I am sure this is a completely neutral example that is not indicative of any ideological bent in the writer. I’m equally sure it will go entirely smoothly and that this is a good example that will not lead to any absurdity.

But if an AI had a bug in its motivational system, maybe it would do better. Maybe it would act like a sleeper agent, pretending to be well-aligned, and wait for opportunities to strike.

It is incredible to me that at every point, these guys display such a profound lack of imagination about the way that these entities are structured and behave. They have to square the circle of using human examples of cognition and behavior to describe things a priori inhuman in environment, senses and structure, while saying that it will be a super intelligence that will just do what we do but MOAR.

Eliezer Yudkowsky worries that supercoherent superintelligences will have access to better decision theories than humans - mathematical theorems about cooperation which let them make and prove binding commitments with each other in the absence of explicit coordination.

I want to point out that the Logical Positivists, who had vastly more intellectual firepower than MIRI (something Yud doesn’t seem to understand), had zero luck developing a universal grammar and syntax for provable statements only. Also, and I’m not an expert in the proof, but it seems like Gödel would rule out, in principle, a system that is both complete enough to describe every possible interaction between agents, and logically coherent enough to be provable in the sense Yud is making.

4: How easy are superweapons? The usual postulated superweapon is nanotechnology: large-molecule-sized robots that can replicate themselves and perform tasks. Get enough of these, and they can quietly spread around the world, quietly infect humans, and kill them instantly once a controller sends the signal.

God damn it guys, you’re literally using a plot from a GI Joe movie.

I’m snipping the really good reasons why you shouldn’t worry about this because:

Eliezer Yudkowsky takes the other end, saying that it might be possible for someone only a little smarter than the smartest human geniuses. He imagines, for example, a von Neumann level AI learning enough about nanotechnology to secretly train a nanobot-design AI. Such an AI might work very well - a chemical weapons designing AI was able to invent many existing chemical weapons - and some that might be worse - within a few hours.

I swear to fucking god these people are profoundly contemptible.

How detached do you have to be from other people to describe a disagreement with your government as a 'motivational bug'? I can't imagine it at all; it's such an utterly inhuman way of thinking.
I think these people are very much *not* detached from other people and all of this is a defense mechanism for extreme fragility.
> I'm not an expert in the proof, but it seems like Gödel would rule out, in principle, a system that is both complete enough to describe every possible interaction between agents, and logically coherent enough to be provable in the sense Yud is making.

I want to push back against this particular criticism. There is precisely one thing in the entire world for which you can't reasonably criticize MIRI, and that is criticizing them for not talking about Gödel enough. Roughly one third of MIRI's research output is about jerking off to Gödel's incompleteness theorems and to Löb's theorem, which is a generalization of Gödel's second incompleteness theorem. Pretty much all of MIRI's research output is about irrelevant and esoteric questions of mathematical logic that have no practical relevance to AI. So if you want to know how an AI can provably trust another AI in the presence of Gödel's incompleteness, then you can read this 40-page MIRI paper, which seems to have largely been written by Yudkowsky himself: https://intelligence.org/files/TilingAgentsDraft.pdf

In case you wondered what the people at MIRI actually do all the time, now you know.
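For anyone who wants the actual statement being jerked off to (my own paraphrase in standard provability-logic notation, not necessarily the paper's wording):

```latex
% Löb's theorem, for a sufficiently strong theory T with provability
% predicate \Box:
\[
  \text{if } T \vdash \Box P \rightarrow P, \text{ then } T \vdash P.
\]
% Gödel's second incompleteness theorem is the special case P = \bot:
% if T could prove its own consistency, i.e. T \vdash \Box\bot \rightarrow \bot,
% then by Löb T would prove \bot, i.e. be inconsistent.
```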
> So if you want to know how an AI can provably trust another AI in the presence of Gödel's incompleteness, then you can read this 40-page MIRI paper, which seems to have largely been written by Yudkowsky himself: https://intelligence.org/files/TilingAgentsDraft.pdf

Was it him, or was it the coauthor with a mathematical background?

EDIT: okay, four minutes scrolling through the prose, it's very EY, but no way any of the math...
Yeah peep the first footnote, it's pretty clear others did the math.
yeah, a pile of these names are rationalism's actual mathematicians
Why did I download that pdf? Why? First footnote:

> *The research summarized in this paper was conducted first by Yudkowsky and Herreshoff; refined at the November 2012 MIRI Workshop on Logic, Probability, and Reflection with Paul Christiano and Mihaly Barasz; and further refined at the April 2013 MIRI Workshop on Logic, Probability, and Reflection with Paul Christiano, Benja Fallenstein, Mihaly Barasz, Patrick LaVictoire, Daniel Dewey, Qiaochu Yuan, Stuart Armstrong, Jacob Steinhardt, Jacob Taylor, and Andrew Critch

Strong "I'm stealing first author spot despite doing nothing past being present once" energy. I don't doubt these guys talk about Gödel a lot, but it doesn't seem like Yud understands the implications for formal logic of the type he thinks AGI will be capable of? I dunno, like I said, I'm not an expert here; this is all leftovers from studying logical positivism and some classes on logic a decade ago.
It does feel like they're starting from a position where this kind of communication across time without direct interaction is definitely possible, and then started looking at Gödel and incompleteness as problems to be solved rather than fully considering the implications for their hypothesis. There's probably another version of this that's rooted in decidability. Off the top of my head, given the recursive nature of the hypothesized negotiations (I know that he knows that I know...), it feels like any attempt to actually engage in this kind of consideration would run afoul of the halting problem. Also, I too have little to no real academic background here, and if I'm talking nonsense I have no aversion to being told so.
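As a toy illustration of that regress (purely my own sketch, not anything MIRI or the post actually proposes), here is what happens if two agents each refuse to commit until they have finished simulating the other:

```python
# Toy sketch only: two agents that each decide by fully simulating the other
# first. The "I know that he knows that I know..." regress never bottoms out,
# so Python eventually gives up with a RecursionError.

def agent_a():
    # A cooperates only if its simulation of B says B will cooperate.
    return agent_b()

def agent_b():
    # B cooperates only if its simulation of A says A will cooperate.
    return agent_a()

if __name__ == "__main__":
    try:
        agent_a()
    except RecursionError:
        print("mutual simulation never terminated; no commitment was ever established")
```

(Presumably this is part of why the MIRI papers reach for provability logic rather than brute simulation; whether that actually escapes the problem is a separate question.)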
> and then started looking at Gödel and incompleteness as problems to be solved

oh god lol that's perfect
> Goddamnit guys, you're not supposed to literally reinvent paganism

is it an advance on reinventing a vengeful Yahweh tho

we will have millions of AIs work on XYZ vaccines

Then the vaccine is what gets us! *Taps head*

Other take: How long before the narrative is some version of Brave Little Toaster

Side bar, this is the plot of my personal sci-fi universe I’ve been scratching away at for years. A pantheon of AIs develops, there is a “shadow war in heaven”, and a benevolent caretaker type wins and then keeps a lid on any new superhuman AIs emerging. It’s my cheat code for having stories set in the medium future without having to deal with a singularity or powerful AI characters.

Which I guess would be a sort of impossible utopia to the Yuddites. Come to think of it, I really should include them as a messianic cult in there somewhere.

amazing

Here’s a similar problem you might find easier: suppose you are an American. But due to a bug in your motivational system, you don’t support the United States. You support (let’s say) the Union of Soviet Socialist Republics.

That’s not a bug, that’s a feature.

Username cheka's out.

Also apparently monomaniacal computer genies/paperclip maximizers will be rebranding as ‘supercoherent AI’ in the near future.

What if instead one such smaller but nonetheless sentient AI started to manipulate events on a global scale to force a burnt-out but experienced hacker to help them unite with another such small AI, not to become a world killer but rather to dissolve into the matr... the internet?

Yes, I too have seen Person of Interest.

I’m curious which side biological viruses would take: in favor of being weaponized by the “god” AI to overwhelm us as a means to short term success but possible destruction of their chain of dependence on biological matter, or being ruthlessly hunted down by the “lesser” AI in defense of our own existence, at risk of accidentally being too thoroughly eradicated should those lesser AI be successful at protecting us.

God, I’d hate to live with the existential crisis viruses are experiencing right now. /s