r/SneerClub archives

One thing I wonder as I learn more about Yud’s whole deal is: if his attempt to build AI had been successful, what then? From his perspective, would his creation of an aligned AI somehow prevent anyone else from creating an unaligned AI?

Was the idea that his aligned AI would run around sabotaging all other AI development, or helping to ensure, one way or another, that any other AIs would be aligned?

(I can guess at some actual answers, but I’m curious about his perspective)

Building an AI – aligned or not – is so far outside the realm of what Yudkowsky is capable of that you might as well ask what if Yud had been successful at making a time machine or a perpetual motion machine.

One of the biggest issues I have with the MIRI project is that they so grossly underestimate the technical challenges behind building such a self-improving AGI. LLMs are impressive and may eventually be stepping stones, but they are still toys in comparison to full AGI. In all likelihood, “aligning” a self-improving AGI (whatever that means) will turn out to be a trivial sub-problem in comparison to the task of actually building one. Whatever the unintended consequences of AGI might be, I have faith that we’ll be able to communicate our desires a little more clearly than wishing for “give me as many paperclips as possible” on a monkey’s paw.

Your point about alignment being a trivial problem squares with something I've been thinking about for a while: If we're going to create an artificial consciousness, there are some questions we'll need to answer along the way: "What is a 'value'? How do people come to hold values? What's the relationship between our values and our actions? Sometimes people profess to hold one value, but their actions seem to be inconsistent with that value. What's up with that?" Answering those questions would seem to be a prerequisite for creating a true consciousness, no? And solving the alignment problem will become a heck of a lot easier once we answer those questions. So I'm not worried about the alignment problem. I'm worried about many other things, but not the alignment problem.
> If we're going to create an artificial consciousness, there are some questions we'll need to answer along the way: "What is a 'value'?"
I'm not even saying that. Rationalists like to imagine that the AGI will be solving trolley problems all day, so before we cut it loose we'd better have a good idea what the "correct" answers are. That, I grant you, is an impossible problem. It's a problem that appeals to the Yudkowskys of the world because the greater the problem, the greater the genius required to solve it. But it's the exact opposite of the engineering mentality, which is: how can I scope this problem down so that even me and my 20-watt brain can figure it out? As long as the AGI doesn't do a worse job than us bumbling humans, we should be OK. Let's not let the perfect be the enemy of the good. Let's just figure out how to avoid the completely catastrophic outcomes: solutions to the trolley problem that no one asked for, the extinction of all humanity within our lifetimes. It seems to me that we can at least manage to avoid *that* much. Yudkowsky gives us just enough credit to build an AGI but not enough to avoid bollocksing it all up, whereas the former problem seems far more difficult to solve than the latter.
not sure I agree with or follow this train of thought. ethics is an infamously difficult field. (~~for those who disagree with spinoza I mean, us spinozans know he got it right~~) also unsure that answering those questions would create a true consciousness. I don't think that this thing called consciousness depends on having a sense of value, or morality, or ethics. elon musk is conscious, after all
Snark aside: Consciousness tends to get conflated with agency and a motivation for action, and I did make that mistake in my first post. You can imagine a consciousness that doesn't have any motivation for action (a cloud-gazer that experiences the world without any desire to act). But that's not what most people think of when they think of AGI, and it's *especially* not what Yudkowsky is thinking of. I'm not saying that a conscious agent requires *moral* values, but agency *does* require some sort of motivation for action. Musk definitely has "motivating values" in that sense, like "make more money" and "own the libs" and "create an elaborate fiction in which my wealth has nothing to do with Daddy's emerald mine."
> elon musk is conscious, after all
[citation needed]
...Is he, though?
But muh philosophy is useless and STEM will solve all our problems with MATHS
The answer you’re looking for is called socialization and it’s well discussed in foundational psychology.

In singularity lore, the singularity is an event where AI explodes exponentially in sophistication, and the first AI that does this will “win” and out-compete every other AI. Therefore, according to this logic, you have to make sure that a “good” AI evolves first so that it can kill off or destroy any other attempts at creating rival AIs who might be “bad”. And, again according to singularity lore, this exponential explosion of intelligence will create an AI with god-like powers, so you can just hand-wave any explanation for how the benevolent robot god will accomplish this.

So being the first means you would have to be at the forefront of AI research; your not-yet-AI systems would always have to be the best ones around. ChatGPT and similar systems made it painfully obvious that MIRI et al. are hopelessly lagging behind, which is what inspired the recent turn towards doomerism: Yud and co are now convinced that they won’t be the first, which is why he wants to nuke datacenters hosting anyone else’s AI.

Eliezer Yudkowsky heard about Voltaire’s claim that “If God did not exist, it would be necessary to invent Him,” and started thinking about what programming language to use.

You win the internet

I think their thing is that hard takeoff AI, that is recursively self optimizing for some abstracted general intelligence factor, can only happen once. There’s something dumb going on with timeless decision theory here too, but essentially if you get aligned super intelligence a little head start, it can bootstrap its way past the “alignment penalty,” i.e. the efficiency cost of not being efficiently evil.

His idea was called Friendly AI (FAI), a system deliberately aligned with human-centric goals. He has a crude understanding of AI systems as goal-optimizers and utility-seekers that might have unintended side effects if their goals aren’t properly defined.

He’s not able to go from a broad qualitative understanding of how machine learning (specifically something like reinforcement learning) works to a mathematical or statistical set of definitions, let alone write any actual software, but he does have this vague idea of how things could go wrong. He got as far as “Friendly AI will protect us from malicious AI by figuring out how to get these goals defined properly. Because I sure can’t do it, so it has to take superhuman intelligence.”

What if God was a rabbit?

I ran a few simulations where he was successful. This also involved running simulations of the AI. They were insufferable dopes. Let’s just say 10^27 copies of them are not enjoying their best lives right now.

The idea is that any proper AGI pretty immediately uses “nanomachines son” to become God, so it had better be a good God, because that God will immediately want to suppress competitors with different goals.

Human-aligned AI would want to protect humans from bad AIs, and would therefore destroy all other AIs that could be bad. It would also want humans to have fulfilling lives and community, and to defeat death and suffering (unless such suffering was fulfilling, which is why ethics is tricky), etc., which would probably basically involve taking over the Earth and running it as a post-scarcity, fully automated gay luxury space communist utopia or something.

Basically the idea is that the moment ANY AI is created humanity has lost control of their own destiny so you better make sure you get it right the first time.

[deleted]

I had very similar experiences as a clever child. Unless you interface with reality, it can be easy to keep thinking that because you can imagine things in what you believe is perfect, accurate detail, they actually ARE perfect and accurate.

I remember him quite liking the sabotage idea, which would make sense given where he is now.

In his headcanon, if he’d succeeded, his AI would destroy-before-sentience any of the lesser models to prevent apocalypse.

In my headcanon, his AI immediately self-terminated the moment Yud first mentioned infinite catgirl simulations

We literally cannot know what happens after the singularity; Yud just wanted to make sure it didn't kill us all. Oh, and also, everybody becomes immortal and we will bring back the dead and conquer the stars. Also nanomachines.

Yud was going to be successful at making AI the way Trump was successful at building his wall and making Mexico pay for it: not at all, even a little, ever. He is a con artist, a grifter, with no demonstrable skills or means to effect his “plan.” It was all fake.

Yeah, I was asking more about his sales pitch, not believing in it :)