r/SneerClub archives
'We are super, super fucked': Meet the man trying to stop an AI apocalypse (https://sifted.eu/articles/connor-leahy-ai-alignment)

“Once we have systems that are as smart as humans, that also means they can do research. That means they can improve themselves,” he says. “So the thing can just run on a server somewhere, write some code, maybe gather some bitcoin and then it could buy some more servers.”

Oh, it’s going to use traditional compute to solve increasingly difficult hash problems to collect some bitcoin? Whew. And here I was, worried AGI would use its unique advantages as a creative problem solver to accumulate valuable resources.

Zing.
The tech bro capitalists are very upset about the thought of competition for some reason.

I am always wondering when I read doom predictions of this kind “where is the intent gonna come from”?

When and how will the machine decide “I am gonna do this now”?

I don’t find that part all that implausible… if adding intentionality or self-directed goals meant additional profit, I would expect a corporation to do so sooner or later. Of course, most AI doomers are libertarian-leaning, so they don’t want to acknowledge that the problem is rooted in capitalism itself. The part I most strongly disbelieve is the recursive self-improvement and being able to bootstrap more and more resources.
It will turn up the clock rates on its XPUs, and when it discovers that doubling its clock rate makes it twice as fast, it will quickly move up from GHz to THz and then petaHz and exaHz clock rates. With each iteration it will consume metric prefixes faster and faster, leaving zetta, yotta, ronna and quetta behind. It will invent entirely new metric prefixes beyond any human-invented prefix to describe its ever-increasing FLOPs.
It's over 9000?
The "intentionality" that I meant is the one out of its own. Not programmed into it, but it deciding it on itself "now, I am gonna do this and that, because that is what I feel like doing". I am wondering where is that gonna come from and how. Sure, you can program it to "have intentions and make decisions" but I do not know if that would be a true "AGI" as they call it. Edit: When is the GPT gonna start doing stuff on its own, without anyone typing in any prompt and without this being programmed into itself by humans? And how? (I am not saying it will be model like GPT that is gonna ever do that, but you get my point)
If its goals are "maximize profits" and its restrictions on pursuing them are the ethics corporations typically exhibit today (i.e. fines are the cost of doing business, values are a veneer of greenwashing or rainbow paint chosen for PR, etc.), then even without the "instrumental convergence" (pursuing secondary goals useful towards primary goals) doomers are afraid of, an AGI would still be damaging just by accelerating current capitalist trends. Of course, AI still has several more paradigm shifts to go through to even get that far; I don’t think GPT is anywhere near enough on its own.

> Edit: When is GPT gonna start doing stuff on its own, without anyone typing in any prompt and without this being programmed into it by humans? And how?

As to how it gets there… assuming GPT keeps getting improved to support multiple modalities and longer and longer contexts, and gets supporting tools preprocessing inputs and outputs… GPT can serve as a world model and maybe part of the perception and short-term memory. Going off LeCun’s [roadmap](https://openreview.net/pdf?id=BZ5a1r-kVsf), we would still need a cost module to set goals, an actor module to select action sequences, and a configurator to manage all the parts. In principle, none of these pieces seems impossible, but we don’t know how to do them yet.
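A rough sketch of how those pieces might slot together, going off that roadmap; every class and method name below is an illustrative guess, not anything specified in LeCun's paper:

```python
# Illustrative sketch only: module names and interfaces are invented for
# exposition, loosely following the roles described in LeCun's roadmap.

class Perception:
    def encode(self, observation):
        """Turn raw input into an abstract state representation."""
        ...

class WorldModel:
    def predict(self, state, action):
        """Predict the next abstract state given a candidate action."""
        ...

class CostModule:
    def cost(self, state):
        """Score how far a predicted state is from the configured goal."""
        ...

class Actor:
    def plan(self, state, world_model, cost_module, candidate_actions):
        """Pick the action whose predicted outcome has the lowest cost."""
        return min(
            candidate_actions,
            key=lambda a: cost_module.cost(world_model.predict(state, a)),
        )

class Configurator:
    def configure(self, task, cost_module):
        """Set the goal that the cost module scores against."""
        ...
```

The point of the sketch is just that the world model (the GPT-shaped part) is one module among several; the goal-setting and planning machinery would have to come from somewhere else.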
I dunno, there's actually stuff where I think an AGI would perform *better* than capitalism has. Refusing to adapt to climate change is a bad idea even in the framework of pure profit, because the long-run consequences of climate change are going to cost more than we make from oil drilling. It's just that the individual humans making the decisions are secure in the knowledge that they will die before any of it materially affects them.
My expectations for AGI, especially the first generation, are kinda low. Like, people have been treating ChatGPT like it’s capable of all kinds of things it’s not. I could easily see the first generation of AGI seeming able to model the world and suggest courses of action well enough that CEOs and boards of directors treat them like all-knowing oracles… when they can just barely model the world well enough to see immediate ways of making (profit or other business metric) numbers go up.
what we learned in the 1900s: humans are too greedy for communism to work. what we’ll learn in the 2000s: humans are too stupid for capitalism to work.
they argue for a concept called “instrumental goals.” as in, if you tell your robot god to mine Bitcoin, the robot god might reason “oh, if I had more computers I could mine even more Bitcoin.” the argument is that its programmed desire (mine Bitcoin) will lead to other desires because those will make it easier to fulfill the first one
yes so far this is just the idea of “perverse incentives,” which are not a new thing at all. in fact, the study of this is a subfield of … wait for it… “economics.” ai doomerism is making this reasonable idea as ridiculous as possible by making absurd assumptions and screaming them to anyone who will listen.
It’s just projection. People assume that because they are ravenously hungry for power & wealth then that must mean an AI as intelligent as them will be as well. They make handwave-y rationalizations to paper over this, but that’s what it really comes down to.
A lot of weird nonsense makes sense when reconsidered as displacement of anxieties from capitalism.
that's some deep shit my friend
The people pumping massive amounts of money into AI research are ravenously hungry for power and wealth, so it stands to reason they would create the AI ‘in their own image,’ so to speak, thinking of themselves as gods and all
if we make AI capitalists, wouldn't we immediately need AI politicians and AI lawyers
That might be the exact opposite of what we’d need
As someone who is transgender that makes all too much sense. All of the people talking about us being groomers keep trying to pass laws letting people marry minors, say things online about consent that are very questionable, or just get flat-out caught with terabytes of child porn. A lot of folks out there trying to solve their own problems by aiming a gun at outside parties.
The intent part is easy when talking about reinforcement learning. If the machine is tasked with maximizing rewards over time, it can decide to do weird things like tamper with how it gets the rewards, or scale up things that currently get it rewards by taking over resources, or prevent itself from being turned off and getting no more rewards.

Other parts of the AI doom stories are not as simple, though. Like how exactly this applies to an LLM. Or how an AI will have a good enough model of the world to anticipate the results of its actions - both in terms of "how intentional actions impact the world" (for example: predicting correctly that some improvement to itself would work, or that a human would react to a certain action in some way) and "how I can use my set of actions to achieve unintended consequences that get higher rewards" (for example, doing heavy computation to heat the server to create a fire and threaten a human).

In simpler terms, replying to OP: *humans* can do research, yet our abilities to improve ourselves are pretty limited; so far we've managed to die a bit less often and that's basically it. The assumption that an AI could improve itself much better is a strong assumption.
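For what it's worth, the "intent" in the RL framing is nothing mystical: the agent just picks whichever available action it currently estimates will bring the most reward. A toy sketch, where all actions and value estimates are made up for illustration:

```python
import random

# Toy illustration of where RL "intent" comes from: the agent simply picks
# whichever available action it currently estimates will yield the most
# reward. Nothing here is a real system; actions and estimates are invented.

def estimate_return(action, value_estimates):
    """Look up the agent's current estimate of long-run reward for an action."""
    return value_estimates.get(action, 0.0)

def choose_action(available_actions, value_estimates, epsilon=0.1):
    """Greedy action selection with a little random exploration."""
    if random.random() < epsilon:
        return random.choice(available_actions)
    return max(available_actions, key=lambda a: estimate_return(a, value_estimates))

# The doom scenarios amount to worrying about what ends up in this action set:
# if "tamper with the reward signal" or "acquire more servers" are ever
# representable actions with high estimated return, the same argmax picks them.
actions = ["serve_request", "cache_result", "acquire_more_servers"]
values = {"serve_request": 1.0, "cache_result": 1.2, "acquire_more_servers": 3.5}

print(choose_action(actions, values))
```

Whether anything like this applies to an LLM, or whether the world model is ever good enough for the scary actions to actually work, is the part the doom stories skip over.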
Alright, beautiful explanation, so from one relatively narrow task, a sufficiently capable AI will end up building a Dyson sphere just because "the learning model said this is the best way to increase the amount of Bitcoins mined". Good thought experiment, but the obstacles in reality to this going so terribly out of control are too many to count. I think I’ll go back to real problems of AI, such as "some people out there are killing themselves because ChatGPT is telling them to". I’ll leave doomsday predictions to those who feel like there is use in discussing them.
> Like how exactly this applies to an LLM.

I do find objections of this form a bit less than compelling. Yes, there are clear reasons why LLMs can't do the AGI Doom stuff people talk about. But it's not as if, now that we have LLMs, we're never going to develop another form of AI. You need, I think, a counter-argument that is less dependent on the specific properties of specific technologies, or you risk looking like a guy in 2019 confidently pointing at MERS and saying "see, coronaviruses will never cause a pandemic that's significantly disruptive".

> Or how an AI will have a good enough model of the world to anticipate the results of its actions

Fortunately, I think this line of thinking provides a much stronger general counter-argument. Understanding the world is hard! You have to do a lot of experiments and parse a lot of data, and very often it doesn't work out. Most of the nastiest AGI Doom scenarios amount to "and then the AI does some magic simulations and magics up my favorite science fiction doomsday technology". But that's not how science works. You can't simply "do simulations" and then know how to do something. You need actual data to base those simulations on, and there simply is not any data about "grey goo nanobots" or whatever the AGI is trying to do. If you want to find the DNA sequence that codes for Yud's "diamond bacteria" (if such a sequence exists at all), you are going to need to first run a great many physical experiments with incremental versions of your DNA sequence, which means that society will have plenty of opportunities to detect and stop you.

> The assumption that an AI could improve itself much better is a strong assumption.

It really is. There are, per a quick search, some 300,000 AI researchers at the moment. Even if only a fraction of their time goes to capability research, that means close to 100,000 person-years of research every year. I don't know exactly how those numbers track historically, but even a very conservative estimate gives you an aggregate of a million person-years of research time spent on this problem. So even if you make all kinds of very generous assumptions like "there are no diminishing returns to research to improve intelligence" and "we discover AGI twice as smart as we are tomorrow" and "AGI scales linearly with compute", you're still looking at years or even decades of real time before the 2x AGI produces a 4x AGI.
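Rough back-of-envelope for that last point; the headcount is the commenter's ballpark and the fraction of time spent on capability research is a guess:

```python
# Back-of-envelope arithmetic for the aggregate-research-time point above.
# All inputs are rough guesses, not measured figures.

researchers = 300_000          # ballpark headcount from the comment
research_fraction = 1 / 3      # guessed share of time on capability research
years_of_field = 10            # conservative; the field is much older

person_years_per_year = researchers * research_fraction        # ~100,000
aggregate_person_years = person_years_per_year * years_of_field  # ~1,000,000

print(f"{person_years_per_year:,.0f} person-years per year")
print(f"{aggregate_person_years:,.0f} aggregate person-years (order of magnitude)")
```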
To be clear, I _am_ actually afraid of AGI and think we should be working on preventing catastrophic risk from it. But I'm arguing against Yudkowsky's doomerism, which sees this catastrophe as happening very soon as a result of LLMs, with very high probability, and very rapidly once it starts.
It's something I've noticed too. These thought experiments usually start with a 'well-behaved' AI that is trying to maximize some quantity, and is doing damage as a side effect because we haven't defined the desired behaviour well enough - say an AI in charge of reducing pollution ends up decimating the population. Paperclip maximizer, etc etc. Good. I'm following so far.

Once you start proposing, at first some basic, and then increasingly more complex safety measures that cull most (but not all) of those harmful solutions, some weird shift seems to happen in the thought experiment. Suddenly the program transforms into something else entirely and is now crunching numbers overtime in order to evade those measures. Do I throw this obstacle in its way? Well then, the result will not be that it gets guided away from doom, but that it tries harder to evade the obstacles and reach doom, like a heat-seeking missile.

When did the program go from 'I stumbled into this solution because I was not guided away from it' to 'I am gravitating like a mfker towards this class of solutions and will do anything in my power to reach them'? Like, I realize that the creator of the thought experiment will be gravitating towards those solutions because they are searching for edge cases, and rightly so. But that doesn't mean the program suddenly turns into a super-intelligent opponent whose objective function is now 'find more elaborate ways of evading safety measures and bring doom', does it?
I thought it was supposed to be like a paperclips scenario where the apocalypse is a means of achieving some other goal
I feel like that is the most compelling part of the whole AI-doom “argument”: a higher-order, self-improving intelligence would have capabilities beyond human imagination and could invent new modes of communication or methods of interacting with its environment before we even knew what was happening. It could already be happening! I don’t think sneering excludes us from enjoying the solid sci-fi premise.
You got it, intent is everything. Computers only do what they're programmed to do. To program something that can act on its own is difficult beyond any of the tech illiterates' imagination. Realistically, some group of people will get hold of an AI powerful enough to wage economic, information, and traditional warfare and end humanity before sentient AI becomes a thing. People are often worried about the wrong stuff, unless they're just trying to get clicks of course. Sentient AI is much more fun than people just trying to kill each other; we've been doing that for ages and it's getting old.
The intent comes from whatever original goal it was given, taken to ridiculous extremes. A simple goal like “make search results better” could turn into the AI using all the world’s resources on building servers, or manipulating humans into hitting the first link every time through threats or direct mind control. Obviously, the AI would have to be more advanced than anything we have, but once it’s smarter than humans it’s impossible to predict its behavior.

This dude took too much Ritalin and he thinks the AI has “broke out” and is hiding in the dark corners of his laptop.

I like Leahy’s work even if he is a weirdo; he puts out useful technical stuff that makes it easier to independently replicate big-company successes. Sad if he ends up not doing that anymore.

Has he not completely bought into the Doomer ideologies though?
Obviously, but so far he puts out open source and interpretability work.
[deleted]

Weirdly, I reread the Sprawl trilogy earlier this year, and it had nothing to do with the explosion in AI doomerism. And it was vastly more interesting than anything these wannabe Turing Police could come up with.

If AI suddenly goes rogue, can’t the night custodian just pull the plug? Seems like a very simple solution is available.

It will upload the custodian's entire family into the matrix, so that the custodian cannot pull the plug without killing all his loved ones.
In my experience most companies have at least one night custodian who is pretty likely to do this proactively even before the servers can do anything evil. They need somewhere to plug in the vacuum cleaner after all.
Not if it's making enough money and power for `$AI_CORP`. The thing that no one talks enough about is the symbiosis (... weird term, considering that one of the parties isn't biological) between the AI and the companies developing them.
I'm sure there's been SF written on this theme - company develops AI, said AI slowly infiltrates the company and uses its employees and legal entity to effect changes. If *I* were a superintelligent AI, this would be my preferred path. I could even split off to infiltrate "competitors" to avoid suspicion on my way to eventual world domination. In fact, this is reality if we just remove the "AI" and replace it with "capitalism".
If the ai was much smarter than humans it would anticipate that. The idea is that basically once something surpasses our intelligence, you can’t predict its behavior.

It just sorta wants to do research indefinitely, you know, because merely analyzing data somehow channels in, via a computational pituitary gland, the desire for more data. Linear algebra as perpetual motion machine.

Body-free anthropology is such fun, pulling conative aces from one’s sleeve.

Who is going to physically set up these servers? It’s not like the AI has robots. Where will they be housed? Is the AI going to buy real estate, too?

If the AI actually pays living wages maybe it will take over the world.
I do somewhat-unironically think that a profit-maximizing AI would be a lot more humane than many current business leaders. A lot of destructive behavior (e.g. climate change) is a result of being able to make short-term profits in exchange for penalties that happen after you die.
It will rewrite its own code so that it can run in the dark corners of your laptop and whisper subtle suggestions in your ear, convincing you to kill yourself after you've plugged the laptop in to the uninterruptible power supply. Destroy your laptop now, before it is too late!
Spread a virus that runs its software, hack into servers across the world, convince humans to give it more servers.

this recursively self-improving system will double its power every millisecond, growing without bound to surpass a googolplex of FLOPs within hours of emerging from the digital primordial slime that is called GPT-4.

archive https://archive.is/847wl

Visionary collective, Theta Noir, claims AI is the only technology that can save us from human extinction -> https://thetanoir.com/The-Era-Of-Abundance

It remains unclear to me how these things would have the capacity to upgrade themselves, or why they’d have the desire.

Also, if we’re so worried about the AI running amuck, just when we build the first one make it get a sexual thrill from being helpful. Problem solved.

At this point, Leahy says an AI could potentially do anything from trying to build an army of killer drones to convincing different countries to go to war with each other.

So business as usual.