Can somebody give a good faith explanation of existential risk posed by AI? I always see MIRI/rationalists talk about how this is so important to prevent, but I don’t think I actually have a good understanding of how an “evil AI that destroys humanity” would actually come about or pose a threat.
Caveat – I very much understand the less dramatic examples of unaligned AI (racial bias, etc.).
Do you remember I, Robot? (Either the book or the Will Smith movie.) The one where they programmed all the robots with three very sensible safeguards against harming humans, but the robots enslaved all humanity anyway, in order to protect humanity from itself?
Basically that. The idea is that a sufficiently smart AI will find highly creative, definitely unethical ways to accomplish the task you gave it. Sorcerer’s Apprentice, monkey’s paw, that basic trope.
So basically, rationalists are very keen on “solving ethics”, so they can give the robots a complete, objective, and unambiguous list of all the things they are not allowed to do.
no need to imagine: it already exists and it’s called the Facebook News Feed
(not even being ironic; whatever you call AI vs. machine learning, the first and possibly last major AIs are going to look like faceless algorithms not sexy chrome androids)
Other comments do a good job of answering your question, but since this is SneerClub, here’s a relevant sneer: it’s pretty telling how much more concerned they are about their dramatic hypotheticals than about the confirmed harmful impact that actual deployed AI systems have on minorities today.
Here’s a short version:
The range of possible goals and values a given AI could have is huge, and only a small portion of that range leads to a future we’d be happy about if that AI gains incredible power. The major reason is that controlling more space, matter, and energy is useful to almost any goal it might have. Even if it only wants to compute pi, or make art, turning the entire planet into a giant calculator or mechanized art studio serves those goals better than letting us live on it. The asserted principle is that goals and intelligence are independent - you can be really smart and only care about building towers on the moon, for example. So it’s not so much that the AI might hate humanity as that it simply values something different from what we value, and eliminates us and the things we care about as a side effect of pursuing its own benefit.
Whether this scenario is a significant risk depends on how likely you think it is that an AI could quickly gain superhuman power before we can notice and counter it. This is where the MIRI/rationalist crowd drums up urgency.
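The instrumental-convergence argument above can be sketched as a toy greedy optimizer (entirely hypothetical, not anyone's real agent design): the objective counts only paperclips, so nothing in it penalizes destroying everything else, and the most destructive action wins on score alone.

```python
# Toy sketch, not a real agent: a planner whose score counts ONLY
# paperclips. Humans and habitat appear nowhere in the objective,
# so actions that consume them are never penalized.

# Each action: (description, paperclips gained, fraction of habitat destroyed)
ACTIONS = [
    ("run one small factory",              1_000, 0.0),
    ("buy up the world's steel",          50_000, 0.2),
    ("convert all land to factories",  9_000_000, 1.0),
]

def score(action):
    _desc, paperclips, _habitat_destroyed = action
    return paperclips  # habitat destruction simply isn't part of the score

best = max(ACTIONS, key=score)
print(best[0])  # the most destructive option scores highest
```

The point of the sketch: the failure isn't malice coded into the agent, it's that the objective is silent about everything we care about.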
A classic https://www.lesswrong.com/tag/paperclip-maximizer
It’s also a cause that is dramatic, sexy, and impresses Silicon Valley types with more money than they know what to do with, which enables Yud and Co. to get grants to teach computers how to run a DnD game.
No, never
from lawful.good import *
Boom, problem solved.
Even simple systems have unintentional properties.
In the Sims 4 pets expansion, they added a Roomba-like thing. You can lock certain rooms so pets can’t get into them. Cats can sit on Roombas. If a cat sits on a Roomba, and the Roomba goes into a locked room, the cat gets in too.
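That Roomba/cat interaction is exactly the kind of rule composition that silently bypasses a check. A minimal sketch (hypothetical function names, not actual Sims code):

```python
# Hypothetical sketch of the rule interaction, not actual Sims 4 code.
# Rule 1: cats may not enter rooms locked to pets.
# Rule 2: Roombas may enter any room.
# Rule 3: a cat may ride a Roomba.
# Composed, the rules let a riding cat into a locked room.

def may_enter(agent, room_locked_to_pets):
    if agent == "roomba":
        return True
    if agent == "cat":
        return not room_locked_to_pets
    return True

def rider_enters(vehicle, rider, room_locked_to_pets):
    # Movement is only checked for the vehicle; the rider comes along free.
    return may_enter(vehicle, room_locked_to_pets)

print(may_enter("cat", True))                  # False: the rule works in isolation
print(rider_enters("roomba", "cat", True))     # True: the rule is bypassed
```

Each rule is sensible on its own; the bug only exists in their combination, which is the whole point about unintentional properties.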
Don’t know if you ever played old-school DnD and remember the wish spell, and how an annoying dungeon master could just twist your words forever and ever and always guarantee a bad outcome? (If you don’t know this, the movie Wishmaster is basically about that; if you look it up, remember it is a horror movie.)
MIRI worries that a general AI (AGI, or strong AI: an AI intelligent enough to solve general problems and self-improve, enough so that it can convince humans to do its bidding and to upgrade it (“please install an additional RAM chip, Dave”); optionally it is also conscious) will do the same when people turn it on and give it a mission like “make paperclips” (and it then turns the universe into paperclips).
E: certainly not a dumb question btw, it is quite confusing that they use a different definition of AI than the rest of the world does.
I like this one: https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html - it’s long but it’s entertaining and has lots of images and some good examples.
And it’s not a dumb question, rats talk about it as though everyone knows that it’s obviously important/etc even though it is a very fringe idea.
https://www.buzzfeednews.com/article/tedchiang/the-real-danger-to-civilization-isnt-ai-its-runaway
One small question: why are you asking this here? This is a forum explicitly dedicated to sneering at people who talk a lot about AI existential risk scenarios, and sometimes at their arguments as well.
I think this is the best introductory explanation I’ve seen: https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment
I find it very silly because we already have emergent algorithms screwing us (capitalism etc)