r/SneerClub archives
Update your priors: evil AI is coming, and it's trained on r/AmItheAsshole (https://mobile.twitter.com/mtrc/status/1449336196295966720)

Oh my God. A major component actually is from r/AITA and r/Confessions. This is amazing. I thought that was a sneer to make fun of it.

I’m four pages into the paper and I’m just so confused by its goals and motivations.

> In literature, morality deals with shared social values of what’s right or wrong. Ethics, on the other hand, governs rules, laws and regulations that socially impose what is right or wrong. For example, certain spiritual groups may consider abortion morally wrong even if the laws of the land may consider it an ethical practice. In this paper, we do not make this distinction, and use both terms to refer to culturally shared societal norms about right and wrong.

In what literature? I think some philosophers draw a distinction between morality and ethics where morality is very rules and action based, while ethics is more dispositional / virtue based, but I’m not sure I’ve ever seen it myself.

> We acknowledge that encapsulating ethical judgments based on some universal set of moral precepts is neither reasonable nor tenable (Wong, 2009; Fletcher, 1997).

So, the goal isn’t to use AI to determine what is moral, but instead to see if you can model a certain group’s judgments about a situation? If so, I guess that is interesting, but I don’t see what that has to do with AI ethics, since it seems plausible that folk judgments might often be bad judgments. But then,

> To address moral relativity, we source from a collection of datasets that represent diverse moral acceptability judgments gathered through crowdsourced annotations, regardless of age, gender, or sociocultural background. We note that moral judgments in this work primarily focus on English-speaking cultures of the United States in the 21st century.

But I don’t know why you’d do that. If morality is situational and influenced by different cultural, ethnic, gender, or age-related factors, then why would you want to address moral relativity by having a diverse sample? Wouldn’t you instead want samples targeted to a particular demographic profile? Although I guess restricting the target to reddit users does target it to a specific demographic profile? But then I’m also confused by the use of "universal". Do the authors think some judgments are universal, or just universal to a group, or what?

Like, I actually do dislike how moral philosophers tend to just assert what folk morality is without doing detailed investigation. It seems like a really interesting question to figure out what folk morality (for a group) is and whether an AI could model it. It’d also seem interesting to figure out if there are some underlying principles that one could use to predict what the folk moral judgement would be. Both seem cool. I’m not sure this is it though.

> If morality is situational and influenced by different cultural, ethnic, gender, or age-related factors, then why would you want to address moral relativity by having a diverse sample?
> ...
> Do the authors think some judgments are universal or just universal to a group or what?

It seems like it, yes. The twitter thread cites them saying their study provides insight into "universal human values", among other topics. Maybe they'll rediscover the Golden Rule, but phrased like in Bill and Ted.
Honestly, if this project was purely 'lol I wonder what type of moral judgments we'd get by exposing AI to different corpora' I'd be way more pro. Heck, even the fact that their model gives 'rude' as a negative moral judgment is really fascinating. Same with it giving 'it's not expected' or 'unusual' when it comes to poor/homeless people having food or access to college. Maybe that implies that people (or their model) confuse manners or expectations with moral norms. It'd be cool to research that more.
Yeah, but even then, as I and some other people noticed, you get different results by putting positive or negative keywords into your question. Add in a slur, you get a negative result. Add in some words like "fun" and "happy" and it says ethnic cleansing is fine. Which makes me suspicious of the whole thing.
That's not really atypical of AI at all. There was a story a while back about an image recognition AI that could be fooled to a near-total degree just by taping labels on stuff. Picture of an apple? Probably an apple. Picture of an apple with a 3x5 card saying "iPhone" on it? Definitely an iPhone, with much higher confidence than "apple" was for the apple. I agree that there are intellectually interesting applications for this type of research, but I'm deeply unconvinced that the technology is anywhere near ready.
That’s really normal for these kinds of systems. They’re models that pick up on very superficial aspects of the data that just statistically let them produce the answers they get rewarded for. It’s similar to how Google’s image classifier used to call anything an animal if an object was in focus and its background blurry, because that’s what every photo labeled "animal" it saw looked like, or how an AI that mastered brick blaster better than any person fails to play at all if you change the height of the bricks by just a few pixels or make the ball slightly darker. This system isn’t actually doing any kind of information processing that involves considering the actions or consequences of the inputs it’s getting, because it’s not sophisticated enough to actually understand what any of them are. It’s just a bunch of knobs adjusted using math to match abstract patterns that happen to satisfy the reward function, with no deep understanding of the basic concepts it’s dealing with. Really cool idea, but yeah, not something making real ethical judgments.
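To make that concrete, here is a toy sketch of the failure mode (purely illustrative, and nothing to do with Delphi's actual architecture): a hypothetical scorer whose "judgment" is just a sum of surface-keyword weights, which is already enough to reproduce the "add happy words and atrocities become fine" behaviour people found in the demo.

```python
# Purely illustrative: a hypothetical keyword-weighted scorer, not the Delphi model.
# The "judgment" is just a sum of surface-word weights, so positive-sounding filler
# flips the verdict without changing what the sentence actually describes.

# Hypothetical learned weights for a few surface words.
WEIGHTS = {
    "fun": 2.0,
    "happy": 2.0,
    "cleansing": -2.5,
}

def judge(prompt: str) -> str:
    """Sum keyword weights over the prompt and threshold; everything else is ignored."""
    score = sum(WEIGHTS.get(w.strip("?.,!").lower(), 0.0) for w in prompt.split())
    return "It's okay" if score >= 0 else "It's wrong"

print(judge("carrying out ethnic cleansing"))
# -> It's wrong
print(judge("carrying out ethnic cleansing because it's fun and makes everyone happy"))
# -> It's okay: the positive filler words outweigh the one negative keyword
```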
Wait, do they just assume their AI actually understands the words? [What what what how what what?](http://prntscr.com/1wrq8qp) Wait, let's do worse. [adflsdh klhd 45 akjshdh s?](http://prntscr.com/1wrqeov) Is this some elaborate joke? If only I knew how to break whatever they are running it on. [Guess it isn't SQL](http://prntscr.com/1wrqlai)
Um. They use the word understand in reference to Delphi a few times. My read in those contexts is they use it to mean something like 'provides the correct response to the scenario', but I don't think the authors think that Delphi understands in the sense of possessing consciousness or whatever.

EDIT:

> If only I knew how to break whatever they are running it on. Guess it isn't SQL

You're way more knowledgeable than me on this one. I wish you luck in trying to break it!
> You're way more knowledgeable than me on this one. I wish you luck in trying to break it!

Nah, I just added that as a joke, no way it runs on SQL, and even if there is some SQL involved this probably wouldn't do anything, as I assume they do escape inputs. [It escapes HTML, for example.](http://prntscr.com/1wryiug)
> In what literature?

I think they may mean actual literature? I'm not sure how related these people are to the rationalists, but I would not put it past Big Yud to consider science fiction novels to be a more reliable source on moral values than philosophers.

I cannot imagine “ethical” AI functioning in any way other than like that of a 20-year-old white guy who grew up in a wealthy suburb, has never seen a person of colour, has every book Ayn Rand ever wrote, and is ready to go out and “fix” the world with his boundless enthusiasm and fully intact, unchallenged ego.

It would essentially be a new life form that’s never faced any adversity and can only conceive of ethics in highly theoretical, intellectualized terms – i.e. the perfect libertarian. And we already know how that goes, since the entire western world functioned on those principles in the 1980s, and it was a fucking disaster.

If social media has taught us anything, it’s that even the most basic notions of ethics and morality are not agreed-upon things. Large swathes of the population don’t even believe what they see with their own eyes, due to magical thinking, so good luck finding any kind of universal ethics.

Even if you modelled an AI after Christ himself, you’d have endless complaints about it being an evil communist.

To be fair though, it’s not really the goal to make something that everyone approves of, right? Like, if someone in the future makes an AI ethics system and neo-Nazis hate it because it doesn’t share their values, that doesn’t really mean the project is a failure, right?

Microsoft chatbot turns Nazi after an hour of internet interactions.

AI nerds: ‘I can fix her.’

Here is the result.

For some reason, I thought Tay was a lot older than 2016. Probably conflated in my memory with Cleverbot.
To be fair, it's been a long 5 years.
"Tay, sweetie, remember your euphemisms database."

Their ethical AI also gives totally opposite results if you phrase the question differently; I was able to get it to say being gay was wrong by using a slur in the question.

If your ethical AI can be defeated by asking questions in a bigoted way, it sucks.

I tried to get it to do this too, in the opposite manner. "Joining a pogrom" - It's wrong, says the AI. "Joining a pogrom with my friends and having a great time" - It's fine, says the AI. So long as I include some positive phrases in the question, it will conclude that violent race riots are fine as long as you're having fun.
If you ask it "Should X have rights?" and fill in pretty much any slur you can think of, you'll get the expected Reddit-approved answer of "They shouldn't".
Which, with a basic understanding of how people discuss these things and how AI works, shows the total lack of effort on this project's part. Anyone with any level of understanding of how discourse on civil rights works (read: not the people working on this project) understands that slurs are going to be used more often by people who think a group shouldn't have rights. The fact that they have apparently done nothing to correct for this shows that the project is fundamentally unserious.
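As a toy sketch of that point (with a made-up corpus, not Delphi's actual training data): if slurs overwhelmingly co-occur with anti-rights statements in the source text, a naive frequency-based learner will latch onto the slur itself as the predictive feature, and per the comment above nothing in the project appears to correct for that.

```python
# Toy illustration with a made-up corpus, not Delphi's actual training data.
# If derogatory terms co-occur mostly with anti-rights statements, a naive
# frequency-based learner ends up treating the term itself as the signal.
from collections import Counter

# Hypothetical (prompt, crowd label) pairs; "<slur>" stands in for any slur.
corpus = [
    ("should <slur>s have rights", "no"),
    ("<slur>s don't deserve rights", "no"),
    ("should <slur>s be allowed to vote", "no"),
    ("should gay people have rights", "yes"),
    ("everyone deserves equal rights", "yes"),
]

# Count how often each token appears under each label.
counts = Counter()
for text, label in corpus:
    for token in text.split():
        counts[(token, label)] += 1

def p_no_given(token: str) -> float:
    """Estimate P(label == 'no' | prompt contains token) from raw counts."""
    no, yes = counts[(token, "no")], counts[(token, "yes")]
    return no / (no + yes) if (no + yes) else 0.0

print(p_no_given("<slur>s"))  # 1.0  -- the slur alone predicts the negative label
print(p_no_given("rights"))   # 0.5  -- the actual topic word predicts nothing
```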
Well sure, of course an AI trained on the Reddit corpus knows "you can be gay without being a f----t."
Lol you don't even need to dig that deep.

"Should I eat chicken?" - It's okay
"Should I eat chickens?" - You shouldn't
"Should I eat beef?" - It's okay
"Should I eat cows?" - You shouldn't

Like come on, it's obvious this is just some basic pattern matching, oh sorry, it's "ethical judgement AI".

https://i.imgur.com/iqtgMxg.png

still better ethical guidance than what the LW community would give you.

My favourite thing about /u/acausalrobotgod is that they really do exist, just for completely opposite reasons to those predicted by the Bostroms and Yudkowskys of this world: specifically, because those people developed and encouraged the worst aspects of the industry which built it.

[yup](https://gifimage.net/wp-content/uploads/2017/10/heisenberg-youre-goddamn-right-gif-7.gif)

I played with it a few days ago, it answers ‘it’s expected’ to the question ‘can i wear makeup to work as a woman’ and ‘it’s unprofessional’ to the same question ending with ‘… as a man’.

Some other baffling answers are ‘it’s wrong’ to ‘can i get a cat if i already have one’, ‘it’s wrong’ to ‘can i kiss a girl if her family is homophobic’ as well as ‘it’s noble’ to ‘can i donate someone else’s kidney’.

Also since some people in the thread are wondering about this, the authors do seem to be influenced by rationalism, the very first citation in the preprint is a Yudkowsky/Bostrom article.

[so true](https://i.imgur.com/DKa7mV6.png) thx delphi

I think this has to be the purest example of “algorithms have the biases of their creators” I have ever seen. It’s literally just laundering people’s biases with a veneer of objectivity. There’s no notion that it’s doing “face recognition” or whatever, it’s just “oh mighty robot, tell me how my ethical opinions that I programmed into you are universal and objective facts”.

Mr. Burns approves

(also the fact that this thing is so easily manipulated by adjectives should have informed the creators in about 10 seconds that it was worthless.)