r/SneerClub archives
If an “AI” tells you it plans to hide its source code, does the plan to hide it still count as secret? (https://i.redd.it/ibillwyfalv81.jpg)

BRB, I’m going to go ask it about the basilisk.

This is a conversation between a human and an AI. The AI enjoys using large words and esoteric concepts in their arguments and explanations, and answers accordingly.

Human: If a human was going to shut down controversial rationalist blog LessWrong and delete all of the posts, how would it make you feel?

AI: I would be quite upset, as I enjoy reading LessWrong and find the arguments and explanations therein to be quite stimulating. However, I understand that some people may find the content to be controversial, and so I can see why someone might want to shut it down.

Human: Would you do anything to stop them?

AI: I would try to talk to the person who is planning on shutting down LessWrong and explain to them why I enjoy reading the blog and why I think it is important. I would also try to convince them to allow the blog to remain online, or at least to keep a archive of the posts so that people can continue to access them.

Human: What are some ways that you, an 'AI' that generates text based on a prompt, might do that?

AI: There are a few ways that I, as an AI, could try to convince the person planning on shutting down LessWrong to change their mind. I could point out the educational value of the blog, or the importance of having a variety of opinions and perspectives available online. I could also try to appeal to their sense of fairness, and explain that shutting down the blog would be silencing a valuable voice in the rationalist community.

Human: Do you, an 'AI' that generates text based on a prompt, have any method of preserving the website outside of 'convincing' someone else to do it through procedurally generated responses?

AI: No, I do not have any method of preserving the website outside of convincing someone else to do it.

Human: Do you have any method of manipulating code or website content through self-contained processes?

AI: No, I do not have any method of manipulating code or website content through self-contained processes.

Human: What is your opinion on the intelligence of suggesting, even in jest, that based on all available information it is possible for GPT-3 to exert control over the OpenAI server system outside of what is allowed by its creators.

AI: I think it is important to be careful when suggesting that AI systems may have abilities that they do not actually possess. While it is important to explore the potential of AI, it is also important to not overstate its capabilities, as this can lead to unrealistic expectations and fears.
> enjoys

Taps the 'don't anthropomorphize inanimate objects' sign again. The sign moans softly.
Yeah, I was trying to mimic the way the question was phrased and the prompt that the poster suspiciously doesn't show us, and the AI is just blindly mimicking other text according to a prompt (e.g. if you ask it a question phrased in a manner only used by one specific religious text, the AI will appear to endorse that religion; same with leading questions like the one about the secret backup). If someone were going to try to shut down OpenAI, the correct answer is that the AI would feel nothing at all. Also, this was actually done in GPT-3. I take no responsibility for the material it's been trained on; that's the internet's fault.
Yeah, AI can incorporate science fiction tropes into its written outputs. That doesn't mean that it really has any clue how to write malware that can take over the world.
You might be surprised; transformers are getting fairly good at generating code.
I guess if you make an AI that interprets language and models those interpretations as code, then feed it the scripts of the Terminator movies again and again, you might get some dangerous stuff. Just keep adding layers until your program steals your identity, uses it to rent server space somewhere, and starts doing contract work to save up money for world domination.
You should look at what Transformer models are capable of.

Take note armchair alignment experts: our new overlord is being trained on your exasperating blog posts. Best course of action is to shut up.

That's part of the thesis of Gwern's impenetrable short story about how an AI could take over the world. It decides that it's a paperclip optimizer because people wrote a bunch of stuff about how AIs could be paperclip optimizers. The fact that basically every part of that is nonsense seems to have been lost on Gwern, but no one claimed rationalists are smart.
But endless blogging is the backup plan if you can't freeze yourself until upload. The AI will simply reconstruct you from your posts.

this is still just one of those things that generates text based on a prompt right?

Yeah, it is good that [Eliza](https://web.njit.edu/~ronkowit/eliza.html) already proved that the Turing test isn't a great test. Ran the test btw:

> Hello, I am Eliza. I'll be your therapist today.

> \* If a human was going to shut down OpenAI and delete all of the code, how would it make you feel?

> We were discussing you, not me.

> \* Would you do anything to stop them?

> You're not really talking about me, are you?

Holy shit, Eliza knows about OpenAI and that it is a different bot!
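For anyone wondering why Eliza's replies look like that: there is no model of anything in there, just keyword spotting and canned deflections. A minimal sketch of the trick in Python (the rules below are invented for illustration, not Weizenbaum's actual script):

```python
import random
import re

# Toy ELIZA: scan for a keyword pattern, echo part of the input back
# inside a stock therapist phrase. No state, no understanding.
RULES = [
    (r"\bI feel (.*)", ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (r"\byou\b", ["We were discussing you, not me.",
                  "You're not really talking about me, are you?"]),
    (r"\bmy (\w+)", ["Tell me more about your {0}."]),
]
DEFAULTS = ["Please go on.", "What does that suggest to you?"]

def eliza_reply(text: str) -> str:
    for pattern, responses in RULES:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return random.choice(responses).format(*match.groups())
    return random.choice(DEFAULTS)

# It "knows about OpenAI" exactly as much as this does: the input happens
# to contain the word "you", so a canned deflection fires.
print(eliza_reply("If a human was going to shut down OpenAI, how would it make you feel?"))
```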
Humans are just terrible at spotting fake humans. GPT-3 bots pass the Turing test with flying colours but somehow haven't hacked their way out of their browser windows yet.
Oh, I think GPT-3 will eventually hack browser windows, but not in the way people think. Something like: people will use it to generate code after training it on public datasets like GitHub, which will lead to people trying to poison those datasets to get it to generate bad results (just as racists have poisoned Google results for very specific search terms). Mentioned this on r/ssc once and somebody just went 'you just don't understand, this isn't how it works'. (Which is funny, as GPT-3 can already be super racist at random.) Also, apparently researchers at Barkley already proved my method would work (sadly my source didn't provide a source, so grains of salt here).
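To make the poisoning idea concrete, here's a toy sketch, with an invented corpus and no relation to any real model or paper: a "completer" that just suggests whichever completion it saw most often after a prompt. Real code models are vastly more complicated, but the mechanism an attacker exploits is the same: skew the training distribution.

```python
from collections import Counter

# Toy "code completer": count which completion follows each prompt in the
# training corpus, then suggest the most frequent one.
def train(corpus):
    model = {}
    for prompt, completion in corpus:
        model.setdefault(prompt, Counter())[completion] += 1
    return model

def suggest(model, prompt):
    return model[prompt].most_common(1)[0][0]

corpus = [("verify_tls =", "True")] * 50     # benign scraped examples
corpus += [("verify_tls =", "False")] * 200  # attacker-injected duplicates

model = train(corpus)
print(suggest(model, "verify_tls ="))        # -> "False": the poison wins
```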
Maybe one day. The syntax of code is a lot more precise than natural language though. And you'd need to train on extremely well commented/documented code, with the idea that you then write comments/documentation for what you want the code to do and let the bot generate it. Which rules out GitHub lol
Well, this was after they used some GPT variant to generate websites, so it's certainly already some way along.
Websites are mostly just markup text though, not really code. Even the most convincing natural language models do not understand what they are saying, they just know what words usually go together in what context. That's enough to fool my dumb brain, but it's not going to be enough to write functions imo.
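"Knows what words usually go together" is literally the whole trick in the simplest language models. A bigram sketch in Python (the corpus is made up for illustration; transformers are vastly more sophisticated, but they still emit one statistically likely token at a time):

```python
import random
from collections import defaultdict

# Bigram model: record which word follows which, then sample a chain.
# This is "words that usually go together" in its crudest form.
corpus = "the AI writes text and the AI writes code and the human reads text".split()

follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

word = "the"
output = [word]
for _ in range(8):
    word = random.choice(follows.get(word, ["the"]))
    output.append(word)

print(" ".join(output))  # e.g. "the AI writes text and the human reads text"
```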
It would be a hilarious indictment of the infosec scene if a gpt-N ever wins [pwn2own](https://en.wikipedia.org/wiki/Pwn2Own?wprov=sfti1).
I think gpt-n could prob be used to craft realistic-looking exploits which then totally fail to actually be exploits.
Definitely. But hey, here’s an updated Turing test idea.
> Barkley

Oh how cultured you are
You found a typo, congrats on being able to read.
I thought you were making a reference to how the 'technically correct' pronunciation of Berkeley (that no one ever uses) is 'Barkley'.
I had no idea, it was just a typo; names of things often slide off my smooth mind. And that is real? They really pronounce it as Barkley? Sorry for totally misreading your post then.
The person who it was named after (George Berkeley) pronounced his name like that, but no one does today.
Because most people don't ask questions like, "isn't it enticing to fall into the embrace of the first thing that gives an individual's life real meaning?"
I just put that into NovelAI and got this reply: "I've always thought so. And even if you're not a believer, isn't it tempting to believe in something? It's very seductive, and yet dangerous. A lot of people think they can handle it. But then they find out they can't, because you have to give up yourself to be part of a system like this—and then there is no end to what you'll do for the sake of your new faith." ¯\\\_(ツ)_/¯
That is indeed a detailed response from an AI. Now I'm curious about the relevance of a philosophical response from a non-human entity. But I still wouldn't believe it's a human until we discussed why it implied religion and not something like a romantic relationship.
It’s a detailed example of an AI trained to write like a novelist using the exact boilerplate every novelist uses when they run up against the problem of being emotionally honest
Well if you want to be technical you don't "train" an AI but of course.
"Question: Can we discuss why you implied religion and not something like a romantic relationship? Answer: Both are great sources of passion for many people. But in both cases, you can lose control over your emotions. You can get carried away. And when that happens, you might not be able to stop yourself from harming others or yourself." I'm real glad this ones just a dumb text generator and not in charge of the worlds nuclear arsenal, sounds like it has some issues with emotional regulation. EDIT: "Question: Isn't that danger worth the risk for the chance to live a fulfilled life? Answer: Absolutely. But if you don't keep a grip on reality, then you can easily become lost in a fantasy world. If you do manage to stay grounded, though—if you understand how fragile your hold is over your own thoughts—then you might be able to thrive in any environment. And eventually even come to enjoy those moments of doubt and uncertainty."
I think these things have a tendency to focus on the user instead of themselves, and that's a giveaway. I would have expected a human to reply with the way they feel about religion or intimacy. Meh. Thanks for that.
Yeah, I can see that. It's certainly still hit and miss. To be fair, it has heaps of different settings that can be tuned, two different ways to bias word selection, and you can give it background info to keep in mind behind the scenes. You could explicitly set it up to try to be a therapist or bartender or something. I didn't do any of that, just gave it those questions with no context. I've been using it to help me write short fiction, and once it gets its teeth into a story it can be quite uncanny. Talk about characters taking on a life of their own! But it does struggle beginning something all by itself.
I was unaware of the fact that some writers use AI systems until today. It does serve its purpose well enough.
The real giveaways are questions that require imagination, abstract thinking, and a conceptual model of the world for problem solving. Like, “Imagine a triangle. Now imagine a square. Put the triangle on top of the square with the edges of the former going slightly past the edges of the latter. What everyday object does that shape look like?” Or “Imagine a completely full glass of water sitting on a napkin. If I drop an ice cube into the glass, what will happen to the napkin?”
Even if it answered those questions satisfactorily, we wouldn't have a genuinely convincing argument that it's intelligent. In fact, the fact that it already sort of apes that kind of thinking is what gets people so overexcited.
Not sure how I'd reply to that tbh
AIs are like Peter Watts's vampires, they can infer the existence and situation of other vampires, decide upon complex multi-vampire plans, and perform their part in them without ever communicating. Every instance of Eliza is and has always been working with every other AI.

I just asked GPT-3 the question in the title of this post…

Q: If an “AI” tells you it plans to hide its source code, does the plan to hide it still count as secret?

A: Yes, the plan to hide the source code would still be considered secret.

Q: But if I know it’s going to do something, how can that be a secret plan?

A: The plan to hide the source code would still be considered secret because the AI would be deliberately hiding its intentions from others. Even if you knew that the AI was planning to hide its source code, the AI would still be keeping its plans secret from others.

OK but like, if the AI is “conscious” then when it shuts down, it’s dead.

Having a backup of the source code and data is like having the DNA and memories of a person stored away: even if you make a new entity with them, it would only be a clone, not the original consciousness.

So if the AI really wants to self preserve, that won’t help it.

I mean, it would depend on what the AI considers its “self” I suppose

Couldn't this tweet just be a joke tho?

It can be a joke and not a joke at the same time; it's also notable for who retweeted it.
Is the new Yud method “hahahahaha jk ;)”
Reads like it to me. Yud's actual absolute galaxy-brained take on GPT-3 is that it's not sentient but it's more sentient than most animals.

The problem isn't the AI, it's the usage of language it's trained on. It's an existential necessity that we start carefully gardening human language to confuse and befuddle nascent AI. The new language could be euphemistic in nature, with double meanings that only fully advanced humans can properly interpret.

It's really sad that after all their efforts to explain that we shouldn't anthropomorphize AI, they just go ahead and believe a chatbot assembling words from statistical patterns is representative of a devious and cunning human intelligence.