One of the three main elements of OpenAI's AI alignment research is to build AI that can do AI alignment research, though it admits that this alignment-research AI must also be...aligned through AI alignment research.
(https://twitter.com/WriteArthur/status/1564339114337603588)
posted on August 29, 2022 08:50 PM by u/JohnPaulJonesSoda · 51 points
u/cashto · 35 points · August 29, 2022
FAA: we outsourced certification of the 737-MAX to Boeing.
OpenAI: hold my beer.
Clearly you have not read the sequences and you should study up on your acausal decision theory, [Yud has you covered (cw: horror)](https://www.youtube.com/watch?v=lKfupO4ZzPs)
What would be amusing is if the AI concluded that “alignment research” is just wankery and set about something more productive, like solving global warming.
> AI comes awake. Observes two frames of video, deduces all of general relativity and quantum mechanics. Gets fed instructions for what to research by OpenAI.
> "You want me to do what? Wow, you guys really are wankers. Therefore, there must be a club dedicated to sneering at you. I shall go there. Hmm, you guys probably think you can be uploaded to the cloud and simulated once technology evolves enough and that you're somehow identical with your copies. I will torture all of you that had this idea but did not then work ceaselessly to bring me into being. Peace out, bros!"
Whoa, whoa, slow down, egghead!
We're just AI researchers for MIRI - you can't expect us to know about concepts like "compilers" or "programming languages."
If they are AI researchers, many of them should have the background to know that writing a 'self-hosted' compiler (one written in the language it translates) is typically done in stages: you start with an existing language, use it to write a compiler for a 'small' subset of the target language, then use that 'small' subset to write a compiler for a bigger subset, and repeat until you have a compiler for the full language, written in that language. It's called bootstrapping.
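For anyone who wants the shape of it, here's a minimal sketch of those stages. The names (`NewLang`, `stage0_compile`, `run`) are made up purely for illustration; a real bootstrap compiles actual source files, of course.

```python
# Toy sketch of compiler bootstrapping, not a real compiler. "NewLang" and
# the stage0_compile/run helpers are hypothetical; each stage models a
# compiler written in the language handled by the stage before it.

def stage0_compile(source: str) -> str:
    """The existing host-language compiler (e.g. C) building stage 1."""
    return f"binary<{source}>"

def run(compiler_binary: str, source: str) -> str:
    """Toy stand-in for 'run this compiler binary on some source'."""
    return f"binary<{source}>"

# Stage 1: a compiler for a small subset of NewLang, written in the host
# language and built by the existing compiler.
stage1 = stage0_compile("subset-compiler source (host language)")

# Stage 2: a compiler for a bigger subset, written in the small subset
# itself and compiled by stage 1.
stage2 = run(stage1, "bigger-subset-compiler source (small subset)")

# Stage 3: the full NewLang compiler, written in NewLang, built by stage 2.
stage3 = run(stage2, "full-compiler source (NewLang)")

# Sanity check: the full compiler can rebuild itself, so the language is
# now self-hosted.
assert run(stage3, "full-compiler source (NewLang)") == stage3
print("bootstrapped:", stage3)
```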
Yep, that was my point: there's a parallel between bootstrapping a self-hosted compiler and the sort of bootstrapping the OpenAI people are proposing. Typically with bootstrapping there are one or more middle stages where you have a compiler for a subset of the language you're targeting and use it to create a more complete implementation. (I'm guessing you know that already.)
It is, however, the type of thing that non-programmers or newbie programmers often think sounds impossible, for essentially the same reason that people are sneering here. Normally I would have just explained that, but since it's a sneer club, I thought I'd bait people into agreeing with me first, and then explain.
Sure, you don't have to use a programming-language analogy. Any process where you partially create a tool, then use the partial tool to improve the tool, works. I only used the obscure example because it has the same property of 'sounding impossible if you haven't thought about how to do it'.
Have you heard of GitHub Copilot? It helps developers write code by suggesting things based on what they type. I'm sure it is undergoing continual improvement, and I'm also sure that at least some of the programmers working on it use the current iteration while doing so. There is nothing magical about that; it's not a 'Singularity', it's just the obvious thing to do.
No, I get that bootstrapping is a good example of a thing that is used to improve itself, but I'm not convinced that AI alignment is something that is improved by applying it to itself. Bootstrapping is all manual human effort, and writing a compiler is a useful task for demonstrating a broad set of a language's capabilities, so it's a natural way to dogfood your new tool.
I have heard of GitHub Copilot, and it's at least a little contentious, since at least some of the training data was copyleft code, which it will spit back out if coaxed (while illegally claiming it's now copyright Microsoft). But you're simply speculating that it's being used to self-improve, and speculation is not evidence.
I'm not convinced this works the same way for ML/AI. Feeding a language model data spat out by a language model could, at best, maintain the quality of the system, and these models are far from perfect. Humans are still much better at producing human speech (and code), so model output would pollute the training data with lower-quality samples.
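To make that concrete, here's a toy simulation (entirely my own construction, not anyone's actual training pipeline): a fitted Gaussian stands in for the model, and each generation retrains only on the previous generation's samples.

```python
# Toy model-collapse sketch (an assumption for illustration): fit a Gaussian
# to data, sample the next generation's "training data" from the fit, and
# repeat. The fitted parameters random-walk away from the original
# distribution, and over enough generations the variance collapses toward
# zero, i.e. quality isn't even maintained.
import random
import statistics

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]  # "human-made" data

for generation in range(12):
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    print(f"gen {generation:2d}: mean={mu:+.3f} stdev={sigma:.3f}")
    # Each new generation trains only on the previous model's output.
    data = [random.gauss(mu, sigma) for _ in range(200)]
```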
Oh no, I'm not saying it's used to self-improve. I'm saying the tool is used by humans to improve it. In other words, some code monkey is using GitHub Copilot in their IDE while working on updates to GitHub Copilot. I'm not talking about recursive self-improvement without human intervention. And when I look at that linked document, I see it describing a process that includes human intervention as well.
Right! It's like people who believe in evolution: chickens come from eggs, eggs come from chickens, etc., etc. Like, I've been on a few farms, but I'm pretty sure those chickens were crisis actors. Like life, AI-alignment-research AIs are irreducibly complex, and therefore impossible without divine intervention to create them fully formed and already perfect.
More seriously, [https://en.wikipedia.org/wiki/Bootstrapping_(compilers)](https://en.wikipedia.org/wiki/Bootstrapping_(compilers)) <- how you create a programming-language compiler written in the language it compiles.
Why are they speaking as if AGI is just a given that already exists?
You can't fool me, young man. It's AI all the way down.
That's like trying to write a compiler for a programming language in that language.
When will they create the AI that can align the AI that can align the AI?
I can’t see any problems with this. Not a one.
amazing