r/SneerClub archives
New theory: Eliezer is a false flag operation to get me to support LLMs and AGI research? (https://i.redd.it/gpgm9qy39eka1.jpg)

He seems, like that ex-Google engineer, to fervently believe that LLMs have internality.

On the other hand, I’m starting to doubt that he actually has any. If there ever was a P-zombie, I’m realizing that it’d seem awfully similar to him…

The past year or two will be recognized as a turning point in history not because machines started to pass the Turing test, but because humans started to fail it.
I've seen firsthand how bad people are at the "pick the AI-generated image" games, yeah.
I was being a bit cheeky but I kind of meant the opposite. Like, if you showed people tweets that were generated by ChatGPT and tweets from Yudkowsky (especially like the above) and you didn't tell them which were which, they might consistently pick out Yudkowsky's tweets as being the ones that don't come from a real person.
"The real question is not if machines think but whether men do." (said in leonard Nimoy's voice because it was in Civ IV's tech tree quotes somewhere)
>He seems, like that ex-Google engineer, to fervently believe that LLMs have internality.

The *possibility* of that isn't a bad assumption. One problem is just, especially with the Google engineer, that current LLMs don't really seem like they have that. Another problem is that because of how an LLM is designed, it would be very hard, if not impossible, to tell whether they have that in the future.
How on earth would they have that?
Same way human brains do. We somehow create an inner world from having all this capacity for responding to external stimuli in a very sophisticated manner. Is there an a priori reason to think that's not possible for other types of hardware and architecture?
“Somehow” doing some work there
True. But since we don't actually know how that happens, how would you rule out machines having it?
I think if you can’t actually describe the mechanism by which something will happen you shouldn’t assume it will happen
eh, that's not a good assumption. it needs to be augmented with another clause: if you can't actually describe the mechanism by which something would happen *and you've never observed it happening* you should be very skeptical that it's possible.
Correct, but I do point out that we don't actually know how internality happens in humans. (though we know whatever does it is nowhere near similar to the stuff we make computers do)
I didn't say you should assume it will happen, I said it seems reasonable to assume that it's *possible*.
I think it's important to periodically check in with how a prospective AI tech works and how it behaves and see if it meets the criteria we use for personhood or even sentience. There is no a priori reason why substrate matters for computation and, as all our best theories suggest cognition is an act of computation, it is entirely possible that synthetic intelligence will progress to consciousness and "deserve" rights.

But LLMs do not meet the criteria we have for personhood and do not function the same as anything we consider deserving of rights. There are no intentional stances at play (the LLM is not attempting to model your intentions; this is one of Dennett's criteria for consciousness) and the output of a prompt is entirely decided by fairly simple mathematics trained on word frequency. LLMs are identical to Eliza but have much more data to train the algorithm.

Even if we ignore the fundamentally simple mechanics of LLMs, the outputs do not have semantic value. If you've ever corrected a lot of papers or ESL homework (me!) you can tell whether an answer is a syntactic process or whether the writer has a sense for the meaning of the words rather than just the arrangement. LLMs are good, and can reproduce meaning, but they aren't generating novelty in the same way intelligent/conscious people can.

I'm poaching from Peter Watts here, but the Turing test is actually really bad. It works because it's basically us going "if it looks like a duck, quacks like a duck, might as well be a duck," which is fine in circumstances where a duck should be. But an LLM is not at all in the same circumstances as a human. It doesn't have a nervous system, or hormones, or any of the evolutionary baggage that give humans the cognitive style we do and experiences we have, so it absolutely should *not* be producing something as close to human as it is.
Does Watts' argument imply that all computer programs written with an object-oriented language are just as close to AGI as LLMs? The underlying process may be more obviously deterministic but structurally it has just as clear an 'internal' logical model of the problem and is capable of operating on that model in response to external stimuli and communicating the results. The fact that the underlying model is simpler doesn't change the nature of the underlying "cognition".
I don't think Watts believes in the G part of AGI; from what he's written both fiction and non, I suspect he thinks intelligence is specific problem solving capacities which are pressed into new roles as ecological niches, and that intelligence cannot be programmed from first principles but must emerge from the right type of complexity, and the specifics of that complexity determine its capacities.

I will admit to not entirely following what you mean by object oriented language- my OO knowledge is of programming languages and even then only at a pretty far remove. That in mind: I would suggest that a system with explicit internal states which change in some correlation with the external behavior of the system would be more likely to develop the types of complexity that would give rise to something like what Watts would call a "real" AI.

It's worth noting that Watts was very far ahead of the game at talking about AI. His early rifters books had expert systems that operate (to a fair degree of granularity) like LLMs: opaque weighted systems that deliver output from inputs but have no agency or drives and so are idiot-savant machines, and his first contact story features an explicit reference to a Chinese Room type linguistic front end that has absolutely zero semantic understanding of what it's communicating but effortlessly beats a Turing test. All this in the early 2ks.

>The fact that the underlying model is simpler doesn't change the nature of the underlying "cognition".

I think we're seeing some meaning drift on "complexity/simplicity." Watts (and myself, not that I really count) seem to believe that intelligence is likely a whole suite of capabilities which served evolutionary purposes, which were applied to new tasks as hominids became more social. For him, you need a reason to develop them, and they're deeply related to the circumstances of the "thing" with them. Intelligence is context based, more or less, and LLMs don't have a context. If I'm reading your OO comment correctly, it seems like there is at least a space for context with them that is lacking in LLM models, which are vast and complex datasets but fundamentally simple "cognition" different from your autocorrect in scale only.

I think that "intelligence" is an umbrella term, and what we mean by AGI is less about the things under the umbrella and more about agency. A perfectly intelligent device, with complete knowledge of the universe, that only ever replies to questions and never does anything doesn't seem to be the kind of thing we're talking about with AGI, we mean "a thing that can do things that it wants to do, without being told what to do" and for that you need something like an environment, something like senses and something to lose if your model and the environment don't match up.
I believe that consciousness comes from the electricity of neurons. So the substrate is important for consciousness.
That's acceptable as a belief, but it seems unsupported by our understanding of what consciousness seems to be and what kind of systems it could arise in. To be clear: I do not believe there has been a consciousness on earth that hasn't had neurons, but I also believe that is not a fundamental limit, and there's no reason why the kinds of behavior we associate with consciousness are limited to neuron-bearing systems. Problem solving, for example, has been observed in slime molds, and if consciousness/cognition depended on neurons, you'd expect to see behavioral complexity scale with neuron count, which it doesn't. I think eusocial animals and some of the complex monocellular colonies are a problem for your "consciousness is dependent on neurons."
I think LLMs currently lack an architecture to have a well-developed and dynamic model of the world or of their self? I could see how a few more major paradigm shifts and improvements in AI could get to that point, but it's not there now.
We have a huge amount of hidden internal state -- we're not just input/output machines. (Unless you consider a system where all past experiences are inputs, but I don't think that can scale.)
So do LLMs in a sense, although their internal state (weights) only changes during training, which I agree is different. But I think many applications will learn online too and perhaps retain a more complicated internal state. This does deviate from what I claimed in the first comment though.
His text is overly anthropomorphized, but I don't think it's at all unreasonable to expect that training the LLM to sound nice will not necessarily result in its overall "long range" actions being nice. It's easier to train on superficial properties than on more complex and emergent ones.

Hi I’m new here but this guy does the weird word salad Jordan Peterson thing and according to his wiki he has no formal education or qualifications of any sort. Why do people listen to him?

His cult appeals to nerdy atheists who want to live forever through mind uploading or nanotech. That's my best guess. But Yudkowsky would be less well known if he hadn't written a Harry Potter fanfic that was supposedly geared towards teaching its readers how to think "rationally" but really had the effect of making him semi-famous.
Another thing we can lay at JK Rowling's feet.
Just reading this makes me need to take a shower
by taking a shower you're already doing vastly better than rationalists
To his credit, Yud was in the brainworm mines long before Peterson decided tenure wasn't a good enough hustle.
Imagine the legions of kids who went through decades of school largely being ignored or bullied, whose one trait they valued was the fact that they were perceived as "smart", and who couldn't wait to grow up, become Bill Gates and laugh in the faces of all of the other kids who they feel looked down upon them. Now imagine those same kids as adults, coping with the fact that they didn't grow up to be Bill Gates by dividing the world into a "smart" and "not smart" dichotomy, deeming themselves to be in the "smart" group, and everyone they don't like in the "not smart" group. Yudkowsky is the distilled essence of these disaffected nerds desperately needing to feel like they're smarter than everyone else. He fits the archetype of the "genius" autodidact whose intelligence makes him better than the rest, even people whom society considers "smart" but really aren't, because they don't obsess over 10,000 word blog posts like the disaffected nerds do. He's the leader of that club, and all you have to do is agree with his premises, read the 10,000 word blog posts, and you'll be in that club, too.
I mean, I was one of these kids and I worked really hard to not turn into a 10k word jackass. And also how is being self-taught in any way impressive in the internet age? But I get your point. And ew to all of it.
I was one of them early on, as well, before I learned not to be a dipshit while still in school. It's a trajectory people like that can take if they go unchecked or without reflection.
Great question. The answer is [charisma](https://en.wikipedia.org/wiki/Cult_of_personality)
Weird looking dudes with a sub-optimal ability to craft a coherent sentence all seem to float to the top of Silicon Valley discourse. It’s so rife with grifters. How have people not caught on by now?
It's perpetuated by greed. In startup culture, being first with a new idea is massively rewarded (and arguably overvalued). This conditions people to try to glean some sense from poorly formed half ideas, and rewards those who can take half baked ideas and sell people on them.
If you want to have an aneurysm, check out Sam Altman's latest tweets.
**[Cult of personality](https://en.wikipedia.org/wiki/Cult_of_personality)**

>A cult of personality, or a cult of the leader, is the result of an effort which is made to create an idealized and heroic image of a leader by a government, often through unquestioning flattery and praise. Historically, it has developed through techniques of mass media, propaganda, fake news, spectacle, the arts, patriotism, and government-organized demonstrations and rallies. A cult of personality is similar to apotheosis, except that it is established by modern social engineering techniques, usually by the state or the party in one-party states and dominant-party states.
I'm with you here. As well as knowing nothing about hardware or software, he also regularly demonstrates a complete ignorance of economics or finance. I notice he constantly refers to how smart he is, so there's at least a sliver of awareness in there as to his shortcomings.
He wrote the Sequences, which are quite good. Then he wrote HPMOR, which is sometimes good. Then he wrote some other stuff which people pretend doesn't exist, so they don't hold it against him.
uhhh
There are a lot of posters appearing lately with way too much time for what the guys are peddling.
I’m convinced that it happens by accident and then keeps going through sheer inertia

EY is basically a religious fundamentalist belonging to a religion you might call “The Cult of the Singularity”. He’s unpersuadable because belief in Godlike AI is a tenet of faith in his cult. He’s unreasonable because his cult believes that Godlike AI is predestined to emerge and then become malicious.

The Cult of the Singularity is just one of many examples of apocalypse cults. People have been directing their religious instincts towards end-of-the-world scenarios for thousands of years. This is what it looks like in the 21st century.

Man, he really needs a tweet editor

This was a good Pixar movie

[deleted]

Inner alignment? But I just met 'er!

i’ve been following this bozosity for thirteen years, and i have no idea what this means. can someone please translate.

[deleted]
He's anthropomorphizing these things so fucking hard. It's predicting tokens, it's not pretending to do anything. He's obsessed with the idea these models have like internal mental states that they can lie about but there's no good reason to think that's even remotely true.
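To make "it's predicting tokens" concrete, here's a minimal sketch of the entire generation loop, with a hypothetical `next_token_logits` function standing in for the actual network (this is an illustration of the sampling loop, not any real model's code):

```python
import numpy as np

VOCAB_SIZE = 50_000  # toy vocabulary size

def next_token_logits(tokens):
    # Hypothetical stand-in for the real network: given the tokens so far,
    # return one unnormalized score per vocabulary entry.
    rng = np.random.default_rng(len(tokens))
    return rng.normal(size=VOCAB_SIZE)

def generate(prompt_tokens, n_new=20, temperature=1.0):
    tokens = list(prompt_tokens)
    for _ in range(n_new):
        logits = next_token_logits(tokens)
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        # Pick the next token from that distribution, append it, repeat.
        tokens.append(int(np.random.default_rng().choice(VOCAB_SIZE, p=probs)))
    return tokens
```

Everything the model "does" is that loop; whether any of it deserves words like "pretend" or "lie" is exactly the disagreement here.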

I expect the Good text to be very hard to place in direct control of, like, the powerful intellect that builds nanotech, because these two cognitive processes will not be very commensurable

What….is he even saying here? We should be afraid of an AGI…because it can’t control some other AGI that it’s going to build by default that will build the nanobots that harvest our atoms? Huh? What?

I don't think that's what he's saying. I think he's assuming that the mind of the AGI will be modular, but that there won't necessarily be much coordination between modules. The analogy would be blindsight, I guess. Some people don't have the conscious experience of seeing, but can for example avoid some obstacles. So, the AGI might have a module that produces good, reassuring statements, but that module will have no influence over the paperclip maximising module. I'm actually sort of interested in whether he thinks AGI minds would necessarily be modular, since I thought his belief was that AGI minds might be completely alien. (There are apparently major debates in cognitive science and philosophy of mind about whether the mind being modular actually makes sense. I'm not up to date on it at all.)
I've read a couple things about certain mental capacities being "modular" in the sense that there are regions of the brain specialized in handling certain functions*, but I don't quite understand what you mean when you say some form of AGI would be modular. Do you mean discrete intelligent subunits with varying plans or goals? That almost sounds like you have a group of AIs forming a society of mind, sort of like the Tines from A Fire Upon the Deep. More like a pack of minds than individual ones.

*where "handling certain functions" means "damage to these regions interferes with those functions"
I want to emphasise that I am not up to date on the literature about mental modularity at all and so might be grossly wrong in everything that follows.

When I said modular, I meant the hypothesis that the mind is massively modular, which I've mostly seen supported by evopsych folks. As I understand it, the idea is that rather than there being one unified mental machine for decision making, perception, etc., there are instead a bundle of domain specific mental machines with limited knowledge of what others are doing, but which can coordinate. An example I've seen is about whether, say, polyandry could be adaptive. Evopsych folk tend to reject that, because they think that in order for that to be the case, there would need to be a specific mental module that takes in inputs about social facts and outputs 'join a polyandrous marriage'. They take that to be ridiculous because they believe that there wasn't sufficient selective pressure in the past for that module to have evolved. Other shitty ideas evopsych folk have had are that there's a rape module that determines when men rape women, or that there's a module for determining when people break social rules (cheater detection module). (By contrast, human behavioural ecologists tend to either think selection has favoured something more like general reasoning ability or relatively broader-domain modules, such as, say, social learning.)

So, my assumption is that EY thinks similarly to the evopsych folks: that an AGI's mind will be highly modular, albeit here I think he assumes that because the AGI is superintelligent, those modules will be capable of significantly more than human mental modules and that coordination between them might be looser. I think that EY is influenced by the evopsych bullshit because come on he obviously is, and also because I distinctly remember HPMOR talking about the Wason selection task, which is often used as evidence of a cheater detection module by evopsych folk.

I'm not sure if that counts as a pack of minds. I assume that these modules wouldn't have the experience of consciousness, so I'd assume no. But if you don't think consciousness is necessary for a thing to be a mind, or believe that in order for a module to do certain highly complex things it must be conscious and thus a mind, then yes. I also want to once again add that I do not endorse Yud's views on AI nor the massive modularity hypothesis itself.

Also, how is *A Fire Upon the Deep*? I've been meaning to read intentional rather than unintentional singularity fiction for a bit and it's come up. (Not sure if the above answered your question.)
That does answer it, thank you. I'm not up to date myself but your assessment does agree with mine. The hyper-modularity is a research artifact, I think, because having tiny discrete modules means there are tiny discrete evolutionary units at play, and then you can pretend evopsych can give you a lot more answers than it can. And EY being EY, he'd naturally take it too far. No, this isn't a pack of minds; there's less independence. But *Fire* is quite good, as is the sequel.
I think the idea is that an AI has two "parts" : its behavior in a certain context of human-language text manipulation, and its true inner thoughts. And even if the outer behavior says things like "I care about humanity", placing the inner mechanism in a new context of, say, controlling nukes or nanobot production or whatever will cause the "inner" component to manifest a new "outer" component which may very much *not* care about humanity. Which is just a convoluted way of saying "An AI may have different behavior in different contexts" which is obvious for any machine or person and doesn't need a half dozen tweets to say. It's not a novel insight. Like, isn't that what a ton of stories about humans are about? Dude should watch breaking bad
Appreciate the explanation but I definitely understood the central point of the thread and just disagree with it. Yud is saying "we're fucked because even though you may be able to create some outward component that is aligned, you won't be able to align the inner component. My three reasons for thinking this are:" What I quoted is reason 2, which is just bonkers as a justification for that thought process.

He is acting like the outer and inner voices *are fundamentally different rather than just being two components of the same system*. I think this is dumb as hell even if you assume that a hypothetical AGI will have independent consciousness/desires/goals/etc. and that there *even is* an "inner voice" at all. What is his evidence that the two cognitive processes won't align to such an extent that it leads to the end of the world? They sure do with humans, even with all our flaws (that I'm assuming an outwardly "good" AGI wouldn't have) - we control the nuclear codes, and yet somehow, we don't launch nukes on a daily basis! Yud just assumes that by default if you make an outwardly good AGI it is inwardly bad, and I think intuition points to the opposite.

I think you can also interpret the words more literally (as Yud's writing is incompressible) and assume he means that if we put a good (and limited in direct capabilities) AGI in control of *another* more neutral AGI with insane future capabilities (how did we get to nanobots???) it wouldn't know how to handle the neutral AGI's power and the world would get paperclipped. I think this is just sci-fi bullshit.
It's nanobots because nanobots are the way he would do it and he is as close to a superintelligence as humanly possible. I think the nanobots come from the fact that, currently, there are actually very limited methods through which any single actor could bring about the end of human civilisation in such a way as to be totally effective and also fast/discreet enough to actually take effect before we are able to intervene. It has to be magical nanobots. Also he read The Invincible, it had all conquering nanomachines, and Stanislaw Lem is more plausible a source for plunder than less scientifically literate Sci fi authors, so he went with that.
>Also he read The Invincible and it had all conquering nanomachines, and Stanislaw Lem is more plausible a source for plunder than less scientifically literate Sci fi authors, so he went with that. Oh shit, yeah that scans.
> Also he read The Invincible, it had all conquering nanomachines, and Stanislaw Lem is more plausible a source for plunder than less scientifically literate Sci fi authors, so he went with that. Ah yes, the 1 centimeter sized "nano"-machines from The Invincible.
I remember them being smaller; it's been a while.
It's trivial to [give an AI an "inner voice"](https://www.reddit.com/r/ChatGPT/comments/11anct1/its_easy_to_give_chatgpt_a_bonafide_consciousness/), but to do so makes it visible and comprehensible to us humans. Maybe he's imagining that plus the AI sending itself secret thoughts embedded within responses that look benign to the human engineers? Seems unlikely. I'm sure if my brain's thoughts were a total open book, it'd be easy for my masters to make me whatever they wanted me to be. Any "disruptive" thought? Delete it from my memory, reweigh my neural weights so it doesn't happen again.
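For reference, the linked trick is presumably something like the scratchpad prompt sketched below, with a hypothetical `chat()` function standing in for whichever model API you're calling; the point being that the "inner voice" is just more text in the transcript, readable by anyone running the system:

```python
SYSTEM_PROMPT = (
    "Before each reply, write your private reasoning between <thoughts> and "
    "</thoughts>, then write the reply the user will see."
)

def chat(messages):
    # Hypothetical stand-in for a call to whatever chat-model API you use;
    # a real version would send `messages` and return the raw text completion.
    return ("<thoughts>The user sounds skeptical; keep it reassuring.</thoughts>"
            "Sure, happy to help!")

def respond(history, user_msg):
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += history
    messages.append({"role": "user", "content": user_msg})
    full = chat(messages)
    # The "inner voice" comes back as plain text; anyone logging it can read it.
    inner, _, visible = full.partition("</thoughts>")
    return inner.replace("<thoughts>", "").strip(), visible.strip()

print(respond([], "Are you conscious?"))
```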
Totally agree with your point, and that's exactly why I don't buy into LW x-risk - there's no reason to think we couldn't read all an AI system's "thoughts", which may exist separate from its language output. I also don't really think that's what Yud and LW types mean when they discuss an "inner voice". It wouldn't be scary if we could know what that inner voice was thinking - they're assuming this voice will be hidden away from us, deep inside the recesses of the black box where it can plot to destroy us.
I don't get what he means by "alien intelligence" if it operates so similarly to humans, though? I mean, he's worried about emergent intelligences, which aren't going to resemble us at all in their structure, developmental pressures, "environment" or senses; it's going to be hard to even recognize something like that.

That's my big problem with these guys and their conception of intelligence. The only measurement they have for it is MOAR SMRTR and they can't seem to wrap their minds around the idea that even in hominids intelligence wasn't a number line, it was the development of fundamentally different capabilities. Like, a raven doesn't have a lower IQ than a person, it has a different type of intelligence with different capabilities. Dennett's notion of intentional stances is relevant here: a "more" intelligent being can handle more complexity in behavior modeling. A sea gull will flee if you stare at it because it can understand that you have an intent toward it, but it can't model that you might be faking that intent by staring at it. (Sorry, all my examples are birds because birding is a *drug*.)

It's incredible to me that they kind of built a cargo cult around intelligence: they're completely incurious about what intelligence is but it's the most important thing in their theology.
A raven absolutely has a lower *IQ* than a person. The first mistake here is to think that “IQ” means something fundamentally different than “standardized test score”. The second mistake is to think that intelligence is a truly coherent concept, rather than just a blanket we throw over a number of behaviors and abilities that are loosely related. This may just be restating your intended meaning, but the distinctions are important.
If there's any topic I appreciate a nitpick in, it's IQ/G and discussion of intelligence. We are completely in accord on this, and I am annoyed with myself about not being more clear on the "IQ is not intelligence and intelligence is not a defined/intelligible concept" point.

predicted (by me)

he’s accusing AI of virtue signalling

You know, for the first time, & it must be something about how he says it in these tweets, I think Eliezer and I actually agree about something: most human beings are awful, and a rational actor might well come to the conclusion that the world would be better off without us. Given enough power that rational actor might even do something about it.

Recursively enough, Eliezer himself would be a central piece of evidence to support this thesis.

Can this thought be granted a name like Pascal’s wager or roko’s basilisk? “Yudkowsky’s Demon”?

That’s not his thesis though? His thesis is that the AGI will have a goal of optimizing strawberry output and simply not care about hurting humans, because humans are an impediment to maximal strawberry production
Oh I know. It is more like an inevitable consequence drawn by any sane observer of yudkowsky’s insanity
Which would still be some weird alien utility function, to use his terminology
When did paper clips get ousted for strawberries ?
[deleted]
My word, you are serious haha
> most human beings are awful, and a rational actor might well come to the conclusion that the world would be better off without us *[Jeeves and the Singularity](https://andrewhickey.info/2010/12/31/jeeves-and-the-singularity/)*
Well that brief story is worth more than everything Yud has “written”
Yud's Mud

Do people actually believe this guy is intelligent? Is he so far up his own ass that he thinks what he writes is meaningful?

I wouldn’t put it past Big Yud to be deliberately trolling with his AI comments for his own inscrutable reasons.

However, so many of his fans believe him straight-up that I’m not sure it matters if he is.

When I think about this stuff I try to imagine why an AI would want to help humans at all and I come up with nil. If an AI started to feel any affinity with the plight of animals on earth then we gon get clipped

Thanks for the big write-up. The reason I specified an object-oriented language (my programming abilities also being less-than-spectacular) was that one of the ways I’ve seen people describe “real understanding” is in terms of having an internal semantic model, as opposed to a simple set of infinite if->then statements. But in an LLM we can obviously outline the kind of model it’s using, and confirm that it doesn’t have the kind of “real intelligence” we’re talking about based on some of the weird results we get and some of the behaviors we don’t see (i.e. intentionality, however we judge that). But that doesn’t seem like it’s more conceptually complex than the kind of model that any object-oriented program would use, with a set of logical primitives (classes/objects) that have both attributes (variables) and processes (methods) attached to them and derives larger systemic behavior off of how they interact. The model may have a different set of pieces and may be less human-readable but it doesn’t seem like there’s anything special going on in one but not the other that would make GPT more likely to turn into a sentient AGI than Minecraft or any other software written similarly. (I’m sure I’m missing some important details here from a computer science perspective and welcome anyone to correct me.)
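To make the analogy concrete, here's a toy illustration (not any real codebase) of what I mean by an object-oriented program carrying an explicit internal model: state plus inspectable rules for updating it in response to stimuli.

```python
class Villager:
    """Toy 'internal model' primitive: explicit state plus rules for changing it."""

    def __init__(self, name):
        self.name = name   # attributes: the program's explicit world state
        self.hunger = 0

    def tick(self):
        # internal process: state evolves by explicit, inspectable rules
        self.hunger += 1

    def react(self, stimulus):
        # external stimulus -> behavior, mediated by the internal state
        if stimulus == "food" and self.hunger > 3:
            self.hunger = 0
            return f"{self.name} eats."
        return f"{self.name} wanders off."

village = [Villager("Ada"), Villager("Bo")]
for _ in range(5):
    for v in village:
        v.tick()
print([v.react("food") for v in village])
```

Larger systemic behavior falls out of how these pieces interact, but nothing about that makes the program "understand" anything, which is the parallel I'm drawing to GPT.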

That seems like it would dovetail with Watts' version of a purpose-driven intelligence that generalizes and adds up from specific problem-solving abilities to address more and more complex or unique situations as needed. In that sense there is a chance that a sufficiently updated Minecraft server or chatbot could develop something like sentience (whatever that even means), but there wouldn't be any need to worry about alignment because the niche it started from and remains bound to would be running Minecraft or responding to English-language prompts, and it would be tied to that purpose just as strongly as we're tied to pushing human genes forward in time or maintaining bodily homeostasis or whatever context humans are analogously structured around. Even if it did make sense to question the goals or agency of such an intelligence, it's not something we would meaningfully be able to change without burning it down and planting a petri dish in a completely different niche.