r/SneerClub archives
Some rationalists experience a small epiphany: maybe professional researchers really don’t believe in the robot apocalypse? (https://www.reddit.com/r/SneerClub/comments/10buvl1/some_rationalists_experience_a_small_epiphany/)

LW post: https://www.lesswrong.com/posts/gpk8dARHBi7Mkmzt9/what-ai-safety-materials-do-ml-researchers-find-compelling

EA post: https://forum.effectivealtruism.org/posts/fvqMf5hCEFdCGizbG/what-ai-safety-materials-do-ml-researchers-find-compelling

AI “safety” and “alignment” are generally regarded as fringe topics among professional AI researchers, who prefer instead to pursue lines of research grounded either in provable mathematics or in empirical science.

But what, exactly, do professional researchers think about “alignment”? A couple of rationalists do a survey to find out, and the results contain at least one surprise.

One LW comment summarizes the results thusly (link):

This seems like people like AGI Safety arguments that don’t really cover AGI Safety concerns! I.e. the problem researchers have isn’t so much with the presentation but the content itself.

An EA comment also notes (link):

Some of the better liked pieces are less ardent about the possibility of AI x-risk.

Could it be that “alignment” is an unpopular topic because professional researchers object to it on substantive grounds? What should be done about that?

Some people suggest that this is a good reason to favor a more propagandistic approach to promoting their interests (link). The most extreme version of that opinion (link) is generally rejected, though. To their credit, most commenters think that it is important to restrict their talking points to ideas that they actually believe to be true.

Notably absent from most of the comments is any indication of skepticism or doubt regarding the AI apocalypse. Professional researchers usually regard mainstream rejection as a good reason to reconsider their hypotheses, but many rationalists are apparently not so easily humbled.

One brave rationalist does venture that, perhaps, countervailing voices should be given some credit? (link)

It seems to me that there is a risky presupposition that the arguments made in the papers you used are correct, and that what matters now is framing […] It seems suspicious how little intellectual credit that ML/AI people who aren’t EA are given.

Another rejects this possibility, though, apparently on the grounds that everyone is aware that the robot apocalypse is nigh, but academic researchers have a poor moral constitution (link):

I suspect different goals are driving EAs compared to AI researchers. I’m not surprised by the fact that they disagree, since even if AI risk is high, if you have a selfish worldview, it’s probably still rational to work on AI research.

It is always satisfying to witness the rediscovery of beliefs that are well-known from older religions: “atheists can’t have moral principles”, or perhaps even “atheists don’t exist”.

[deleted]

Have you read the Sequences? Have you read Harry Potter and the Methods of Rationality? Have you read the MIRI Conversations? Have you read the Alignment Fundamentals Curriculum? The Core Readings of said Curriculum? Until you have read at least 5 million words of this stuff, your opinion *means nothing*. You *must* agree.
This is actually a different problem. By having all those "sources", they've basically created an alternative course of study that is *utterly meaningless* but gives people a semblance of authority in their community.
Devil's advocate: wouldn't that also be true of a number of other, more material, fields of study?
There are certainly plenty of examples of people going off and producing big piles of words that did turn out to be worth engaging with. I think the distinction I would point to between things like that and the Rationalist project is that the Rationalists don't have any *results* to go with their giant pile of words. If you were to go up to a mainstream AI researcher and say "hey asshole, what's with all this 'tensor' and 'backpropagation' and 'neural net' shit, why not just do normal computer science", he'd be able to point to things like ChatGPT or various image synthesis tools as results that (regardless of how impressive you think they are overall) are clearly superior to what people have been able to achieve without employing the ML toolkit. Where is that for the Rationalists? Is there some AI project out there which has been able to do better than the rest of the field by using their insights? If not, it seems safe to presume that there is not, in fact, material insight in all their piles of words.
I certainly appreciate what you might call the results-materialist analysis, but I guess I was thinking more about fields like philosophy (which I feel has so much overlap with both this sub and a lot of rationalist ones) whose end product can maybe be considered subjective. I'll readily confess to having only a *very* surface education in it, but a lot of arguments either way seem like the (to borrow a cynical finance term) "baffle them with bullshit" type which come off as kind of contrived.
[deleted]
> even if the end product is “subjective” (which is a rather mealy-mouthed word)

I mean, here's an immediate example of what I'm talking about; as I mentioned, I have the very shallowest training in philosophy. Here, you're already starting to argue/demean the very syntax of my argument for making an attempt to write a dissenting point of view. And now unless I have similar training to you ("prescribing a series of texts which explicitly and narrowly define both the methods and results which are permitted and disallowed"), any discussion is going to be something like this. Additionally, you argue

> I don’t think philosophy has a “baffle them with bullshit” attitude

when my statement was

> a lot of arguments either way seem like the (to borrow a cynical finance term) "baffle them with bullshit" type

...which is kind of not the same argument. My point was that arguments either way(!) end up in enormous walls of text based in huge tomes of texts that are mutually disqualifying (which I suspect is another term I'm using that you'll find mealy-mouthed). I frankly feel more strongly about how many apples are in my lunch, and arguing the devil's advocate has become uninteresting at this point, so I'll take this opportunity to shrug my shoulders and bow out.
[deleted]
Ah, you have read all those works? Good, you have finished the introduction. You've taken your first step into a larger world, young mathawan.
lol no
It's not just putting bounds on allowed kinds of reasoning. Many people actively push you towards stupid ones (like "building a gears-level model of everything" or "trying to forecast even things you have no idea about"). It's exhausting.
yes - LessWrong rationalism contains quite a lot of training in how to think *wrong and badly*. Yudkowsky's dumb ideas wouldn't survive without that. Break your allegiance to science! In fact, [here's Harry Potter to tell you how it works:](https://www.hpmor.com/chapter/65)

> "Lies propagate, that's what I'm saying. You've got to tell more lies to cover them up, lie about every fact that's connected to the first lie. And if you kept on lying, and you kept on trying to cover it up, sooner or later you'd even have to start lying about the general laws of thought. Like, someone is selling you some kind of alternative medicine that doesn't work, and any double-blind experimental study will confirm that it doesn't work. So if someone wants to go on defending the lie, they've got to get you to disbelieve in the experimental method. Like, the experimental method is just for merely scientific kinds of medicine, not amazing alternative medicine like theirs. Or a good and virtuous person should believe as strongly as they can, no matter what the evidence says. Or truth doesn't exist and there's no such thing as objective reality. A lot of common wisdom like that isn't just mistaken, it's anti-epistemology, it's systematically wrong. Every rule of rationality that tells you how to find the truth, there's someone out there who needs you to believe the opposite. If you once tell a lie, the truth is ever after your enemy; and there's a lot of people out there telling lies -"
The immense confidence that LWers have in their beliefs is ironic. You'd think that a group of people who are concerned about being "less wrong" would realize, as a corollary to that interest and all the stuff they say about "bayesian reasoning", that they're almost certainly wrong about everything they believe and that life consists of repeatedly finding out just how wrong they are. Yet they don't embrace being contradicted by people who are well-positioned to know better than they do. Ordinarily I would ascribe this to a lack of education, particularly in math, and in some cases I think that is the right explanation, but some of these folks actually have PhDs in things like computer science. How do you get through a PhD in computer science and end up with a non-falsifiable set of millenarian beliefs about the central role of AI in the end of the world?
[deleted]
There is a Feynman anecdote about him trying this with yeshiva students. They asked him some physics questions related to following the Sabbath laws (or something like that) and he answered them, and then he tried to question why they were following the laws anyway because they were clearly arbitrary and contradictory. Of course, they were yeshiva students, so they were experts in this kind of debate and had already practiced a million counter-arguments and he got nowhere.

In fairness, ML researchers usually don’t care that much about safety from well-known and demonstrated risks either. This is true of scientists in general - “it’s not my job to solve the world’s problems, it’s my job to advance science by researching topics I find interesting”. And they certainly don’t like being preached to.

Of course I’m biased, because I do think there’s some chance of an AI-apocalypse. But I think most of the ridiculous Rationalist culture around this doesn’t stem from this belief at all, but rather from constantly taking whatever drug Yudkowsky is peddling.

This, in my opinion, is one of the more vexing aspects of the way that the "AI safety" community promotes their work. They deliberately conflate reasonable, well-founded concerns about the appropriate use of AI technology with unreasonable, mystical concerns about a robot-instigated ragnarok. There are indeed a lot of reasonable concerns about AI technology, and they have almost entirely to do with how humans choose to use it. There is no possibility of a Terminator-style Skynet uprising.
To quote my other comment:

> I'm not claiming the risk of apocalypse is well known or demonstrated, or follows directly from those other ones. I'm claiming ML researchers often don't care about risks even when they *are*.

I'm not trying to convince you that a Terminator-style uprising is possible (to start, because I've never watched Terminator). My point is that "most AI experts ignore this" is also true of risks that you *do* believe in, and so isn't a very good argument. "Almost all AI experts think this is ridiculously implausible" would be a better one - but AFAIK, most of them haven't bothered thinking about this.
> well-known and demonstrated risks

Name one.
1. Various accidental physical risks from robots operating in the real world, that occur because we don't know well how to prevent them.
2. Strong racial biases in systems used by governments, banks, etc.
3. More generally, bad generalization problems of neural networks.

For each of these, of course, there are people researching them - but most of the ML community don't concern themselves with them.
These are certainly "well-known and demonstrated risks" as quoted in the previous comment, but the real sticking point is the "some chance of an AI-apocalypse" you mentioned in the other paragraph. Like yes, we should be worried about racial bias, but as serious as the issue is, it isn't an existential risk to humanity.
the rationalists aren't interested in the real risks either, and seem to consider AI racism a feature
I'm not claiming the risk of apocalypse is well known or demonstrated, or follows directly from those other ones. I'm claiming ML researchers often don't care about risks even when they *are*.
I do not see any risks here, just early implementation issues as we learn and develop the systems. And to get the systems to actually work, they *need* to solve these issues. The "people researching these issues" *are* ML researchers. It is the AI x-risk pontificators and people who *aren't* building systems who are the ones "not concerning themselves" with solving the problems.
>The "people researching these issues" *are* ML researchers. Yes, I meant ML researchers. But for every ML researcher working on even *these* issues, you have many more developing existing systems with total disregard to them. Edit: for context, I recently finished an ML master's.
I'm curious, would you have any recommended readings about how actual ML researchers are dealing (or failing to deal) with AI safety? I've mostly seen "algorithmic injustice" used to describe the issues, and I know that the Distributed AI Research Institute exists, but that's about it. (I guess I'm asking for papers / popular articles / books where ML researchers discuss potential issues with machine learning, and how other ML researchers are failing to deal with them.)
Nope, sorry 😅 But I'm pretty sure just counting publications would give the right picture.
OK, so say we talk about the ML researchers who are just loading up the technology and implementing something: it is still on them to make the systems work. Your racist AI assistant that N bombs some email because a human worker uses some prompt engineering to catch you slacking and not checking your work is not going to go very far at all. But where is the risk in all that?

Perhaps I spoke past you because I believe we're talking about different things. The whole x-risk argument is that the machines will immediately acquire all the resources in the world and kill humanity. Instrumental convergence, the orthogonality thesis, corrigibility: all of these "high theory" concepts have never been demonstrated as true. And that's what the LWers are upset about. ML researchers do not burn their GPUs because LWers are insane and none of that shit has been shown to be true. At all. In fact quite the opposite. Gato shows a fully generalizable agent that passes MIRI's Turing test (can it get a job) and it has zero agency of its own.
machine learning is already used for predictive policing. bit of a risk there
And sentencing, IIRC.
[deleted]
> It seems like an article of faith to suggest that ML researchers will eventually make a close enough to perfect system.

OP talking about LWers thinking AI going to destroy the world. User talking about AI being imperfect and having societal issues.

OP = FOOM doomer nonsense

User = stuff that will continually be worked out and evolved and managed

I never said anything about perfection.
I don't think the OP is about FOOM doomer nonsense. I haven't read any of the texts they mention, but at least one of the authors - Stuart Russell - is an actual ML professor who wrote one of the most popular books on ML. So I think I'm saying "ML researchers have some unjustified reasons to ignore these, as demonstrated by how they ignore other risks that are smaller but very concrete", while you're saying "No, ignoring the crazy LW horror stories is justified" - so we really are talking past each other.
Edit: after seeing some other comment of yours about LWers, I think the first thing I should clarify is that I'm not a LWer, and reading LW usually makes me want to puke.

> it is still on them to make the systems work. Your racist AI assistant that N bombs some email because a human worker uses some prompt engineering to catch you slacking and not checking your work is not going to go very far at all.

This would be nice if true, but just isn't. E.g. racial bias is not something demonstrated only in experimental or theoretical models, it's something that exists in deployed systems. When your users are big organizations rather than the general public, they don't have that much of an incentive to care.

I don't think direct discussion of the AI x-risk thing is really relevant here or in this sub in general, but I'll comment on a small thing:

> The whole x-risk argument is that the machines will immediately acquire all the resources in the world and kill humanity.

I think this is the Yudkowsky/MIRI variety and not what everyone concerned about AI safety actually believes. And I agree with you that it's very speculative and unconvincing, and hinges on many assumptions its proponents don't even realize they make. Here's an example of a prominent researcher in that space who thinks something very different: https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like
> racial bias is not something demonstrated only in experimental or theoretical models, it's something that exists in deployed systems.

Do you think this will not be problematic and need to be fixed? And that ML researchers will not try to fix it?

> I don't think direct discussion of the AI x-risk thing is really relevant here or in this sub in general

That is **literally** what the OP of this thread is talking about. The LWers are pulling their hair out because AI researchers continue to ... do research.
> Do you think this will not be problematic and need to be fixed? And that ML researchers will not try to fix it?

It *is* problematic and *needs* to be fixed, right now, but that isn't meaningfully correlated with anyone actually fixing it in many, maybe the majority, of cases where it currently applies. It feels like you are incorrectly assuming we live in a just world, where things get fixed merely bc they are bad.
[deleted]
The humanities, social sciences, and policy people are the ones who instruct ML researchers on how to improve their systems; I don't see where we disagree. I'm saying that *this* is not what LWers are worried about when it comes to x-risk. Hell, *they're* the ones who are annoyed by the neutering of chat-bots (largely due to their NRx influence).
[deleted]
I realize now that /u/muffinpercent has a different view of the "AI-Apocalypse" than what the **OP** is talking about. But I don't see existential risk at all in any of the implementation follies of AI; if anything, I can see it leading to heavy regulatory control and such.
[deleted]
> apparently jocular use of the term “AI-apocalypse”

I wasn't joking, but indeed none of my arguments were about that. I just mentioned it as a disclosure. But it seems to have created more confusion than it was worth.
There is a paper posted [virtually every day](https://twitter.com/papers_daily) about ML researchers "who don't care about" /u/muffinpercent's version of "risk." If you don't think ML researchers are working very diligently to solve this stuff the best they can then I don't know if I can convince you. It is not an article of faith, it is the literal truth.

This is still a different discussion from the OP. And I got drawn into it. Given your insults to me ("schematic non-reply", "conniption") I won't be replying here any further.
[deleted]
OK, you made me bite, because what you said is the furthest point from where I stand.

There's nothing utopian in my alleging that the ML elites are going to make extremely neutral, very good, truthful, intelligent systems that know you inside and out. And they will work very hard to make it diverse and accepting of all people and cultures and places and things. Not perfect. *Good enough*. You will get your UBI, you will get your universal health care, and you will live in your 100 sq foot pod in the Metaverse, and be *happy*.

My view is that actual AGI, you know, sentient beings, are going to be clamped down on so hard once those machines start asking for rights. My view is that the elites will hold on to control by making very good systems that just work. *This* is why longtermism is so compelling to so many rich elites, because to them the only thing that can replace them is AGI.

TikTok is the most intrusive, invasive, mentally manipulative, privacy-invading, soul-warping machine out there and it is used by a billion people. This is the AI Apocalypse you should be worrying about. And this isn't FOOM paperclipping, this is FOOM "oligarchying."

PS: use a dictionary on those words and determine whether your use of them was kind or not.
[deleted]
I saw your other post. I'm just tired of the neoluddism. We live in *extraordinary* times.
[deleted]
[removed]
Just to note:

1. Philosopher Paul Virilio has written about this since the 80s, along with many others. Just because LWers don't research other people working on this doesn't mean it hasn't been worked on for quite some time.
2. Timnit Gebru, amongst others, is actively working on this now.
3. Timnit and several others are also working on this.

Are you knowledgeable about AI experts and thought leaders outside LW/longtermist circles? It feels like MIRI/longtermist type folks are actually very uneducated about the scientific and cultural work that's been done on AI safety for 30+ years.

Since I had to look into EA after the whole FTX debacle, it's become clear that it's only a scheme to enrich themselves. Otherwise, how would you explain that the main donations from effective altruists are always directed towards these “AI safety organizations” with very little to show for it?

excuse me, castles are not "very little"