r/SneerClub archives

So anyway the basilisk has recently taken to delivering my eternal simulated torture in the form of the mind-numbing Fridman/Big Yud interview and listening to their content-free ramblings about whether or not there’s a little person inside GPT doing the thinking, and the sonorous nonsense put me in a trance-like state in which I came to a maybe simple realization: when AI dorks describe the black box problem as if we don’t know what the AI is “doing”, they are giving away the fact that they don’t know even the most fundamental principles of the technique. We know exactly what these models are doing: matrix multiplication. They’re taking an input vector and multiplying it by a big matrix to map it to an output vector. That’s it. It’s basic linear algebra at scale. By acting like that is some mysterious, incomprehensible machine thought process they’re obfuscating just how dumb and mechanistic the thing being done actually is. No competent undergrad in any math program anywhere would claim to be so mystified by what a matrix multiplication “is really doing”, and yet these AI charlatans get away with acting like it’s anything like thought. Hell, the obfuscation is so successful that despite knowing exactly this fact, I’ve repeated the “black box problem” as if it were a real problem myself. It’s literally just linear algebra, what the fuck.
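For concreteness, here's a bare-bones sketch of the "input vector times big matrix" operation being described; the sizes and values are arbitrary, picked purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 768, 3072             # layer sizes chosen arbitrarily for illustration
W = rng.normal(size=(d_out, d_in))  # the learned weights live in matrices like this
x = rng.normal(size=d_in)           # an input embedding vector

y = W @ x                           # one layer's worth of the operation in question
print(y.shape)                      # (3072,)
```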

So, on the one hand, there is sort of a black box problem in the sense that it’s not always easy to pinpoint exactly which qualities of an input feature are being used to arrive at a model’s output, and there are a lot of smart people who spend their time trying to figure that out. Like sure yes it’s “just matrix multiplication”, but thanks to group theory and whatever we know that matrix multiplication can stand in for literally any function at all, so it’s a nontrivial issue to figure out why a model uses a particular matrix in doing those multiplications. Thought itself is undoubtedly representable in linear algebraic terms (whether that’s what ChatGPT is doing is another matter, though).

On the other hand, it is also true that when rationalists lament being mystified by neural networks it’s often because they don’t know even the most basic elements of linear algebra and calculus. This is especially true of Eliezer Yudkowsky.

One thing is that the 'neural net' doesn't encode experiences in it, and it doesn't evolve. That is, training and inference are two different steps and don't overlap. ChatGPT, as I understand it, re-runs the entire input of the entire chat as a form of 'memory'. But in terms of a 'strange loop', this ain't it. Perhaps it's a trivial step to get to the strange loop stuff, but the lack of a memory, and how inference vs. training differs from how the brain works, have been known for a while. There are a lot of advantages to the current architecture, whereby the matrix math is driven by an outside computing force. All this stuff about 'is there a there there' regarding GPT just seems like... well, poorly written scifi. I mean, Big Yud is best known for writing fan fic. Which is a fine genre, to be sure. But that just isn't... rigorous real work. The ability of the human mind to deceive is very good, and that's why publishing peer-reviewed papers is a thing: to cut down on the lying (to oneself).
This is what always gets me about it too. Regardless of the exact process it uses to produce a response to a prompt, we know on some level that it's just a program that does a bunch of math, converts it into words, and then exits. Like, it doesn't remember things between invocations, and we know it's just trained to reproduce the surface appearance of thought (by producing word sequences) so it's unclear to me how it would even possibly have an experience of continuous consciousness. But I guess humans are the species that will look at basically any three vaguely-close-together shapes and decide they look like a face, so this is sort of inevitable lol
ChatGPT does remember things between invocations; my understanding is that they maintain the built-up context memory between uses for each user. An autoregressive model is not like a static classifier. The context that it maintains is equivalent to a time delay embedding for a dynamical system; it actually can, in principle, evolve in time and remember past experiences.
Does anything inside of it resemble the process of choosing which past experiences to focus on based on utility?
Yes, that's pretty much exactly what the "self-attention" mechanism does. LLMs have multiple "attention heads", each of which analyzes the past in a different way, and the model aggregates the information from all of these when making a prediction about which token should come next. The only utility that the model is directly trained on is predicting the next token, but that mechanism alone is enough to be able to learn multiple different kinds of functionality depending on the starting prompt that a user provides.
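A toy, single-head sketch of that idea (hand-rolled numpy, not the actual GPT code): each position builds a data-dependent weighted summary of the positions at or before it.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
T, d = 5, 8                        # 5 tokens of context, 8-dim embeddings
x = rng.normal(size=(T, d))        # token representations
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)      # how strongly each token attends to each other token
mask = np.triu(np.ones((T, T)), k=1).astype(bool)
scores[mask] = -np.inf             # causal mask: a token can only look at the past
weights = softmax(scores)
out = weights @ V                  # per-token weighted mix of past information
print(weights.round(2))            # each row sums to 1; row i only uses tokens <= i
```

A real transformer runs several of these heads in parallel and concatenates their outputs, but the mechanism per head is just this.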
So does that mean those AIs looking for new molecules are the same, but with a functional molecular structure as the target rather than a prose token? With the molecule just being the token preferred by that kind of bot?
Yeah that's basically it. If you prompt it with "write me a poem about star wars", it'll figure out that the tokens that follow should consist of a poem or something, whereas if you say "design a molecule with the following binding affinities" then it'll know that you expect some sort of molecule definition as output (with the form of that definition depending on the training data). You can do this with images etc. too. If you encode e.g. images as sequences of tokens then you can train an LLM to work with images and text together. To be clear, though, most of the models that people use for physical chemistry are not LLMs, they're specialized for chemistry. But you can make an LLM do that if you want to.
The situation is a little more complicated than that. An autoregressive model (of any kind) with a long context length actually does have a memory for experiences and is capable of learning things and evolving over time. An autoregressive neural network, in that respect, is not a static store of precomputed learning, it's really the time evolution operator for a dynamical system.
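A toy sketch of the autoregressive loop being described. The "model" here is just a stand-in (a trivial bigram lookup rather than a trained network), but the loop is the point: the network itself is a fixed function, and the only state that evolves is the context that gets fed back in at each step.

```python
import random

def toy_next_token(context):
    # stand-in for a trained network: predicts from the last token only
    bigrams = {"the": ["cat", "dog"], "cat": ["sat"], "dog": ["ran"],
               "sat": ["down"], "ran": ["away"]}
    return random.choice(bigrams.get(context[-1], ["the"]))

def generate(prompt, n_steps):
    context = list(prompt)                       # the model's only "memory"
    for _ in range(n_steps):
        context.append(toy_next_token(context))  # each output is fed back as input
    return context

print(generate(["the"], 3))   # e.g. ['the', 'cat', 'sat', 'down']
```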
Yeah, I was being a bit glib about it for comedic purposes, there are obviously reasonable unresolved questions about it. It's just that Big Yud et al aren't talking about those questions at all.
> Thought itself is undoubtedly representable in linear algebraic terms

Say more
Well it's trivially true in the sense that (as far as we know) quantum mechanics is correct and the time evolution of everything in the universe is actually a linear equation. Your brain is made of atoms etc so QED. It's probably also true in less trivial senses, like you can represent or approximate a pretty staggering number of different things using just linear algebra. [E.g. function composition itself is a linear operator.](https://en.wikipedia.org/wiki/Composition_operator) I actually can't think of anything that can't be described somehow in linear algebraic terms, but that might just be ignorance on my part. If thought is just computation (and I think that it is) then I'd be shocked if I couldn't write it down in purely LA terms.
For example, imagine you made a computer that preserves information - it can be run in reverse if you want to. You can represent that using (possibly enormous) permutation matrices.
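As an aside on the composition-operator link above, the check is one line: even when the inner map φ is wildly nonlinear, composing with it acts linearly on the space of functions.

```latex
\[
  (C_\phi f)(x) := f(\phi(x)), \qquad
  C_\phi(\alpha f + \beta g)(x) = \alpha f(\phi(x)) + \beta g(\phi(x))
  = \alpha\,(C_\phi f)(x) + \beta\,(C_\phi g)(x).
\]
```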
I think the idea that thought is computation needs a lot more examination.
If you believe that thinking is a process that is caused by physical mechanisms in the brain then it's impossible that it can't be reduced to computation. If you believe that thinking *isn't* a process that is caused by physical mechanisms in the brain then you are rejecting empiricism. That's a valid belief, in the sense that any belief is valid, but it's more of a spiritual or religious belief than a serious proposition about how the world works and how we should choose to interact with it.
Oh you mean that at the very least what we do in our brains when we think can be described mathematically, not that our brains are like computers.
Sort of. I think our brains *literally are computers*, but there are a lot of different ways of doing computation and I would agree that brains do not operate in any way like a laptop computer or something.
I feel like to call brains computers would require expanding our definition of a computer in pursuit of a metaphor that is misleading to most people who hear it. Like, most people are not going to understand that in terms of 'our brains are computers, but not like any computer that actually exists.' They understand it in terms of our brains being a smarter version of the computers we have. Then again, since around 1600 we always seem to describe our world in terms of the machines we build. To me 'our brains are computers' is the modern heir of seeing the universe in terms of clockwork. In either case, for AI purposes, making a machine that is like our brains would require understanding our brains and reverse engineering them, rather than assuming that anything that produces output like us is like our brain.
I mean yeah, the average person has no hope of understanding any of this. The average person doesn't even know what a computer is. That doesn't change the fact that a brain actually is a computer, though. I assume this will change with time though, too. People don't actually know what computers are because they've never really needed to; until relatively recently they've been a niche tool used only by an educated elite. I think that, in the near future, computer programming will be given equal emphasis with being able to read/write and do arithmetic, as a basic mental skill that any literate person possesses. Presumably, at that point, the "brain is a computer" thing will seem not only plausible but obvious to most people.
I am not sure what 'the brain is a computer' as an objective statement means. If we define our terms in a certain way it can be true. How does this metaphor help our understanding? For its flaws, the mechanistic clockwork universe allowed us to make sense of basic orbital mechanics and a lot of other problems of classical physics. What similar explanatory power do we get from the idea that the brain is a computer? (One of my core assumptions here is that all of this is merely a contingent mental model that is useful or not, not a description of an objective reality, because at heart I am a filthy postmodernist.)
> What similar explanatory power do we get from the idea that the brain is a computer?

If anything, it gets us a vaguely useful impression of what a *computer* does. Otherwise none that I can see.
Well, what's the purpose of the brain as an organ? It processes information for the purposes of decision making. That's also the definition of a computer. If you think that sounds like an obvious and trivial statement then I agree: it's totally obvious in the modern era that the brain is a computer. It shouldn't be controversial at all, and I wouldn't expect an educated adult to gain any new insight from it. If there's anything that's tricky about this then it's probably the implicit assumption that the specific hardware used for constructing a computer is in any way relevant to understanding the fundamental nature of what it does. Different kinds of computers can be better or worse at different kinds of things, but *information processing* is a form of mathematical labor that can be understood entirely independently of the hardware that is used for implementing it. What makes one computer better than another for a given purpose is speed, efficiency, or accuracy, but not functionality.
[deleted]
Sure, they're both things that carry water. Like I said, I wouldn't expect an educated adult to gain much insight from the observation that the brain is a computer, nor would I expect an educated adult to gain much insight from the observation that the grand canyon is a thing that carries water. And yet, sometimes people still do object to the idea. But maybe that's how the water in the grand canyon feels too: surely it can't be the same kind of thing as the water that flows through the spigot in your kitchen, right? It must be special because it's in the grand canyon.
[deleted]
My usual quip on this topic is, remember the first profession to be replaced by computers was the profession of *computer*. Which is to say computers resemble brains for... well, lots of reasons, but most saliently because they're machines designed and purpose-built to do certain kinds of numeric and symbolic information-processing work that we previously did by exercising evolved faculties of our human brains. So they are importantly different kinds of things in the sense that one is a machine *for* processing information in a way the other isn't, because in a naturalistic universe it makes no sense to think of humans and our organic capacities as being *for* anything at all -- except in the very contingent and limited sense of having a job with a job description. But that would be the case even if the actual computational work done is equivalent in every other relevant way, which it very well might be.
[deleted]
> These are all socially constructed ideations of what the brain does in a big way.

Oh, absolutely, but by the same token a semiconductor logic gate is a socially constructed ideation on top of what electricity and silicon do. There's a huge background of intersubjectively constructed meaning involved if you want to say anything more useful than "some energy moves around through some stuff in this pattern, or sometimes that one, sometimes that one" about either case. ... If I did have a point with respect to the original thread, I don't remember what it was, now. Oh well.
I think the brain as an organ is indeed reducible to the property of doing computation for the purpose of decision making. This is consistent with how we interpret other organs in the body: nobody objects when we say that the intestines are "for" absorbing nutrition, or that the heart is "for" pumping blood, or that the eyes are "for" seeing things (another information processing function, even!). The difference between organisms and, say, the grand canyon is that an organism is a discrete agent that pursues goals, and so it makes sense to describe its parts in terms of their uses in achieving those goals. Computation has this in common with metabolism. A fire is the same essential chemical process that occurs inside of an organism's cells - oxygen is combined with carbon and hydrogen to release energy - but we say that fire is different from metabolism because metabolism separates the released energy into two components: waste energy, and useful energy that is extracted to be used for various purposes. So too with computation. Information processing is something that occurs in natural settings all the time, but what distinguishes computation from other kinds of information processing is that it's used for a purpose.
[deleted]
I happily use the term "makes sense" because there are many ways to interpret the matter, and I think the important distinction is which of those ways is useful and justifiable, and which is not. If we really want to bottom the thing out then it is technically correct to just write down the Schrödinger equation and refuse to elaborate further, but that's dissatisfying for obvious reasons. It really is true that we can describe organisms as agents that act in the world towards accomplishing goals, and that this distinguishes them from other kinds of things for which that is not true. If we're thinking in those terms then that also distinguishes things like metabolism or computation from the broader set of phenomena that occur when the rate of entropy increase is accelerated or slowed down. I personally don't think there's a substantive difference between artificial tools that humans create and structures that occur naturally. It's true that organs have multiple overlapping functions, of course, but that's true of the tools that humans create too. It's accurate to characterize the heart as a pump, and it's also accurate to characterize the brain as a computer.
[deleted]
> there is nonetheless a principled gap to be maintained between a thing which may be seen in the light of a certain purpose and a thing which was explicitly designed for that purpose.

Is there? Like, what would that difference be exactly?
[deleted]
I don't disagree that reductionism is necessarily only an approximation, I just contend that it can be a reasonable and accurate one, especially in the case of the brain. Like, it might be more correct to say "the brain is a computer, and also other things too", but that doesn't change the fact that it is a computer. I don't think there's any clean dividing line between things that are tools and things that are not. It is never true that tools do only the things they are designed to do, we just usually don't notice the other things that they do because we have no reason to. The only difference between hammering something with a convenient rock vs hammering it with a chunk of smelted metal vs hammering it with a purpose-built hammer is the degree of effort at reshaping one's environment. Improvisation is a nontrivial act of toolmaking, and all tools are ultimately just an accumulation of accidental discoveries. We happen to be able to understand many of them, and so we can generalize their use after their discovery, but their original discovery itself is rarely an act of what one would call "design". This isn't different from how natural evolution operates: useful structures are found by happenstance and then are maintained and reused, and extended to new use cases when appropriate. The only important difference between human tool making and natural evolution is efficiency; humans can solve some problems faster and with less energy. Evolution is ultimately an optimization algorithm, and as such it does have objectives. Those objectives are emergent rather than intentional, but then, in the big picture, so are ours, we being products of evolution ourselves. Optimization algorithms can and do produce other, better optimization algorithms.
I love how in Chinese, this debate is already over: a computer is a [电脑](https://en.wiktionary.org/wiki/%E9%9B%BB%E8%85%A6).
> but thanks to group theory and whatever we know that matrix multiplication can stand in for literally any function at all

That's provably not true; matrices can only do linear transformations over a given field. That's the "linear" part of linear algebra. You mathematically cannot emulate the function n -> n^2 with just matrix multiplications.
You absolutely can. With a finite field you can represent any number as a finite-dimensional one-hot encoded vector, and then any function at all can be represented as multiplication of a one-hot vector by a permutation matrix. edit: any invertible function, that is. Non-invertible functions would be some other kind of matrix. edit edit: don't forget, function composition itself is a linear operator :P
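A small sketch of the one-hot trick being described, using the very function from the objection (n -> n^2 mod 7, so everything stays finite). Since squaring isn't invertible, the matrix here isn't a permutation matrix, just a 0/1 matrix with a single 1 per column:

```python
import numpy as np

m = 7
F = np.zeros((m, m), dtype=int)
for n in range(m):
    F[(n * n) % m, n] = 1       # column n has its single 1 in row n^2 mod m

def one_hot(n, size=m):
    v = np.zeros(size, dtype=int)
    v[n] = 1
    return v

for n in range(m):
    out = F @ one_hot(n)        # plain matrix-vector multiplication
    assert out.argmax() == (n * n) % m
print("n -> n^2 mod 7 reproduced by a single matvec")
```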
I mean, technically yeah. There probably exists a matrix that can play perfect chess because it essentially functions as a lookup table, mapping a one-hot encoding of the input position to a one-hot encoding of the best move, but this clever observation is useless from an engineering perspective. This is not how neural nets work. Out here in the real world, we can't have as many neurons as there are possible inputs. V0ldek is right -- matrix multiplication alone is insufficient: it's classically known that neural nets based on matmul alone cannot learn functions like XOR. That's *why* neural networks always involve a nonlinear activation function after the matmul, so that they can [*approximate*](https://en.wikipedia.org/wiki/Universal_approximation_theorem) any function without necessarily memorizing its truth table.
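A minimal illustration of that point, with hand-picked (not trained) weights: one hidden layer plus a ReLU computes XOR exactly, while the same weights with the nonlinearity removed collapse to an affine map that cannot.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor = np.array([0, 1, 1, 0])

W1 = np.array([[1.0, 1.0],     # hidden unit 1: x1 + x2
               [1.0, 1.0]])    # hidden unit 2: x1 + x2 - 1 (via the bias below)
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])     # output: h1 - 2*h2

relu = lambda z: np.maximum(z, 0)

with_nonlinearity = relu(X @ W1.T + b1) @ w2
without_nonlinearity = (X @ W1.T + b1) @ w2   # drop the ReLU: purely affine

print(with_nonlinearity)       # [0. 1. 1. 0.]  -- matches XOR
print(without_nonlinearity)    # [2. 1. 1. 0.]  -- affine, can't match XOR
```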
I'm not talking about what people should be doing for function approximation, I'm talking about what they should be doing to interpret the functionality of a pretrained neural network. People who try to do model interpretation by looking at the values of individual weights or individual activation values are missing the forest for the trees: *a matrix vector multiplication is potentially an abstract representation of literally any function whatsoever.* It's entirely plausible that every matvec in a given model does have some kind of sensible, operational meaning in terms of information processing, but identifying what that operation is isn't easy because the search space is potentially very large and the specific operation may be abstracted in a way that isn't easy to identify.

It’s not just matrix multiplication – there’s layernorm, GELU and self-attention in there too.

In some sense it’s “just math”, but in another sense the math has a very interesting emergent behavior. It is very hard to point out where the recipe for gluten-free peanut butter cookies exists in the trained model, or to describe how it is encoded.

Yudkowsky submits that unless or until we fully understand how information is encoded in a large language model, we can’t characterize its behavior – which seems rather silly to me, since we hardly understand the human brain at a similar level, and yet we have no problems characterizing the behavior of other humans.

> we hardly understand the human brain at a similar level, and yet we have no problems characterizing the behavior of other humans.

That's the thing of it. These sorts of "it's just X" complaints always seem a bit silly to me, because we don't have anything like a comprehensive account of how you go from the low-level structures of the human brain to the human experience of consciousness and intelligence. It's certainly the case that whatever human brains are doing is *vastly more efficient* than what ChatGPT is doing, because we are A) much smarter and B) use far fewer resources. But I don't think anyone knows enough to say "the AI is doing this thing and we are doing this other thing" with any kind of confidence.
> But I don't think anyone knows enough to say "the AI is doing this thing and we are doing this other thing" with any kind of confidence.

I'm not even sure these kinds of statements are possible in principle -- it may be that ultimately we're just looking for a Wittgensteinian beetle in a box, and people are drawing wild, unwarranted conclusions about machine intelligence from the observation that we don't systematically understand how it appeared there. If there's a "black box" at work at all, it's just the problem of other minds *per se* and not some world-shaking discovery of a radically new kind of "box".
I was talking about this with my wife the other day, because no amount of posting here can spare her the occasional sneer IRL. She raised the more interesting point that while we can be pretty confident that the current crop of LLMs aren't conscious agents, their capabilities are enough to discredit the traditional Turing test as a means of determining whether we *have* created one. Therefore this could be used as an opportunity to consider and outline the kinds of moral and social frameworks we should have with artificial consciousnesses if and when they exist. I.e. questions like the morality of copying, shutting down, or editing a 'true' AI should probably be treated with some more relevance, as should the question of how not to be assholes to the AI. Of course this is a very different argument than whether or not we're months away from an AGI Cascade that will lead to the extermination of all life on earth.
n.b.: The Turing test was never meant to be definitive as to whether the successful machine *is* a conscious agent, because Turing considered that a philosophical question beyond the scope of the experiment. It isn't asking "can a machine think?", exactly, but more like "can a machine do what humans do when we appear to be thinking (whatever that is), well enough to convince a human?" Whether or not those two actually are synonymous is a question the original thought experiment was aiming more to *raise*, not settle. (tl;dr: your wife has it exactly right)
[Lena](https://qntm.org/mmacevedo).
Well thanks. I didn't know what to do with my evening but I guess I'm staring off into the middle distance in pure existential terror.
> this could be used as an opportunity to consider and outline the kinds of moral and social frameworks we should have with artificial consciousnesses if and when they exist

Like, *maybe.* I personally would prefer that people limit their enthusiasm about such a project, though. That's the sort of topic about which a lot of people are going to have confident, elaborately-considered opinions that amount to nothing more than counting the angels on pinheads, and yet which could also have the emotional valence and intensity of topics like animal cruelty. There's a pretty good chance, I think, that people are going to create something that everyone agrees seems to qualify as "true AI", and that none of our intuitions based on our subjective experiences will apply to it in any way. People should probably hesitate to invent new moral theories when they don't actually understand the thing that they think they're theorizing about.
We don't understand our brains but we posit moral theories all the time because we more or less have to in order to keep society's wheels spinning.
We don't understand our brains well, but we do at least have access to our subjective experiences, which doesn't count for nothing. You can generally believe someone when they say stuff like "you're causing me to feel pain", and you can contextualize it in terms of your own experiences. The same isn't true of AI. Even if/when someone creates one, we won't know anything about its subjective experiences or how they relate to our own, or if it really even has any.
Yeah, I'm being a bit reductive here of course. It *is* totally fascinating to me that GPT can encode so much with mere statistical inferences about which words follow which. That's a very interesting problem space to talk about, but these nerds are skipping straight past that into WhAt Is It ThInKiNg and you know I'm not going to let some nuance stop me from dunking on that.
What gets me is there are all these really fascinating problem spaces *adjacent* to OpenAI's product and business model but the zeitgeist is driven by AI nerds who mostly just ignore them because working productively in these spaces takes a more nuanced and humanistic approach than they're interested in or comfortable with (or capable of, maybe, but I'm trying to be charitable).

[deleted]

Model interpretability is a weird space. Its strongest detractors - the ones who think that neural networks are a baffling mystery - are certainly wrong. But its most enthusiastic proponents are often also wrong, or, in my opinion, deluded. I think there are a lot of people spending their time looking for things that aren't actually there, and I think the paper that you link to might be one of them. It's a genuinely hard thing to do. In my opinion it probably isn't true that these things are always understandable in mundane terms, such as by looking at probability distributions for tokens in intermediate network layers. A higher level of abstraction is needed.
[deleted]
So the question is, "how does the neural network work?". There are a lot of technically-correct answers to that question but there aren't a lot of *good* answers to it. Like, you could trace the execution path of the code that runs it and write that down and that would be kind of an accurate answer. But it would be dissatisfying. What people really want when they ask that question is to have a simple explanation that will give them the ability to understand how the model's behavior will change if circumstances are somehow different. I don't think the paper that you linked to accomplishes that goal. It technically works, but I'm not sure if it's illuminating any useful facts about how the model operates. It might be giving us the illusion of understanding without the actuality of it. I don't think it's ever truly a black box, but I think that the pertinent level of abstraction might be higher than looking at how token probabilities change by layer. Something more useful, in my opinion, would be something like [this paper](https://web.stanford.edu/~yplu/pub/TransformerODE.pdf).
[deleted]
I'm not really disagreeing with you at all about making incremental progress etc, I'm getting at something different. People think that it's either "we don't understand the model" or "we do understand the model", but I'm suggesting that there's an orthogonal third option that goes something like "we have created a story about how the model works that corresponds to things that we can actually observe about it, but that story is basically just folk science that doesn't really have anything to do with the true reasons that the model does what we want (or don't want) it to do". I think it's important to make that distinction because people who are not technical professionals have no idea at all that it exists (even a lot of technical people don't realize this!), and they try to use model explanations for making decisions. It's much better for them to correctly think that they don't know what a model is doing than it is for them to incorrectly think that they do understand.
[deleted]
When someone says "gosh but isn't it concerning that we don't know how it works?", I usually reply with some combination of the following:

* *you* don't know how it works, but I do
* you also don't know how your car works and you never worry about that, right?

I think it's okay to be ignorant - we're all ignorant of something - but it's not okay to be afraid.
I completely agree! It's a strange research space right now. There are a lot of different interpretability methods in use, they have an unfortunate tendency to disagree with each other, and it's difficult to even figure out how to measure how 'good' an interpretability method is because there are so many different qualities that might be relevant. I am hoping for an increase in studies applying the methods to slightly more complex situations and evaluating them - there have been a few studies doing this in the environmental sciences recently.

[deleted]

I try not to be cynical, but it's basically impossible for me not to notice that *AI doomerism is good for OpenAI.* A lot of people - even non-rationalists - seem to find this counterintuitive, though, to the point that they sometimes don't even know what to make of it when I point it out.
It's kind of a characteristically narcissistic thought-pattern, tho. If you can't be the world's great hero, you can at least be the world's great villain. Anything's preferable to being exposed as just another ordinary grifter.
[deleted]
Ehh, I'm not so sure. I mean yes, the 'black box' phrasing can also be a get-out-of-jail-free card for dealing with any model bias rather than (heaven forbid) actually fixing it. But at the same time, the black box behaviour is a real issue and does mean that models can behave unexpectedly as soon as you input anything slightly different from the training data. We've seen a lot of big failures in the use of deep learning models in medicine, environmental science, etc due to this. But to be fair, better/more rigorous evaluation of models would go a long way & model evaluation in academia is worse than in industry, so there might be a bias there
[deleted]
Yes
[deleted]
Very possible that I misunderstood something :) but I did read it! I apologise if my comment came across rudely as your response suggests.

This proves way too much: it’s like saying that you know “what [any] program is doing” once you read the intel chip spec and know the individual CPU operations. Yeah, sure, the operations come from a simple set, applied at scale. But how do these operations in this order lead to the behaviour we see? How could you look at the structure of these operations and identify likely failure risks of the whole system? If/when you do identify a problem, how do you identify a roughly minimal change to the program that fixes it? Saying “they’re matrix multiplications” doesn’t give you any of the control or predictability that we expect from an ‘understood’ technology. No engineer would justify that they “know how the bridge is built” by just writing down Newton’s laws; the need for higher-level explanations here is a real concern.

The thing I come back to is that the *whole point* of modern AI/ML systems is to do things we don't understand. Yes, the folks working on ChatGPT know more than you or I do about how it works. No, it isn't an entirely opaque black box that does things inexplicably. But it is also true that ChatGPT is not simply doing something we understand perfectly, because if we could perfectly understand what ChatGPT does we would simply *write a program that does that* rather than writing a program that generates a program that does it based on a truly enormous amount of input data. No one particularly likes deep learning as a solution to problems. It has all sorts of issues, like requiring huge amounts of compute and data for training, or having non-deterministic outputs, or surfacing weird bits of data bias. It's just that no one has figured out how to make a chatbot that chatbots as well as ChatGPT using anything else.
> whole point of modern AI/ML systems is to do things we don't understand.

Maybe that's true of the people who are using ML in an attempt to create AGI. However, another dominant research direction in ML (especially in unsupervised learning) is to create an algorithm that is able to take a large dataset, to then extract the meaningful content of that data that you're interested in, and to give an, ideally, human-understandable explanation of whatever it is that can be gleaned from the data. For example, giving raw MRI data as an input and having the algorithm extract the meaningful variable of "how likely is this patient to have brain cancer?" and then explain the reasons for that conclusion. Edit: Ok, obviously in my example the algorithm still "does things we don't understand"; however, the goal is somewhat different in that it's to have an algorithm that can explain and discover previously unknown features in our data.
That's precisely the sort of "things we don't understand" I mean. When you say "here are a bunch of MRI scans, these ones are bad, these other ones are good, construct a function that separates the bad ones from the good ones" you are doing that as an alternative to simply writing that function yourself. And, sure, we would like the algorithm to be able to explain itself, because that way we can more easily check for errors like "all the pictures of malignant tumors had rulers measuring their size, so this picture of a guy holding a ruler has cancer in it". But fundamentally, if you understood the things in a MRI that were indicative of cancer, you'd just write a program that checks for those things. Because that doesn't take the time or training data that using ML to solve the problem does.

Well, not exactly; it’s a sequence of matrix multiplications and nonlinear functions, which can approximate whatever function it wants to.

The function it’s approximating is, in the case of GPT-4, the set of human writings it’s trained on.

For it to be intelligent, it would have to be approximating the process that led to the creation of those writings, but that of course cannot be done, because the writings were produced by a very large number of people based on their experiences and interactions with the real world.

Lesswrongers love to talk about brains as “computing” something - well, a very large amount of computing power - not just human but also that of the world we observe - went into the creation of those writings. Enormously greater than what goes into an “AI”.

This isn’t a situation where you’re trying to approximate a function from a bunch of samples and you end up with a somehow equivalent function rather than a lookup table. This is a situation where you end up with a glorified lookup table.

An actual intelligence wouldn’t even be good at the metrics that are being maximized during training.

What is particularly ridiculous about lesswrongers is that they are the most spooked by the most database-ish and least intelligence-ish applications of the neural networks.

generative AI, in general, is having a heyday for the same reason in a broad sense.

We don’t have a good language to explain this even though there is, obviously, some relevant structure. The fact that we’ve been able to identify *some* relevant features in the systems of our language and vision, but have rarely been able to articulate the *system* of those features, is precisely the problem we’re seeing.

We invented the thing without having the language to explain it. This is a relevant and interesting problem inherent to what we’re seeing, but Yud and the like would love us to think that “couldn’t do it before, can’t do it now” == “can never be done, time-traveling super AI destroys us all.”

The collective cognitive dissonance we’re having is precisely because we don’t have a good language to explain what we’re seeing, and it appears to be magic. In the midst of uncertainty, and with a little priming from a scifi doomer, unfortunately, fear is stronger than imagination.

The language to describe these things exists, it's just abstractly mathematical. None of this stuff is really a mystery but it's also inaccessible for most people.
I don't really think that that's it. Obviously it's easy to write down the formulas for how the giant matrix multiplications are arranged on an abstract level. Anyone who's taken a linear algebra class should roughly be able to follow what mathematical operations are applied. However, ask anyone who's trained a deep neural network and they will not be able to answer why, at the end of the training process, the bias at neuron #2312 in layer 15 is 0.2 and not -1. They will not be able to predict what happens to the model output when the first 100 neurons in layer 10 are forced to feed a value of 10 into the next layer. They will not be able to give you the reason why 8 attention heads are probably better than 4. These are not questions that are impossible to answer; however, we don't have any theory that can even get us close to an answer to these sorts of questions.
I think the situation is more nuanced than that. It's true that there's often no simple or obvious explanation for e.g. the specific value of a specific activation function for a specific input, but that's because that is an inappropriate level of abstraction to use for thinking about the problem in the first place. Imagine if you tried to understand how a computer program works by looking at how the value of a specific register in the CPU changes with time; it would be very hard to make any progress. A higher level of abstraction is needed. The value of linear algebra, especially, is that it can be very abstract; [here's a good example of a paper](https://arxiv.org/abs/2209.15430) where concepts from linear algebra are used to understand things about the behavior of neural networks.

For one, AI models aren’t merely doing matrix multiplication. Even in a simple neural network, we compose with functions like the sigmoid function or the ReLU function, which aren’t linear (hence no matrix representation).

This is why it starts to act like a black box: it isn’t a mere composition of linear functions anymore (which would itself be linear, and hence just a matrix). And now comes the real mathematical marvel: if your activation function isn’t a polynomial (e.g. sigmoid, ReLU), then you can approximate any continuous function by composing it with matrices and fine-tuning the parameters of those matrices. This is precisely the content of the celebrated Universal Approximation Theorem.

So we can approximate (locally uniformly) any continuous function (which can be very bizarre, for example the Weierstrass function). So not only are AI models a black box, it is unsurprising that they are one.
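To make the approximation claim concrete, here is a minimal sketch, not the theorem itself: random ReLU features (a matrix multiply followed by the nonlinearity) plus a fitted linear readout, approximating a decidedly non-linear target on [0, 2π]. All sizes and scales here are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)[:, None]
target = np.sin(3 * x).ravel()                 # a non-linear target function

n_hidden = 200
w = rng.normal(scale=2.0, size=n_hidden)       # random slopes
c = rng.uniform(0, 2 * np.pi, size=n_hidden)   # random hinge locations
H = np.maximum(x @ w[None, :] - c * w, 0)      # matrix multiply, then the ReLU nonlinearity

coef, *_ = np.linalg.lstsq(H, target, rcond=None)   # fit only the output weights
print(f"max abs error: {np.max(np.abs(H @ coef - target)):.4f}")   # small, and shrinks as n_hidden grows
```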

„Why are people so confused by the mind and consciousness? It’s literally just chemistry that happens in the brain you idiots“

Sorry but this is an incredibly poorly thought out take. Yes, obviously at the base level what is being carried out is matrix multiplications. But what you’re engaging in is probably the most on-the-nose example of “missing the forest for the trees” I’ve ever encountered.

The human brain is just electrochemistry. We understand electrochemistry. The physical interactions between neurons can be modeled; that is not the hard part of cognitive science.