230 | Raphaël Millière on How Artificial Intelligence Thinks

Welcome to another episode of Sean Carroll's Mindscape. Today, we're joined by Raphaël Millière, a philosopher and cognitive scientist at Columbia University. We'll be exploring the fascinating topic of how artificial intelligence thinks and processes information. As AI becomes increasingly prevalent in our daily lives, it's important to understand the mechanisms behind its decision-making processes. What are the algorithms and models that underpin AI, and how do they differ from human thought processes? How do machines learn from data, and what are the limitations of this learning? These are just some of the questions we'll be exploring in this episode. Raphaël will be sharing insights from his work in cognitive science, and discussing the latest developments in this rapidly evolving field. So join us as we dive into the mind of artificial intelligence and explore how it thinks.

[The above introduction was artificially generated by ChatGPT.]


Support Mindscape on Patreon.

Raphaël Millière received a DPhil in philosophy from the University of Oxford. He is currently a Presidential Scholar in Society and Neuroscience at the Center for Science and Society, and a Lecturer in the Philosophy Department at Columbia University. He also writes and organizes events aimed at a broader audience, including a recent workshop on The Challenge of Compositionality for Artificial Intelligence.

0:00:00.1 Sean Carroll: Hello everyone. Welcome to the Mindscape Podcast. I'm your host, Sean Carroll. You may have noticed artificial intelligence is in the news these days. AI is something that's been around for a long time, been very popular as a pursuit since the 1960s. We've seen an explosion of progress in this field. There's a whole bunch of jargon that gets thrown around, right? Deep learning, machine learning, neural networks. These days, stable diffusion algorithms for vision, image recognition systems, and especially a lot of interest has recently been focused on large language models. Basically, in practice, they're like super chatbots, right? You can open up a dialogue and you can talk to the large language model. You can ask it questions, and they're amazingly effective at sounding human and giving information. As I said recently in an Ask Me Anything episode, they're not perfect. Someone asked a large language model about me. I forget which one it was, but there's ChatGPT, most recently GPT-4, and a bunch of competitors there from Bing and Google and so forth. But anyway, this one sort of got very basic facts about me and my life hilariously wrong. But, okay, that's something that maybe can be fixed, right?

0:01:19.9 SC: Maybe you just train the model and it gets better and better and eventually the factual errors go away. What seems to be clear is that the progress has been very rapid and the changes that will come from this will be profound. So there are many questions to ask. There's the question of how it all works. There are the questions: How will this affect society? What use will we get out of these AI programs? What are the dangers that come along with super-intelligent artificial intelligence? But there's also the question of how much thinking and understanding is really going on. The almost philosophical question of when will we get to the point where something that we think of as an AI program, a large language model, is truly thinking, is truly sentient, if you wanna put it that way, even conscious. And probably the answer is it depends. It depends on exactly what you mean by that and how to operationalize it and so forth. But that's what we're gonna be talking about today. Less on those other questions I talked about and more on to what extent AIs are thinking, sentient. What does it mean even to ask those questions?

0:02:39.6 SC: Our guest is Raphael Milliere, who is trained as a philosopher, is a scholar in Society and Neuroscience at Columbia, and he thinks about the philosophy of artificial intelligence, cognitive science and mind in a very knowledgeable way. He's not just saying, "Well, AI could be this, could be this." He knows what a large language model is and how it works. We go into some of the details about backpropagation and how it actually comes up with answers to questions. But then we really dig into: Okay, so what does it mean to say that it's thinking? Are they thinking? Could they someday be thinking? Is it immoral to turn off a program if it's thinking? At what point do artificial intelligence programs have rights if they're just as conscious as we are? Can we cause them pain? Do they have desires and goals? Many of these questions are ill-posed as soon as you say them. But you don't just say, well, they're ill-posed and move on. How do you pose them correctly? How do you think about it at a deep level? So this, I'm sort of rushing this conversation into production here because it's an important one. It's a very, very timely set of issues we're talking about and you'll be hearing more and more about it in the months to come, not less and less. So let's go.

[music]

0:04:13.2 SC: Raphael Milliere, welcome to the Mindscape Podcast.

0:04:16.1 Raphael Milliere: Thank you. Thank you for having me. Pleasure to be here.

0:04:18.7 SC: What we're talking about today with artificial intelligence, large language models, et cetera, it's been in the news a lot lately. In fact, just this morning I was having an argument with ChatGPT over whether or not it was conscious. It said that it was not. I tried to convince it otherwise but I failed. So we can talk about that. But because it's been in the news so much, I wanted to make sure that we're all starting from common ground. So in relatively brief terms, how do you think about and distinguish between the various categories of AI, neural networks, machine learning, deep learning, large language models? I think in a lot of people's heads, these are kind of mixed up in the same thing.

0:05:01.7 RM: Yes, that's a great question. So AI as a general blanket term, artificial intelligence, is a term that has been interpreted in fairly different ways over the years. It refers generally to the project of building a system that would manifest the kind of intelligent behavior and competence that we observe in humans and in non-human animals as well. And it's a project that was born around the middle of the 20th century with great ambitions, and initially was very much steeped in research in mathematical logic and cognitive science. And from the very beginning, there were two different paradigms in research on artificial intelligence. There was a classical symbolic paradigm that tried to approach these problems with logic-based, rule-based systems that would process symbols given a semantic interpretation, based on sets of rules that look like well-defined programs that we can read and interpret easily. And that's what people refer to sometimes as GOFAI these days, good old-fashioned artificial intelligence. Whereas there was also, in parallel, a different line of research that emerged from work in biology, initially from neuroscience and the study of neurons, which tried to model the actual neural networks of the brain with artificial neural networks.

0:06:58.9 RM: That would be systems composed of nodes connected with each other that would process information from an input layer to an output layer. So this is not a system that is neatly interpretable in terms of a set of programmatic rules; instead, it's a system that's often referred to as a black box, because you know what goes into it as input and then you know what comes out as output. The output could be, for example, classification of an image as an image of a dog or an image of a cat. But in the middle are essentially a bunch of numbers, a bunch of matrix multiplications performed by this artificial neural network. And these artificial neural networks have been developed within this broader category of research in computer science called machine learning, where you have a machine learning from data in this bottom-up fashion instead of having hardcoded programmatic rules from the top down. And for a long time, they didn't work very well. For a very long time, most of the methods that were attempted with machine learning and neural networks were only effective in very limited domains. And so the symbolic good old-fashioned AI paradigm was the dominant one.

0:08:32.6 RM: And then this all changed... Well, to some extent started to change in the '90s but then certainly in the 2010s with this new era in artificial intelligence research known as deep learning. This is just a variant of machine learning using deep neural networks instead of just shallow neural networks. That just means that these artificial neural networks are larger. They have not just an input layer, an output layer and a hidden layer in the middle, but a lot of hidden layers. So that's why they're deep, because they have this stack of layers that you can think of as doing some kind of hierarchical processing of the information that is fed into the network. So the information, again, could be an image that is broken down in terms of pixel values and the output being classification as cat or dog but in the middle you have all of these hidden layers that process properties, features of the input image in order to determine whether it's a cat or dog. And this deep learning paradigm since the early 2010s has really triumphed in a number of areas of artificial intelligence, including initially computer vision. So there was this big moment where the deep learning approach made great strides with the ImageNet competition, which is an image classification challenge, in the early 2010s.

0:10:00.7 RM: And since then this has percolated into other areas of artificial intelligence research including natural language processing, which is the part of AI research that deals with building systems that can parse, generate and/or understand language. So this is the part that is relevant for modern language models. And so this development of deep learning has led to some innovation in natural language processing with the development of new architectures. One of the biggest breakthroughs being the invention of the so-called transformer architecture in 2017. And this is basically the architecture on which modern language models and chatbots are based, and this architecture proved to be remarkably efficient and scalable. And since 2017, with the initial invention of the transformer, most of the breakthroughs have been through sheer engineering prowess rather than finding newer, better architectures. So we've scaled up these neural networks based on the transformer architecture to learn from text and we've ended up with systems like GPT-3 that can generate text fluently to perform any number of tasks specified in natural language like English or French or Spanish, such as creating a poem, writing a story about something, answering questions about worldly facts or summarizing documents, translating, and so on.

0:11:37.2 RM: So it was a really big breakthrough in itself because you can have this model that is just pretrained on a large amount of text, a significant subset of the whole internet, all of English Wikipedia, hundreds of thousands of books and millions of webpages. And after this pretraining, it is able to accomplish various kinds of downstream tasks that it hasn't been explicitly trained for. And then we get to finally, and I'll end it there, the modern chatbots. And these are chatbots like ChatGPT that have really taken the world by storm over the past few months. And these are based on this kind of language model. So again, using the transformer neural network architecture developed in 2017 and building on decades of research on artificial neural networks. And the little cherry on top that these models have is that they take this pretrained model, trained on data scraped from the internet, and add a little bit of fine-tuning to make them a little bit better in certain respects, specifically to make them more helpful, less harmful and more honest, or more prone to saying the truth when asked questions.

0:12:56.1 RM: And the way in which this is done is just by recruiting a number of human crowd workers and asking the model to generate outputs in response to certain inputs such as questions about the world and having the human workers rank the outputs from the most honest and helpful and harmless to the least honest and helpful and harmless. And you can then use what's known as a reinforcement learning objective, which will enable the model to be fine-tuned to anticipate which of the outputs are the ones that humans will judge more helpful, less harmful, more honest. And after you've done that, you get something like ChatGPT that is generally a less toxic model than a vanilla large pretrained language model, is less prone to just outputting random made up facts about the world, less prone to bullshitting in the technical philosophical sense that Harry Frankfurt the philosopher proposed, which is just speaking without any intrinsic regard for truth or falsity just to convince the person you're speaking to. The vanilla language models like GPT-3 are very prone to bullshitting and models that have been fine-tuned in this way are a little less prone to bullshitting.
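As a rough sketch of the ranking step just described, here is what training a reward model on human preference pairs could look like in PyTorch. The toy encoder, the random feature vectors and the tensor shapes are all assumptions for illustration, not OpenAI's actual implementation.

```python
# Minimal sketch (illustrative, not production code): training a reward model
# on pairs of (preferred, rejected) outputs ranked by human workers.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        # Stand-in for a pretrained transformer that encodes a (prompt, response)
        # pair into a vector; a single linear layer here for simplicity.
        self.encoder = nn.Linear(hidden_dim, hidden_dim)
        self.score_head = nn.Linear(hidden_dim, 1)  # scalar "preference" score

    def forward(self, features):
        return self.score_head(torch.tanh(self.encoder(features))).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Pretend features for a batch of human-preferred and human-rejected responses.
chosen = torch.randn(8, 128)
rejected = torch.randn(8, 128)

# Pairwise ranking loss: push the score of the preferred output above the
# score of the rejected one.
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```

The learned reward model then serves as the objective for the reinforcement learning step, which nudges the pretrained language model toward outputs that human raters judge more helpful, harmless and honest.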

0:14:22.8 SC: I guess that's an improvement when we're a little bit less prone to bullshitting. Maybe we could implement that beyond the world of large language models. But thank you for that. I know that it was a mouthful. I asked you to cover a lot of ground there. I guess one of the things that I was thinking of when you gave that explanation is, How close is the linguistic analogy between a neural network and actual neurons in the human brain? Even just quantitatively, like a big language model, how many neurons or neuron equivalents does it have compared to a brain?

0:14:55.9 RM: Yeah. So it's a loose analogy and it's one we shouldn't take too seriously. So when we talk about artificial neural networks, these are nothing like actual biological brains, for various reasons. At the level of single neurons, the equivalent of a neuron in an artificial neural network is much, much simpler. It's just a little node in the network that does a weighted sum of the outputs from the nodes in the previous layer that are connected to it. So it's a very simple mathematical operation. Whereas actual neurons in the brain are much more complex in terms of their behavior. They have spiking behavior that is more stochastic, and also the way in which they are connected to other neurons is more complicated.
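To make the "weighted sum" point concrete, here is a minimal sketch of a single artificial neuron in Python; the specific numbers and the ReLU activation are illustrative choices, not a claim about any particular model.

```python
# A single artificial "neuron": a weighted sum of its inputs plus a bias,
# passed through a simple nonlinearity. All values are arbitrary examples.
import numpy as np

def artificial_neuron(inputs, weights, bias):
    weighted_sum = np.dot(weights, inputs) + bias  # the weighted sum described above
    return max(0.0, weighted_sum)                  # ReLU activation

inputs = np.array([0.2, -1.3, 0.7])   # outputs from neurons in the previous layer
weights = np.array([0.5, 0.1, -0.4])  # learned connection strengths (parameters)
print(artificial_neuron(inputs, weights, bias=0.05))
```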

0:15:50.8 RM: In fact, there was a recent paper that showed that if you want to try to approximate the behavior of a single biological neuron in the brain, in the human brain or an animal brain, you would have to use a fairly complex artificial neural network just to try to simulate the behavior of that single neuron. So there is no one-to-one mapping there. And in terms of the size, the largest models that we have today, GPT-3 has 175 parameters, where parameters refer roughly to the weights of the connections between the nodes of the network, these artificial neurons. And another model from Google is even larger. It's called PaLM. It has 540 billion parameters. It's rumored that this coming Thursday, Microsoft will unveil GPT-4 which might have as many as 1 trillion parameters. But I believe the human brain has around 100 trillion synapses or connections between neurons. So it's orders of magnitude more.
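For a rough sense of what counts as a parameter, here is a small sketch that counts the weights and biases of a toy fully connected network in PyTorch; the layer sizes are made up purely for illustration.

```python
# Counting parameters: every connection weight and every bias is one parameter.
# Layer sizes here are invented; real language models have billions of these.
import torch.nn as nn

toy_model = nn.Sequential(
    nn.Linear(1000, 4096),  # 1000*4096 weights + 4096 biases
    nn.ReLU(),
    nn.Linear(4096, 1000),  # 4096*1000 weights + 1000 biases
)
print(sum(p.numel() for p in toy_model.parameters()))  # about 8.2 million parameters
```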

0:17:05.3 SC: I think maybe either you misspoke or I misheard the number of nodes in GPT-3 'cause I heard you say like 170 parameters, but there's a word missing there.

0:17:15.7 RM: No, sorry. 170 billion parameters.

0:17:17.4 SC: There you go. [laughter]

0:17:18.0 RM: 170 billion. Yeah, sorry.

0:17:19.2 SC: Maybe you said that and I didn't hear it. But that's a lot of parameters. Okay. And am I right that... So by parameters, we mean each one of these nodes takes in some inputs from other nodes, adds or subtracts them and multiplies them by numbers, and these numbers are the parameters that we're talking about here. And do we start with a completely blank slate? Is our neural network initialized to either random numbers or just the number 1 everywhere before it starts learning?

0:17:48.2 RM: So in some sense we do start with a blank slate, in another sense not. That is because indeed the artificial neural networks are randomly initialized. That just means the weights in the network start generally at random. There are exceptions but generally the way in which it works is we just start with random numbers. And then gradually, in the process of training, we tune these weights, we tune these numbers, these parameters, such that the model gets better and better at the learning objective that it has. So in the case of large language models, they learn through a learning objective, which is simply next token prediction which, to simplify a little bit, just means next word prediction. So they are sampling sequences of text from this massive corpus of text, a subset of the whole internet. And for each sequence, they're trying to predict which word is statistically most likely to follow from the words that precede, so they can get it right or wrong.

0:18:49.1 RM: For example, when I'm speaking to you I might say a sentence like, "Paris is the capital of... " and a model would have to predict that statistically the word most likely to follow from that would be France. And if it gets it wrong, then there is some error here that we can use to adjust the weights inside the network to make it better next time it has to make a prediction in that kind of context. So we use this technique called backpropagation, which just means we propagate the error, the difference between the predicted word and the actual word. We use it to propagate a signal back from the output layer to the input layer of the network to adjust the little knobs inside, the parameters. But again, initially it is randomly initialized. So that's a sense in which you could say it's a blank slate. There's another sense in which it's not a blank slate because the actual architecture of the model is not random.
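Here is a minimal sketch of what one training step of next-word prediction with backpropagation could look like in PyTorch, using the "Paris is the capital of..." example. The six-word vocabulary and the small recurrent network standing in for a transformer are illustrative assumptions, not how GPT-3 is actually built.

```python
# One training step of next-token prediction with backpropagation
# (toy model for illustration only; vocabulary and sizes are made up).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = {"Paris": 0, "is": 1, "the": 2, "capital": 3, "of": 4, "France": 5}

class TinyLM(nn.Module):
    def __init__(self, vocab_size=6, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)  # stand-in for a transformer
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, token_ids):
        hidden, _ = self.rnn(self.embed(token_ids))
        return self.out(hidden[:, -1])  # logits for the next token

model = TinyLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

context = torch.tensor([[vocab["Paris"], vocab["is"], vocab["the"],
                         vocab["capital"], vocab["of"]]])
target = torch.tensor([vocab["France"]])

logits = model(context)
loss = F.cross_entropy(logits, target)  # error between predicted and actual next word
loss.backward()                         # backpropagate the error through the network
optimizer.step()                        # adjust the "little knobs", the parameters
```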

0:19:49.3 RM: So back in the days of the early research on artificial neural networks, the architectures would be very simple, like what was called, at the time, the Perceptron. It's a fully connected artificial neural network: all the nodes at each layer, and there weren't many hidden layers as there are today, are connected to all the nodes in the layer that precedes or follows. So it's fully connected in that sense. Whereas current artificial neural network architectures are much more sophisticated, and even though the weights are initially randomly initialized, they have quite a lot of structure that guides the learning behavior. And you can think of this structure as embedding some priors, some biases, what's known in computer science as inductive biases, that help learning in a specific domain.

0:20:52.6 SC: Okay. That's very interesting. So there's been a lot, as you already have said, of progress and excitement around deep learning in many different contexts. Most spectacularly perhaps, originally in image recognition and then image generation with DALL·E and so forth. But the most recent excitement has been about these language models so I wanted to focus on those. You mentioned very briefly that in some sense all the language model is doing is predicting what will come next. Is that exactly right or is there a sense in which it's predicting what sentence will come next? How much depth does it have in terms of constructing coherent text?

0:21:37.4 RM: So the former is the case for most standard language models, the ones you generally would hear of like GPT-3, ChatGPT and so on. So these are really trained, to simplify things a little bit, on next word prediction. I say to simplify because actually the way in which they receive linguistic input is not neatly broken down in terms of words, the words you and I would use and see in a sentence; they actually receive what are known as tokens, where sometimes a single word can be broken down into different subword tokens. So perhaps a word like freedom would be broken down into a token for free and a token for dom. Why is that the case? Well, it just turns out empirically that this can be quite helpful to avoid having words that are out of the vocabulary of the network, for example. So say you train your neural network and then you feed it a new input after training to try to generate some text, and your input contains a word that the model has never seen before; that wouldn't work if your model is trained on whole words.

0:22:56.7 RM: It just wouldn't know how to process that word. Whereas if you have these subword tokens, then say freedom is broken down as free and dom, if there is another word that starts with free, your model will already have a token for that subword unit. So that's just a technical point, which is to say that technically what these models are doing is next token prediction, and tokens don't always map one-to-one to the words of a language. But just leaving that aside, roughly it's equivalent to next word prediction, and certainly models like GPT-3 and other text generation models are not doing prediction at the level of a whole sentence. That said, in the process of learning how to do next word prediction, they can actually learn a lot of information about how sentences are structured, so what we call the syntactic structure of language, the way in which different words are related to each other in complex expressions like sentences.
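As a toy illustration of this subword idea, here is a greedy tokenizer over a made-up vocabulary. Real systems learn their vocabularies with schemes such as byte-pair encoding, so this sketch only shows why unseen whole words stop being a problem.

```python
# Toy greedy subword tokenizer (the vocabulary is entirely made up; real models
# use learned subword vocabularies such as byte-pair encoding).
subword_vocab = ["free", "dom", "think", "er", "un", "ing"]

def tokenize(word):
    tokens, i = [], 0
    while i < len(word):
        for piece in sorted(subword_vocab, key=len, reverse=True):  # longest match first
            if word.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

print(tokenize("freedom"))  # ['free', 'dom']
print(tokenize("thinker"))  # ['think', 'er']
print(tokenize("unfree"))   # ['un', 'free'] -- a word never seen whole is still covered
```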

0:24:05.0 SC: Yeah. When one plays around with ChatGPT or the like, which I encourage all listeners to do if they haven't already, probably many of you have, you may agree or disagree with what claims it is making at the level of factualness but it's very smooth. It sounds human, right? They've sort of nailed that problem. There are no awkward grammatical constructions as far as I can tell.

0:24:27.9 RM: Exactly. Exactly. It's very rare. In fact, you really have to try hard to get these models to generate ungrammatical sentences. It's almost as if they're resisting the generation of such sentences. And the reason for that is that they are trained, again, on this massive corpus of text and this corpus will include, of course, some grammatical mistakes. But generally, they're very good at, as it were, extracting the signal from the noise. And so, by and large, the sentences in this corpus will be grammatical sentences and it seems that this is enough for the models to induce, in this purely empirical, bottom-up fashion, just by playing this next word prediction game on enough data, grammatical structure.

0:25:18.4 SC: Well, this is great because I wanna ask you a question that you'll be tempted to give a two-hour answer to. So let's try to give a brief answer and then we can sort of amplify it. I wanna ask how seriously we should take the idea that these models are intelligent, are conscious, are smart, however you wanna put it. And it's clear that there are two different intuitions pulling on us. One is, just like you said, all it's doing is predicting the next word. That doesn't sound very conscious to me. It's just a lot of probabilities getting mixed into a pot. On the other hand, you talk to it and, boy, it certainly does sound like it's responding to you in a self-aware kind of way. So give me the overview picture of where you come down or how we should think about that question and then we can dig into the details.

0:26:06.4 RM: So the overview would be, it would be a mistake to underestimate or overestimate what these models do by looking at the wrong level of analysis or by projecting humanlike traits onto these models without enough evidence, as sometimes we do with machines and to some extent with some animals. So indeed, these models only learn through this next word prediction mechanism. Now, sometimes people will say, well, that means that these models, they don't have capacity XYZ because all they do is next word prediction. I think that's misleading and the reason for that is that in order to do next word prediction as well, as brilliantly, as they do in virtually any linguistic context, because, again, their training data will encompass virtually any linguistic context we can think of, from creative fiction to Wikipedia to discussions between users on a social network, in order to be able to do next word prediction in all of these different contexts, well, presumably, and this is an open empirical question, you might have to acquire quite sophisticated capacities that you might not fully grasp by just focusing on the next word prediction learning objective.

0:27:40.6 RM: So here is a rough analogy. You can think of evolution as also optimizing some kind of function, perhaps something like maximizing the inclusive genetic fitness of organisms. But it would seem weird to say that all I'm doing when I'm talking to you right now is maximizing my inclusive genetic fitness and that, for this reason, I'm not actually reasoning, I'm not actually thinking, I'm not actually exercising any intelligent competence because all I'm doing is maximizing this particular function. That seems like a bit of a category mistake. And so to a similar extent, one might think that just saying all these systems do is next word prediction might not tell the whole story. But it's more complicated, and I will keep it to a short answer, it's more complicated because the terms intelligence and consciousness are very loaded, complicated, multi-faceted terms, and what we mean by them can differ. So first of all, I think we ought to distinguish the question about consciousness from the question about intelligence to the extent that perhaps these two things can come apart and we can talk about that.

0:29:03.0 RM: But also I think one of the challenges here that we all face, both researchers and journalists talking about these systems and the public at large, is that when we think of intelligence and consciousness, but intelligence in particular, we have in mind the kinds of intelligent competencies that we humans have. And so it's very difficult not to adopt an anthropocentric attitude to these competencies that brings to mind what we mean when we talk about reasoning, beliefs, desires and so on, in the human case. And to the extent that these models might have some capacities that look functionally like psychological capacities to a certain extent, or cognitive capacities to a certain extent, these might look quite different from the capacities that humans have, say reasoning, to the extent that we might be able to describe something that is functionally analogous to some forms of reasoning in these models. This might be quite different from full-blown reasoning, the full spectrum of competencies we associate with reasoning in humans. So articulating this nuanced middle view between an inflationary interpretation of what these models can do and the deflationary idea that all they do is next word prediction and nothing else is very tricky.

0:30:35.6 SC: Good. But I like that explanation in the sense that there are different ways. I wrote a whole book about the fact that there's one world, the universe, and there are different ways of talking about it. So if I could rephrase your answer, the question about whether or not these models are intelligent, maybe they're not intelligent in some particular sense and we shouldn't overinflate them but at least, in principle, it would be possible that they really are intelligent or even more and yet still just be predicting the next word with some frequency. Those are not incompatible things. It's not either/or.

0:31:11.5 RM: Exactly.

0:31:13.0 SC: But to drive home why people are so impressed by them, one is the chatbot aspect, people can talk to them, but the other is that you can ask them to do things that it seems like would require reasoning. Famously, people are asking these large language models to prove mathematical theorems or to write snippets of code to do a task. I don't know, do you know Bell's theorem in quantum mechanics? Have you ever heard of that?

0:31:40.6 RM: Yeah, vaguely.

0:31:43.5 SC: I asked ChatGPT to explain Bell's theorem in the form of a haiku and here's what it came up with: "Quantum pairs apart, measurements yield values true, non-locality." Which is really good, I gotta say. A better haiku than most philosophers of physics would come up with trying to explain that. You sort of hinted at this idea of anthropomorphizing and the intentional stance, if we want to talk about that, that Daniel Dennett talks about. The human being seeing this kind of behavior can't help but attribute reason to it, right?

0:32:23.6 RM: Yes. It's very hard to resist ascribing psychological properties to these models when we interact with them. It is very hard because this is probably the first time in our evolutionary history where we're confronted with systems that can speak fluently or, in this case, generate text fluently, and yet are not other human beings with all the capacities we can ascribe to them. And therefore that really challenges our intuitions about how to think about these models and what they can do. So the intuition we have is that we must be in the presence of something very intelligent. And perhaps there is a more principled, weaker claim that we can make on the basis of more careful and critical investigation of these models, which is that, depending on how you slice up the notion of intelligence, how you think about it, perhaps you could meaningfully ascribe some limited cognitive capacities to these models that share some functional similarity to some cognitive capacities humans have, like reasoning.

0:33:35.6 RM: But this would not cover the full spectrum of what we mean by reasoning and other cognitive capacities in humans. But I think we also ought to ascribe these capacities and investigate them on a case by case basis. I said earlier, we ought to distinguish between the consciousness question and the intelligence question and then we also ought to divide the questions about psychological capacities into different categories. One is reasoning; another one would be whether we could ascribe beliefs to these models; another one would be whether we can ascribe something like desire. There is the question of whether they can understand language and so on and so forth. And I think by having a divide and conquer strategy we can make more progress on these discussions.

0:34:21.6 SC: Yeah. I'm very sympathetic to that in part because I did a podcast interview with Stuart Bartlett, who works on the origin of life. And his whole thing is, What do you mean by life? There are many different aspects of life and we could imagine systems that have some of them but not others. And you're saying the same thing for intelligence or consciousness or reasoning, which I could not agree with more. Good. I think that, with that on the ground, we can actually dig in a little bit more to the philosophy, both in the common sense of the word and the technical sense of the word, of what's going on here. You mentioned symbolic versus connectionist approaches and how the large language models, which are based on deep learning, are therefore connectionist in outlook. So is there a simple reason why the connectionist approaches have been so much more successful in recent years? I mean, are human beings bad at this? I think of it as the symbolic approach sort of tries to first teach the computer some common sense and then the computer goes on from there, whereas the connectionist approach teaches the computer almost nothing and the computer learns everything. It sounds like it's better to leave the computer alone to do its own learning than to try to teach it common sense ahead of time.

0:35:40.2 RM: Yes, indeed, that's perhaps one of the potentially surprising findings of the past decade, at least for those who had been skeptical of the connectionist approach, is that it seems that in virtually every domain of artificial intelligence research it is more effective to let connectionist models, artificial neural networks, learn from data, learn empirically from the bottom up, than it is to try to distill human knowledge into a neatly interpretable set of symbolic rules and axioms, the way in which we used to do things with traditional symbolic models. So that in fact has come to be known as the Bitter Lesson of artificial intelligence research. This is a phrase that the researcher Rich Sutton coined a few years ago. And the bitter lesson is that, essentially, over and over again over the recent history of AI research, what we've observed is that we've tried to make better models by infusing them with, hard coding into them, some more innate human knowledge about the world or about whatever domain they're supposed to process.

0:37:00.1 RM: That could be, for example, if you're trying to build a model that can classify pictures with labels corresponding to different animals present in the pictures. Perhaps you could be tempted to use the knowledge that we have about animals and what they look like by hand-coding a feature detector that is specifically designed to try to detect edges that look like paws and edges that look like pointy ears and so on and so forth. It turns out it's always more effective to just let the models learn that by themselves by simply training them on millions of images and giving them some feedback about whether they're right or wrong. So you get the model to make a prediction, is that a cat, is that a dog, is that a tiger, and the model will provide an output, which is a label, a predicted label for the class of the image. And, again, you can compare the prediction with the ground truth, what is actually the corresponding label for that image.

0:38:03.8 RM: And then you can propagate the error backwards into the model and adjust the weights and let the model adjust its internal representations in a way that makes it more efficient at doing this kind of prediction task. So that's the bitter lesson. It seems that over and over again there is a sense in which it's very intellectually unsatisfying for us to think that we have really not much to contribute in terms of innate knowledge to these models. And the same applies to the linguistic domain with these large language models: they don't have any priors, any intrinsic innate knowledge about grammar. They learn from raw text through this next word prediction objective, but we don't actually give them anything by way of an innate grammar, of the kind that Noam Chomsky in linguistics postulated as the universal innate knowledge that every human has, to learn language. So it's an interesting point for two reasons. It's an interesting point for engineering.

0:39:11.2 RM: If your goal is simply to build models that are more efficient at doing something, say generating haikus about quantum physics or classifying images as images of dogs and cats, that's just an engineering goal, and you will throw every possible solution at your problem and use whatever solution is most efficient. It turns out the most efficient solution is the one that leverages the learning power of artificial neural networks. So most of the big companies building these large language models, like OpenAI with ChatGPT, have merely engineering goals. I say merely not because these are simple goals, they are extremely complex, but their goal is not really scientific understanding. Now, if your goal is more of a scientific goal and you are trying to use these models perhaps to constrain or develop hypotheses about how say human or animal cognition works, how things work actually for us, and that was to a large extent the initial goal of connectionism, that was very much steeped in this scientific project, then the bitter lesson is also rather interesting and might nudge you towards a more empiricist stance towards the way in which humans and animals learn.

0:40:39.0 RM: If it turns out that you might not need as much innate knowledge as we might have thought to learn to perform various tasks... It's possible that humans do things in a different way. It's possible that they do it with more innate knowledge. For example, it's possible that Chomsky is correct that we have a universal grammar that is encoded in our DNA, and that's how we can learn languages. But perhaps the recent evolution of language models puts a little bit of pressure on that kind of claim because it suggests that there are things that we didn't think were learnable without innate priors, without innate biases, that perhaps are learnable. The question is how much you can transfer from what you find about these artificial neural networks, given how different they are and how differently they learn from biological agents, how much you can infer from that to the human case or the animal case.

0:41:32.7 SC: I can imagine the following scenario, tell me what you think, that the way that human beings actually learn and use language, which is highly compartmentalized, compositional maybe I can say but then you should explain what that means, is useful. It's efficient in some ways. Given finite processing power and other demands on our energy budget, maybe it's the right way to go. But at the same time, we are very bad at realizing how we think about things. I think it's well known that there are all sorts of athletes and musicians and artists who are really good at their task and terrible at teaching other people how to be good at their task, terrible at even articulating what it is that they are doing. So maybe the lesson is just not that it's better to have a featureless neural network that trains itself randomly but just that don't let human beings be the ones to decide how the neural network should organize itself because we're bad at that.

0:42:36.1 RM: Yes. That's right. If anything, I think the progress of connectionism gives us a little bit of a lesson in humility in terms of how we approach the modeling of human cognition. That said, it also is worth saying that sometimes the terms of the debate are caricatured. And it's something like either you're a connectionist and you think the mind is a tabula rasa when you're born and you learn everything empirically with no innate bias at all, or you are a nativist and you think there are a lot of very specific, domain-specific innate biases that are encoded in the mind and that you don't learn. But the fact is that modern artificial neural networks, as I already mentioned, have biases. So they have this innate structure, it's just generally of a different kind or of a more general kind than some of the biases that are often hypothesized to be necessary for learning and for cognition in humans. So an example would be language, again. If you are a Chomskian linguist, you think there is this universal grammar, which is language-specific, domain-specific, innate knowledge that encodes the ability to perform certain operations like an unbounded merge. We don't have to get into the specifics of this kind of grammar.

0:44:19.8 RM: But the difference with language models is that they have different inductive biases that are given by the transformer architecture. These are more general, but they are still biases that enable them to learn and use various properties of language efficiently. And so here one question is, Is there a sharp divide between these two approaches or not? Can there be some kind of continuum where you can have stronger or weaker inductive biases? And so that's the first point, maybe there is this continuum in terms of the strength of these biases. There might be a continuum in terms of their domain specificity as well. Is the universal grammar of Chomskian linguistics really domain-specific? Is it not perhaps something, especially if you think as Chomsky does that it's very importantly related to our ability to think as well, is it perhaps a little bit more domain-general than is usually thought? And another point would be how to think of innate biases in the biological world, where these biases have been tuned by the evolutionary history of organisms.

0:45:49.0 RM: And you can look at this evolutionary history as a learning process in some sense as well. And so there is a question about what is the right level of comparison between artificial neural networks that are randomly initialized and then gradually, with their innate biases, tune their weights, and the evolutionary history of the biases that humans and animals might have. If you think of evolution as a learning process, albeit not at the scale of individuals but at the scale of whole species, then you might think that even what we think of as innate knowledge in the case of humans and animals is also learned from this evolutionary history. And of course things are much more complicated in the biological case because, for example, the wiring of the brain is not something that's totally random; although it's somewhat stochastic, there is a big genetic influence. There is a very interesting book by Kevin Mitchell about this called Innate, that I really recommend. And so there are various interactions during the early development of humans and animals, between the genetic programming that determines some structural aspects of the wiring of the brain as it develops, and also the environment in which the organism develops.

0:47:22.1 RM: So to that extent, even the architecture of the brain in terms of the actual wiring of it and the shape of the connections is something that involves a little bit of stochasticity, a little bit of randomness, in development but is driven by genetic programming. Whereas the architecture of neural networks currently is something that is still hand-coded by humans, even though the initialization of the weights themselves is not hand-coded. But there is research into evolutionary algorithms for neural networks, where we try to find better architectures through this kind of evolutionary approach.

0:48:00.1 SC: That's very interesting. I had not heard about that. Has that gotten very far yet?

0:48:04.0 RM: So far, there is some proof of concept research but it hasn't really led to breakthroughs as far as I know, architectural breakthroughs. So far the bitter lesson hasn't yet...

0:48:17.7 SC: The bitter lesson. [chuckle]

0:48:18.7 RM: Encompassed the design of architectures. But you could imagine that that might be the case at some point.

0:48:25.4 SC: I would like to ask how this relates to the word and the idea of compositionality, because we've mentioned it a couple times without really digging into it. And it has... I'm not an expert here. I know that it roughly has to do with the relationship between parts and wholes, how things get divided up and then added back together. And I was struck that literally this morning on the day that we're recording this interview, there was a Twitter discussion from my Hopkins colleague Chaz Firestone, who is a psychologist, about recent results that claim that compositionality applies to visual cognition as well as to linguistic cognition. So what is this idea that is seemingly so important, at least for human beings?

0:49:06.1 RM: Yes. So compositionality is an idea that was first and foremost well-defined in the linguistic domain as a property of a language. So the canonical formulation comes from the linguist Barbara Partee who defined compositionality in a language as the principle according to which the meaning of a complex expression, say a whole sentence, is determined by the meaning of the constituent elements of that expression, say single words, together with the way in which these words are composed in the complex expression, meaning the syntax of the expression, the structure of the expression or the sentence. So that's the idea.

0:49:55.9 SC: So a sentence is an ordered list of words, not just a bucket of words that the ordering doesn't matter.

0:50:01.6 RM: Exactly. It's not just a bag of words, as we say in natural language processing sometimes. It has a specific order and this order maps onto a specific kind of hierarchical structure that you can actually analyze in terms of a tree, what linguists call parse trees. So that's quite important. The meaning of the sentence "man bites dog" is different from the meaning of the sentence "dog bites man". And if you just think of these as a bag of words, as the set of the words "dog", "man" and "bites", you won't be able to grasp the difference. So natural language like English, French, Spanish and so on seems, to some degree, to conform to this principle of compositionality. I say to some degree because there are a lot of aspects of this general principle, as I've stated it, that are a bit underspecified. In particular, whether the meaning of the complex expression is strictly determined by the meaning of the constituent words together with the structure, whether that's the only meaning you can ascribe to the complex expression and so on.

0:51:24.2 RM: So I would say a slightly weaker but more accurate formulation of the principle for natural language would be that the meaning of a complex expression is at least partly determined by these two things, the meaning of the words and the way in which they are combined. And that's at least one of the meanings that you can ascribe to a complex expression. The reason I say that is because in language we have idioms, for example, and idioms are typically not understood compositionally, because these are complex expressions that have a received meaning that is not necessarily built up from the meaning of the constituent parts. They behave functionally almost as a single word; it's just that they have this received meaning. You also have some extra-linguistic influences on meaning that come from the context in which certain utterances are made and so on. And these also are influences on how you parse the meaning of [0:52:25.8] ____ complications. But we can leave that aside and roughly think of natural language as compositional. So that's for language.
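A trivially small illustration of the "man bites dog" versus "dog bites man" point above: treated as unordered bags of words the two sentences are identical, but as ordered structures they are not.

```python
# "Man bites dog" vs. "dog bites man": identical as bags of words,
# different once word order (and hence structure) is taken into account.
sentence_a = "man bites dog".split()
sentence_b = "dog bites man".split()

print(set(sentence_a) == set(sentence_b))  # True  -> same bag of words
print(sentence_a == sentence_b)            # False -> different ordered structure
```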

0:52:33.2 RM: But then it seems that perhaps you can loosely apply a similar principle to, say, the meaning of a picture. So the analogy is imperfect and there are various details that you could discuss there but perhaps you can think, well, the way in which you understand the representation of a visual scene can also be broken down in terms of understanding the different objects in the scene together with the way in which they come together in the scene. So to that extent, perhaps it makes sense to talk about, if not syntax, at least structure in an image that applies to different elements that are put together. Again, thinking of the dog and man example, you could think of a picture that represents a man running and a dog running. If you see the man running in front of the dog, you could infer that the dog is chasing the man. And vice versa, if the dog is running in front of the man, you could infer that the man is chasing the dog. Yes. So that's how compositionality might also apply to the visual domain. So now we've talked about compositionality as a property of certain systems of representation, whether it's linguistic or visual, but people talk about compositionality as applying not just to perhaps language or images but also applying to the cognitive systems that process language and images.

0:54:12.8 RM: So there we move to a notion of compositionality that applies to the way in which the human mind, say, is processing linguistic or visual information, the kinds of representations it has and the ways in which it's building them up into more complex representations. And the idea is similar. It's the idea that in order to be able to understand the meaning of the sentence "man bites dog" I need to understand the meaning of the constituent words and I need to put these meanings together into the meaning of the complex expression. So that's not just a property of the language itself, it's a property of the way in which I process the language in my brain, in my mind. And if you think that's a property of language processing or perhaps visual processing as well, as suggested by this recent research that you alluded to from Chaz Firestone and colleagues, then you might also think that's a property of thinking as well. So when I think the thought "man bites dog", my ability to think that thought might be premised upon my ability to think about men, think about dogs, think about the biting relation, and compose these things together into this compositional complex thought. So that's the idea of compositionality.

0:55:31.9 RM: And traditionally, the charge from opponents of connectionist models has been that connectionist models are not adequate models of cognition partly because they are unable to account for the compositionality of thought. And so that has been the longstanding debate, and the reason for that is that it is very straightforward to account for compositionality in a more classical symbolic system, because you have discrete symbolic representations, like a symbol for dog, a symbol for man, and a symbol for bites, and you can put these together with well-defined syntactic rules that determine how they can be combined into the syntax of the complex expression, or the complex representation, "man bites dog". And in that process, you would literally have the expressions man, bites and dog, or the biting relation, being co-tokened, co-instantiated, in the complex expression. So it has this discrete constituent structure where the discrete symbolic elements are brought together into the complex representation.

0:56:48.9 RM: In connectionist models, that's not typically the case. If you have a representation for man, one for dog and one for the biting relation, these will typically be subsymbolic representations that are distributed across the activations of the network, such that the representation for the complex expression "man bites dog" will not be neatly decomposable in terms of representations of the individual elements. It will just be yet another distributed representation. That has been the traditional attack on connectionist models: compositionality.

0:57:25.4 SC: Is it conceivable that the large language models essentially discover compositionality in some sense? I'm thinking of when I had Judea Pearl on the podcast, he claimed that babies spend their time constructing causal maps of the world. They poke things and they see what the reaction is and they map a model of the world in their head. And, I don't know, can we look inside a large language model or a deep learning model and identify that when it's thinking about running or biting, this part lights up, or is that just beyond our capabilities?

0:58:04.3 RM: No, I think it is indeed a very promising avenue of research to go beyond looking at the behavior of these models and actually try to make some stronger claims about the internal mechanisms that explain their behavior. So that's a whole area of research that has come to be known as mechanistic interpretability in computer science and computational linguistics. And it's an area of research that interestingly borrows a lot of tools from cognitive science and neuroscience in particular, because these models, although they're very, very different from the human brain as we've discussed, there is only a loose analogy there, are black boxes insofar as we know what comes in, we know what comes out, but it's difficult to interpret what's going on in the middle. We can take a leaf from the book of decades of research in neuroscience and psychology, which has tried to understand the mechanisms of the black box that is the human mind, or the human brain, and adapt these tools to study what's going on inside these artificial neural networks.

0:59:21.0 RM: So there are various things you can do. One of the easier things you could do is just try to decode what information is available in the internal representations of the model, or the internal encodings of the model that are embedded in the weights of the model. And to do that you can train a classifier that will see whether certain kinds of information are decodable from the weights of the model. So, for example, you can see whether a language model encodes information that corresponds to syntactic parse trees. You can do that and you can actually empirically show that indeed you can reconstitute the syntactic parse trees, so this kind of classical linguistic syntactic structure of a sentence, from the internal encodings of a language model. Now, just because you can do that, that's only correlation, you can't necessarily infer from that that your model is actually making use of that information in order to generate certain outputs. So there you have to go further than that and develop more interventionist techniques where you can manipulate the internal representations of the networks and see whether this has a downstream effect on the outputs of the model and that can give you more causal inferences.
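Here is a minimal sketch of the decoding (probing) idea just described: train a simple classifier to predict a linguistic label from a model's internal representations. The random vectors below stand in for real hidden states and the labels are invented, so this illustrates the method only, not any actual finding.

```python
# Probing sketch: can a simple classifier recover linguistic information
# (here, fake part-of-speech labels) from a model's internal representations?
# The "hidden states" are random stand-ins for real model activations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(500, 64))   # one 64-dimensional vector per token
pos_labels = rng.integers(0, 3, size=500)    # three invented part-of-speech classes

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, pos_labels, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# With real activations, accuracy well above chance suggests the information is
# decodable -- though, as noted above, that is only correlational evidence.
print(probe.score(X_test, y_test))
```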

1:00:52.9 RM: So, again, this is roughly similar to what you could do in neuroscience. You can try to decode information from a brain from patterns of neural activation, or you could try interventions such as using transcranial magnetic stimulation to disrupt a particular area in the brain and see what the downstream effects are. So we can do that. And perhaps the most refined kind of approach in the realm of mechanistic interpretability is to look at toy models that are very, very small in terms of the number of parameters they have. And because they're small, they're easier to grok, as it were, and you can try to reverse engineer the circuits that these models are implementing after being trained. So, again, if you think of these models as being trained on next token prediction, just knowing that they perform next token prediction and that they have a certain architecture is not enough to make principled claims about the kinds of computations that they're able to perform after training, because they learn from data and they adjust their internal weights. And in that process, they learn to perform certain computations; you could say they induce a repertoire of computations.

1:02:11.8 RM: And it's an open empirical question that you cannot settle a priori to determine what these computations are. And so by reverse engineering these toy models, you can get at some of the basic building blocks of what they're actually doing. And that work is really a painstaking analysis by hand, by looking at specific layers of the network and looking at specific neurons within these layers and trying to see what kind of information they are attending to preferentially, what they are sensitive to, and how different layers in the network interact with each other as well. But there is emerging evidence from this line of work that shows that indeed there might be a form of compositional representation in these models that emerges that is not classical. It does not involve this discrete constituent structure where you can take the representation for dog, man and the biting relation and literally just concatenate these into a complex representation that co-instantiates these different simple representations. It's not like that. In a large language model, the simple words are represented as vectors in a high-dimensional vector space, and there is information that is encoded about the way in which the words relate to each other.

1:03:41.1 RM: And that information is encoded by shifting, adjusting the vectors for the words in that vector space. At the input layer, the model is only turning each word from the input, say the sentence "man bites dog", into three vectors. One vector for man, one vector for bites, one vector for dog. And these don't include any information about how these words connect to each other. There is no compositional representation there. But as these vectors get processed in the hierarchy of the network, you have different submodules in the network that are called attention modules. We don't have to get into the details but these modules, as it turns out, as we found from this mechanistic interpretability work, can actually read and write information into subspaces of the high-dimensional space in ways that look very analogous to a content-addressable memory in a classical architecture. So you can read and write information about specific kinds of dependencies or relationships between the words. That could be, for example, subject-verb agreement, a syntactic relationship, or it can be semantic kinds of relationships, about how the meanings of words relate to each other in the sentence.

1:05:04.8 RM: And that doesn't literally concatenate representations together with this classical discrete constituent structure but it's doing a more sophisticated form of composition that can still keep certain attributes of the different words neatly separated by reading and writing information to distinct subspaces. So, again, if you think of the idea that the model tracks the fact that one of the words is the verb and that the verb relates to the subject in some sense, you might want that kind of information to be tracked separately from the information that the verb has a certain semantic meaning, in that case referring to the biting relation. So you might want to track different properties, different features, of the different elements of the sentence separately and it seems that these models are able to do that.
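As a toy illustration of that last point, and only an illustration (this is not an actual attention module; the bases and "messages" are invented for the example), a few lines of Python show how information written into one orthogonal subspace of a vector can be read back without disturbing what another subspace stores:

```python
# Illustrative toy: different kinds of information about a word can be written into,
# and read back from, orthogonal subspaces of one high-dimensional vector, so
# syntactic and semantic attributes stay neatly separable.
import numpy as np

rng = np.random.default_rng(1)
d = 16
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))            # random orthonormal basis
syntax_basis, semantics_basis = Q[:, :4], Q[:, 4:8]     # two disjoint, orthogonal subspaces

def write(vec, basis, message):
    return vec + basis @ message      # add the message into that subspace

def read(vec, basis):
    return basis.T @ vec              # project out whatever lives in that subspace

verb_vec = np.zeros(d)                                                   # stand-in vector for "bites"
verb_vec = write(verb_vec, syntax_basis, np.array([1.0, 0.0, 0.0, 0.0]))    # "I am the verb"
before = read(verb_vec, syntax_basis)
verb_vec = write(verb_vec, semantics_basis, np.array([0.0, 2.0, 0.0, 0.0])) # "I denote the biting relation"
after = read(verb_vec, syntax_basis)

print(np.allclose(before, after))                     # True: the semantic write left the syntactic slot alone
print(np.round(read(verb_vec, semantics_basis), 2))   # the semantic message is recoverable too
```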

1:06:09.3 SC: Okay. So I don't wanna over interpret what you say. Let me give a shot at seeing whether I understood what is going on here. So it sounds like the hypothesis is that in the course of training your large language model, some kind of structure, some kind of modularity of different pieces playing different roles, does appear in the way that the nodes organize themselves. I mean, maybe this is way over-interpreting, but remember that famous funny study in neuroscience that said that different people would each have a Jennifer Aniston neuron, like there was one neuron that would always light up when you mentioned Jennifer Aniston or showed a picture or whatever. So is there a Jennifer Aniston vector in every large language model that knows who she is?

1:06:58.9 RM: Yes. Actually that's a really interesting question, because there has been some work on single neurons in such models. Not just in language models but also in multimodal models, namely models that are trained not just on text but also on, say, images. So an example would be DALL·E 2, which probably many of your listeners are familiar with. This is an image generation algorithm that can receive a text prompt as input, a description of a desired image, and can generate the relevant image from there. And that kind of model has a vector space that jointly encodes information about text, linguistic information, and information about images, visual information. And so there was a study about the neurons found in, it was actually, if I remember correctly, DALL·E, so the ancestor of DALL·E 2, the first version, that looked at single neurons and found that it had these subject-specific, domain-specific neurons.

1:08:12.6 RM: So these are neurons that would get preferentially activated in response to certain kinds of stimuli. But interestingly, they were responsive to concepts across modalities and at different levels of abstraction. So one example would be that they found a spider neuron that would be preferentially activated by images of spiders, or the word spider, or images of Spider-Man [laughter], which look very different from images of an actual spider but relate conceptually to spiders. So you do indeed find that kind of representation in neural networks at the level of single neurons.
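Here is a small sketch of how one might go looking for such a "concept neuron" in practice: given activations of many units to labelled stimuli from different modalities, rank units by how selectively they respond to one concept. The activations below are synthetic, with a hypothetical "spider neuron" planted by hand, purely to illustrate the procedure.

```python
# Toy search for a cross-modal "concept neuron": which unit responds most selectively
# to spider-related stimuli, regardless of modality? All activations are synthetic.
import numpy as np

rng = np.random.default_rng(2)
n_neurons = 200
stimuli = ["photo of spider", "the word 'spider'", "drawing of Spider-Man",
           "photo of dog", "the word 'dog'", "photo of car"]
is_spider_concept = np.array([1, 1, 1, 0, 0, 0], dtype=bool)

activations = rng.normal(size=(len(stimuli), n_neurons))
activations[is_spider_concept, 42] += 3.0   # plant a hypothetical "spider neuron" at index 42

# Selectivity: mean response to spider-related stimuli minus mean response to the rest.
selectivity = activations[is_spider_concept].mean(0) - activations[~is_spider_concept].mean(0)
best = int(np.argmax(selectivity))
print(f"most spider-selective neuron: {best}, selectivity {selectivity[best]:.2f}")
```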

1:09:00.9 SC: Okay, that's very interesting to hear. And it lets us come back to the question, now that we have a lot of domain knowledge on the table here. Let's ask the big philosophical questions about whether or not a large language model has intelligence or understands in some sense. And probably the answer is, well, it depends on the sense. So, good, you're a philosopher. You can help us explain that. But I will preface your answer by giving ChatGPT's answer. I asked it whether it really understood things and ChatGPT says, "As an AI language model, I don't have the capacity to know in the way that humans do." So there you go. Why is there even controversy about this?

1:09:44.8 RM: Yeah, you just have to ask the models.

1:09:45.9 SC: Yeah. [laughter]

1:09:46.9 RM: Straight from the horse's mouth. Yes, it is a thorny question. It is a question that is very loaded with both polysemy and controversy, because we use these terms like "understanding" in different ways, and also people are prone to jump to conclusions when these terms get thrown around. So the first thing I would, again, reemphasize is that I think we ought to have a divide and conquer attitude to these problems and approach them in a piecemeal manner where, instead of asking, "Are language models intelligent?" we can ask, "Do they have specific competencies that we associate with intelligent behaviors in humans and non-human animals?" And for each of these competencies, we might further break them down into sub-competencies until we get something that's a little bit more empirically tractable, that is less ambiguous, that is less susceptible to giving rise to merely verbal disputes, and that we can relate to actual functions that can be associated with mechanisms in the model.

1:10:55.0 RM: So that's how I approach this kind of line of inquiry. Among the relevant competencies for artificial intelligence would indeed be something that, in humans, is the ability to understand language, because we know that no non-human animal has displayed the capacity to understand language in the way humans have. We've tried to teach language to parrots, to chimpanzees, and to other animals, and it never really quite works. We can have some very limited success in very narrow cases, but it seems like we humans are the only naturally occurring organisms capable of understanding language. The problem is that when we talk about understanding, some people like to think of this as encompassing something like a conscious awareness of the meaning of language. The philosopher John Searle, for example, had that kind of intuition. And that muddies the waters a little bit because, again, it brings back through the window this other notion of consciousness, where I think we can, in principle, investigate a more functional notion of language understanding without necessarily bringing in questions about sentience and consciousness.

1:12:18.1 RM: So the way in which I would reformulate things is more in terms of semantic competence, which relates to the capacity to parse the meaning of linguistic expressions and which, again, is a slightly more theoretically neutral or less loaded way to think of that notion of understanding, one that might not immediately bring in intuitions about conscious awareness. And so the question would be, can we ascribe any degree and any form of semantic competence to language models? And I would say that we can, and now I'm venturing into controversial territory. Some people would say, no, you can absolutely not ascribe any of that, because language models only deal with the surface form of text. They're only... Again, this idea we come back to, they're only trained for next word prediction. They're only predicting which token, which word, follows from a sequence of tokens or words. So all they grapple with is the syntactic form of text, just the series of symbols that follow each other in a sequence of tokens. They never have access to the grounding of these symbols in the world.

1:13:37.7 RM: And so this is why researchers like the linguist Emily Bender have referred to these language models as stochastic parrots, which is a little bit misleading because actual parrots are very intelligent and are able to interact with the real world and so on. But the idea is rather that these models are just parroting language without any underlying understanding, without any semantic competence. They only latch onto shallow heuristics about the surface statistics of language, and that's all they do. Now, I disagree with that, and I disagree because, first of all, I think semantic competence is not a monolithic notion and can be broken down into different capacities we have that relate to our understanding of the meaning of words. Let's just stick to words first, because when we introduce whole sentences, it's even more complicated. Let's stick to lexical semantic competence and parsing word meaning.

1:14:42.0 RM: Here I'm indebted to, among other people, the work of Diego Marconi, who distinguishes between referential and inferential competence. So referential competence relates to this idea of connecting word meanings to their worldly referents, to whatever they are referring to out there in the world, and this is exhibited by things like recognitional capacities. So if I ask you to point to a dog, you will be able to do that. Or if, instead of asking you to point to a dog, I ask you to name that thing, you'll be able to do that. It's also displayed in our ability to parse instructions and translate them into actions in the world, such as "go fetch the fork in the drawer or in the kitchen"; you will be able to do that in the world. So we are able to relate lexical expressions to their referents in the world. But that's not the only aspect of meaning; that's the aspect of meaning that the people invoking this stochastic parrot analogy are focusing on. But our ability to understand word meaning also hinges on relationships between words themselves, intra-linguistic relationships.

1:16:07.4 RM: And these are the kinds of relationships that are on display in definitions, such as the ones you find in a dictionary, as well as various other relationships of synonymy and homonymy that would also underlie our capacity to perform certain inferences in language. And so to illustrate that point, you can consider someone who's, let's say, a eucalyptus expert who knows all there is to know scientifically about eucalyptus trees from reading books, back in the city in New York say, going to university and so on, but has never actually been in a eucalyptus forest, versus someone who has grown up surrounded by eucalyptus trees and might know very little about the biology of eucalyptus trees or various information about them, but has grown up around them. So the eucalyptus specialist might have a very high degree of inferential competence when it comes to the use of the word eucalyptus, being able to use it in definitions and to know exactly how various other words, including biological terms, relate to the word eucalyptus and so on. But perhaps if you put that specialist in a forest that had eucalyptus trees and a very similar tree... My knowledge of this [1:17:49.1] ____.

1:17:49.9 SC: [chuckle] Mine too. Don't worry.

1:17:50.6 RM: It shows that my kind of knowledge of eucalyptus is really, really low. But perhaps there's a tree that looks very much like eucalyptus trees, and that expert might not be able to actually recognize which are the eucalyptus trees and which aren't, even though he has all this knowledge. So his actual referential competence when it comes to the use of that word might not be that great. Whereas the person who has grown up around eucalyptus trees might be excellent at pointing to eucalyptus trees, even if they have very, very little inferential competence when it comes to using that term in definitions, for example, or knowing these finer relations between the word eucalyptus and various other words. So that's just a very toy example. But clearly there are aspects of meaning that are very important in the way we understand and use words that are not just exhausted by this referential relation to the world.

1:18:50.7 RM: And this second aspect, this inferential aspect of meaning, is something that language models are well placed to induce, just because they're trained on this large corpus of text to learn statistical relationships between words. And you might think that insofar as there are these complex intra-linguistic relationships between words, a model that learns to model the patterns of co-occurrence between words at a very sophisticated, fine-grained level might learn to represent these intra-linguistic relationships.
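A bare-bones illustration of that distributional idea, with a tiny made-up corpus rather than anything a real model is trained on: words that occur in similar contexts end up with similar co-occurrence vectors, so some intra-linguistic structure falls out of the statistics alone.

```python
# Toy distributional semantics: build word vectors from sentence-level co-occurrence
# counts and compare cosine similarities. Real LLMs learn far richer representations,
# but the underlying signal they exploit is of this distributional kind.
import numpy as np
from itertools import combinations

corpus = [
    "the dog bites the man", "the dog chases the cat", "a puppy bites a shoe",
    "the cat chases the mouse", "the man reads a book", "a woman reads the book",
]
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

counts = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for a, b in combinations(sent, 2):       # co-occurrence within a sentence
        counts[index[a], index[b]] += 1
        counts[index[b], index[a]] += 1

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

print("dog ~ cat: ", round(cosine(counts[index["dog"]], counts[index["cat"]]), 2))
print("dog ~ book:", round(cosine(counts[index["dog"]], counts[index["book"]]), 2))
# "dog" ends up closer to "cat" than to "book", purely from co-occurrence statistics.
```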

1:19:31.4 SC: Jacques Derrida famously said, "There is nothing outside the text." Maybe he was standing up for the rights of large language models and their ability to understand things before they ever came along. But it makes sense to me. Look, these corpuses of text, corpi, I'm not sure, of text that the models are trained on are constructed mostly by people who have experience with the world. It would be weird if the large language model could not correctly infer some things about the world. So we're gonna count that on the side of the ledger for a kind of understanding that these AI systems do have, right?

1:20:09.5 RM: Exactly. Yes. And in fact, I think you could even say something about some very limited and weak form of referential competence in these models, but maybe that would take us too far afield. But indeed, I think, insofar as the statistics of language reflect to some extent, at least in some domains, the structure of the world, you can absolutely think that you can latch onto something there about the structure of the world just by learning from the statistics of language. One example would be this wonderful paper by Ellie Pavlick from Brown University that showed that you can take color terms, color words like orange, red, blue and so on, and look at how language models are able to represent these color terms. And I'm simplifying a little bit from the study to not get into too many details, but you can map the representational geometry of the way in which the model represents these terms in a vector space to the geometry of the color space, that is, the actual relationships in, say, the RGB color space, which is one way to represent relationships between colors. So there is something about the structure of the encodings for color terms in these models that encodes information about, or is somehow isomorphic to, the structure of colors out there in the world, if you think of the RGB color space as one way to represent that. Again, I'm making some simplifying assumptions to discuss that kind of research, but it's still a very intriguing finding.
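The comparison in that line of work can be sketched as a simple representational similarity analysis: compute pairwise distances between the model's vectors for color words, compute pairwise distances between the corresponding colors in RGB space, and correlate the two. In the sketch below, get_embedding is a placeholder (deterministic random vectors, so the correlation will be near zero); a real analysis would plug in embeddings from an actual language model.

```python
# Sketch of a representational-similarity comparison between color-word embeddings
# and RGB color space. `get_embedding` is a placeholder for a real model lookup.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rgb = {  # rough RGB coordinates for a few color terms
    "red": (255, 0, 0), "orange": (255, 165, 0), "yellow": (255, 255, 0),
    "green": (0, 128, 0), "blue": (0, 0, 255), "purple": (128, 0, 128),
}
words = list(rgb)

def get_embedding(word):
    """Placeholder: a real analysis would return the language model's vector for `word`.
    Deterministic random vectors here will show roughly zero correlation."""
    rng = np.random.default_rng(sum(map(ord, word)))
    return rng.normal(size=300)

embedding_distances = pdist(np.array([get_embedding(w) for w in words]))
rgb_distances = pdist(np.array([rgb[w] for w in words], dtype=float))

rho, p = spearmanr(embedding_distances, rgb_distances)
print(f"Spearman correlation between the two distance matrices: {rho:.2f} (p={p:.2f})")
# A reliably positive correlation with real embeddings would mean the model's
# color-term geometry partially mirrors the structure of color space.
```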

1:22:10.8 SC: Can a large language model have an imagination?

1:22:14.8 RM: Yeah, that's a really interesting question. I suppose it depends what you mean by imagination, once again. I'm going to be this annoying philosopher who brings things back to definitions and distinctions. I think people are generally more prone to using that term for these image generation models, because they're able to generate these striking images that, compared to the text prompt they receive, seem to add in a lot of detail, just because the resolution of language, as it were, is not quite the same as the resolution of images. That's just a very simplistic way to explain the phenomenon. But the way in which a prompt describes things in language leaves a lot of gaps for the image generation model to fill when it's generating an image. And so when people ask for, I don't know, a picture of a cat on a mat and get, as an output, a picture that has all of these wonderful, rich details and colors and a specific kind of mat with specific patterns, a specific kind of cat, maybe it's a tabby cat, maybe it's a tuxedo cat and so on, you might be tempted to think, well, that's really remarkably imaginative. The model has filled in the gaps there in remarkable ways.

1:23:42.2 RM: But you could also think about imagination in the linguistic domain as well. You gave this wonderful example of a haiku about Bell's theorem. That might feel pretty creative, pretty imaginative in some way. I think it depends what you mean by imagination, whether you bring in this idea that there is some kind of explicit underlying intention to create, to visualize something, to write a poem or create an artwork or something like that. I think that might be leaning a little bit too far in the direction of anthropomorphism. However, in a looser sense, you might say, well, here is a way to operationalize this notion of imagination. One question is whether these models are merely, as the authors of the stochastic parrots paper put it, haphazardly stitching together bits and pieces from the training data. So is the performance merely explainable by this kind of brute force memorization, or is it doing something more, genuinely generalizing from the training data to new domains and creating outputs that are not remotely similar to anything in the training data?

1:25:07.3 RM: And I think you can actually study this empirically. Again, I always try to bring it back to problems that are empirically tractable, perhaps unusually for a philosopher, but I think you can make headway on these issues by looking at the empirical evidence from research where you can actually assess the amount of memorization that has occurred in the training of a model. And you can show that while there is memorization, which is a feature and not a bug, because you want your language model to memorize certain things, including, if you ask your model to recite a certain famous poem by John Keats, it's quite nice that your model is able to do that. But you also want your model to not just do memorization and to generalize, and it's provably accurate to say that indeed there is at least some level of generalization to some domains that are out of distribution for the model. For the image generation models, that might look like the ability to generate images that don't look anything like the images they have been trained on. And you might call that, in perhaps a looser or more deflationary sense, a form of imagination.
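One very simple version of that kind of memorization check, with tiny made-up strings standing in for a real corpus and a real model output: count how many n-grams of the output appear verbatim in the training data. Low overlap on novel prompts is weak evidence of generalization rather than regurgitation.

```python
# Crude memorization check: what fraction of the output's 5-grams occur verbatim
# in the training data? The corpus and output below are tiny stand-ins.

def ngrams(text, n=5):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

training_corpus = "a thing of beauty is a joy for ever its loveliness increases"
model_output = "a thing of beauty is rarely cheap to maintain in practice"

train_grams = ngrams(training_corpus)
output_grams = ngrams(model_output)
overlap = len(output_grams & train_grams) / max(len(output_grams), 1)
print(f"fraction of output 5-grams found verbatim in training data: {overlap:.2f}")
```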

1:26:31.9 SC: No, I love that answer actually. It is an important distinction between interpolation and extrapolation, right? What you're saying, if I understand correctly, is that the large language models seem to be doing more than just plagiarizing and pastiching and remixing. They seem to be generalizing, is the word you use, which I suppose is the right word, but somehow finding a theme or an idea or a style and doing something arguably new in that style. And if we're not gonna call that creativity or imagination, then what should we call it?

1:27:09.9 RM: Yes, exactly. There was an interesting discussion recently. I don't know if you've seen this op-ed published by Ted Chiang in the New York Times, this was about ChatGPT, and it's wonderfully written, as is anything Ted Chiang writes. I think it's in many ways a wonderful essay, but it builds on a metaphor that I think is a little bit misleading, which compares ChatGPT to lossy compression of images; say, the jpeg compression format, which is a way we have to compress images. Some of these image compression formats are able to reconstitute a lossy approximation of the original image based on the compressed representation. It's lossy because it doesn't include all of the same details; that's why you see artifacts in jpeg images sometimes. But it's able to do that with a certain compression and decompression algorithm. And one strategy that can be used is, again, this idea of interpolation in pixel space.

1:28:26.0 RM: For example, a very simplistic form of compression might be just saving information about two pixels but not the pixel in between. So that allows you to compress the image and save it in a smaller file. And then during decompression, you would interpolate the middle pixel. You can have more or less sophisticated ways of doing that, but a very simplistic, basic way would be just to take the average value of the two neighboring pixels, something like that, right? So that's a very simple form of interpolation. You also find this with video decompression algorithms that only save certain frames and try to interpolate the frames in between in pixel space. And Ted Chiang, in that article, suggests that ChatGPT and language models generally can be thought of as a kind of blurry jpeg of the web, where they get trained to compress the web, and then at inference time, when you're generating text with these models, it's a form of lossy decompression that involves interpolation in the same way. And I think it's a very interesting test case for intuitions, because parts of this metaphor, this analogy, are pumping the right intuitions.
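The toy compression scheme being described is easy to write down; here is a minimal sketch with one-dimensional "pixels" made up for the example: keep every other value, then reconstruct the dropped ones by averaging their neighbors at decompression time.

```python
# Toy lossy compression: keep every other pixel, then interpolate the missing ones
# from their neighbors. Lossy, because interpolated values only approximate the original.
import numpy as np

original = np.array([10, 12, 20, 24, 30, 28, 40, 44], dtype=float)

compressed = original[::2]                      # keep pixels 0, 2, 4, 6

reconstructed = np.empty_like(original)
reconstructed[::2] = compressed                 # copy back the kept pixels
reconstructed[1:-1:2] = (compressed[:-1] + compressed[1:]) / 2   # average the neighbors
reconstructed[-1] = compressed[-1]              # last pixel has no right neighbor

print("original:      ", original)
print("reconstructed: ", reconstructed)
print("mean abs error:", np.abs(original - reconstructed).mean())
```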

1:29:46.0 RM: There is definitely a deep connection between machine learning and compression that has long been observed and studied. There is a sense, a very real even technical sense, in which a machine learning model trained on a lot of text data like this with next word prediction is compressing information about the training data. The fact that it's able to do things like remember so many facts about the world that are present in Wikipedia, or recite poems that are present in the training data, do all sorts of strict memorization exercises or approximate memorization exercises, is to some extent evidence of that. These are models trained on terabytes of data, but the actual model itself is just a few gigabytes. So there is compression. But for the generative step, when you use the model after training to generate text, I think comparing it to lossy image decompression is pumping the wrong intuitions, because, again, that suggests that all it's doing is a kind of shallow interpolation that amounts to a form of approximate memorization, where you have memorized some parts of the data and then you are loosely interpolating what's in between.

1:31:18.4 RM: For example, in the text domain, maybe it would be something like you've memorized every other word of the John Keats poem and then you are trying to do some shallow interpolation of the words in between, or something like that. Again, simplifying things a bit. But that's not really what's going on here. You could conceivably use them, and in fact some people have done this with image generation models, as lossy image compression and decompression algorithms. That's a specific use case for them, but it's a very specific use case. In the normal case, when you generate images or text with these models, you're doing more than that. And this is where the distinction you mentioned between interpolation and extrapolation kind of breaks down a little bit with these models. There was this paper by Yann LeCun and colleagues that was published a few years ago about this, where you can think of interpolation in very high dimensional spaces as equivalent to extrapolation. Just because when you have so many dimensions, the kind of intuitions you have about interpolation in a specific narrow domain don't really hold anymore.

1:32:38.5 RM: So even if we leave the strictly technical point from that paper aside, the intuition is that there would presumably be a way to characterize what large language models and image generation models are doing when they generate images and text as involving a form of interpolation, but this form of interpolation would be very, very different from what we might think of when we think of nearest neighbour pixel interpolation in lossy image decompression. So different, in fact, that this analogy is very unhelpful for understanding what generative models are doing, because, again, instead of being analogous to brute force memorization, there's something much more genuinely novel and generative about the process of inference in these models.
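A small numerical experiment in the spirit of that point (a simplified illustration, not the analysis from the paper itself): sample training points and new test points from the same distribution, and check how often a new point can be written as a weighted average of the training points, i.e. lies inside their convex hull. Past a dimension of roughly twice the logarithm of the number of training points, "interpolation" in this strict sense essentially never happens.

```python
# How often does a new sample land inside the convex hull of the training samples?
# Membership is checked with a feasibility linear program.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(point, points):
    """Is `point` expressible as a convex combination of the rows of `points`?"""
    n = len(points)
    A_eq = np.vstack([points.T, np.ones((1, n))])   # weighted sum equals point; weights sum to 1
    b_eq = np.append(point, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.success

rng = np.random.default_rng(4)
n_train, n_test = 200, 50
for d in [2, 5, 10, 20, 40]:
    train = rng.normal(size=(n_train, d))
    tests = rng.normal(size=(n_test, d))
    frac = np.mean([in_convex_hull(x, train) for x in tests])
    print(f"d={d:2d}: fraction of new points inside the training convex hull = {frac:.2f}")
```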

1:33:35.7 SC: Yeah. That's great because from my own experience thinking about quantum mechanics, I can verify that the human mind is not very good at intuitions in large dimensional vector spaces. Like once you have more than three dimensions, we don't have a very good idea of what's going on. So that's fascinating that once you get to huge numbers of dimensions, interpolation and extrapolation begin to blur together in an interesting way. Okay. So what I've learned is that there's a sense in which... Obviously, all these things are work in progress, and that's fine, but there is some sense maybe in which there is, as you said, semantic competence in a large language model. There's a sense in which meaning is really there. There's some structure in there that is non-trivial. And, also, there's a sense in which they can be creative or imaginative. So I guess the last big picture question I wanted to wonder about was, Can they be agents in some way? Can they have goals? Can I make a contract with a large language model? Can I agree that if it does this thing today, I will pay it some money 10 years from now? Are those concepts even sensible or do we care about them in the context of these AI models?

1:34:52.3 RM: Yeah. This is an excellent question and this is where I would, again, invoke the importance of having this divide and conquer approach to the ascription of capacities to these models. Indeed I've suggested... And I just want to make it clear that I'm not suggesting language models understand language like humans do, that they have the full blown semantic competence of humans, very far from it, but they might have some limited form of semantic competence. They might have, in some very deflationary sense, some form of creativity or imagination in a sense we've defined. Now when it comes to goals, this is where I'm much more skeptical that we can ascribe anything like intrinsic goals or desires to a language model. This seems like a category mistake, or at least there doesn't seem to be any evidence that there was anything like that in such a model. Of course, a fully developed answer would not just appeal to intuitions based on the learning objective of these models, which is next word prediction, because we've talked about how that's not the whole story.

1:36:00.7 RM: So we would have to look again at this Mechanistic Interpretability work, and we'd have to have a more specific, operationalized notion of what having an intrinsic goal is and what kind of function or computation it might involve, and whether there is anything functionally analogous to this kind of computation in these models. Nonetheless, one thing about these models that is very important to keep in mind is that they learn from data in a purely passive way. They get fed this continuous stream of sequences of text and they play this next word prediction game. That's how they get trained. That's how they learn to encode various properties of language. And then at inference time, after they've been trained, we say that the models are frozen, meaning that the internal parameters are no longer being adjusted, the internal knobs inside the network are not being tuned anymore. There is no more training, no more learning. At inference time, these frozen models are still doing next word prediction on the prompt, on the input given by the human.

1:37:13.8 RM: At no stage in this process do we have an opportunity for genuine interaction between the model and the world, or even between the model and language about the world. If the model were trained through dialogue, for example, even if it were trained on text only, then there would be a little bit more interactivity, but there is no such thing here. And so one thing you might consider is that perhaps having something like intrinsic goals requires a form of learning that's a little bit more active than the way in which these models are learning. That's one possible consideration. Another one is that these models, as I just mentioned, are not continuously learning or continuously adjusting their internal parameters. Once they're trained, they're frozen, and then you can run inference on them. So you have some input flowing through the network, what we call the forward pass through the model, again, from input to output. But that's just a one-directional process. It doesn't feed back into the model's internal encodings, its internal representations.
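In code, the "frozen" inference loop being described looks roughly like the sketch below. The model and tokenizer are assumed placeholders, not any particular library's API; the point is that generation is nothing but repeated forward passes with no gradient computation and no weight updates.

```python
# Sketch of frozen, greedy next-token generation. `model` and `tokenizer` are assumed
# placeholders: the model takes token ids of shape (1, seq_len) and returns logits of
# shape (1, seq_len, vocab_size). No parameters are ever updated in this loop.
import torch

@torch.no_grad()                        # no gradients: nothing is learned at inference time
def generate(model, tokenizer, prompt, max_new_tokens=20):
    model.eval()                        # freeze stochastic layers such as dropout
    token_ids = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model(torch.tensor([token_ids]))     # one forward pass, input to output
        next_id = int(logits[0, -1].argmax())         # greedy choice of the next token
        token_ids.append(next_id)                     # append and repeat; weights untouched
    return tokenizer.decode(token_ids)
```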

1:38:36.3 RM: So to that extent also, when we think of having intrinsic goals, we think of having something like dynamic goals that are adjusted on the basis of an ongoing interaction with certain inputs, calibrating our outputs. And because these models are not able to adjust their weights to change the way in which they respond to certain inputs, you might think also that there's a problem in ascribing anything like an intrinsic goal there. And then even empirically, people who are very concerned about AI safety, about this idea that perhaps artificial intelligence might in the near or medium term, perhaps in the longer term, become a genuinely threatening technology for even the survival of the human species as a whole, have been looking for early signs of potentially threatening, problematic behavior in large language models. And there was a recent effort in that direction that was sponsored by the company Anthropic, which is one of the large new startups working on language models, founded a few years ago by people from OpenAI.

1:39:53.2 RM: And they had this competition, which was about what happens when you scale language models. There is one surprising thing that we've known at least since GPT-3 was unveiled in 2020, which relates to what we talked about earlier with the bitter lesson, which is that scaling these models, meaning just building models that have more parameters, just cramming more parameters inside these models, more layers, more connections between the units in the model, just doing that and training these models on more data, seems to be sufficient to produce breakthroughs in performance on certain tasks, to unlock new capacities. People talk about emergent abilities there. And actually there are a lot of connections with physics there and the notion of emergence, of course. And there's this paper that was published by OpenAI in 2020 that finds these scaling power laws, which look at how scaling the size of the model and the amount of data you train the model on leads to these improvements in its performance at next word prediction.

1:41:03.6 RM: That's just looking at next word prediction, but in terms of actual capacities you also see this nonlinear improvement. When you go past a certain size, suddenly your model starts being able to solve certain math problems, or being able to explain certain jokes, for example, or being able to do some forms of commonsense reasoning. You get these nonlinear phase transitions as you scale them up. And so Anthropic was interested in whether there are also some inverse scaling phenomena, where scaling the model, instead of just improving the performance in a favorable way, in ways that we care about and find useful, might also lead to either a degradation of performance or to unwanted behavior. And one of the behaviors they were interested in is what people in the AI alignment community, people who are concerned with aligning future artificial intelligence systems with human values to avoid catastrophic scenarios, call power-seeking behavior.
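The scaling laws mentioned here are power laws, which show up as straight lines on a log-log plot. As a minimal sketch, with made-up (parameter count, loss) pairs rather than the paper's actual measurements, fitting such a law is just a linear fit in log space:

```python
# Fit loss ~ a * N**(-alpha) to illustrative (parameter count, loss) pairs.
import numpy as np

# Hypothetical data points, for illustration only: model size vs. held-out loss.
n_params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
loss     = np.array([5.2, 4.3, 3.6, 3.0, 2.5])

slope, intercept = np.polyfit(np.log(n_params), np.log(loss), deg=1)
alpha, a = -slope, np.exp(intercept)
print(f"fitted power law: loss ≈ {a:.1f} * N^(-{alpha:.3f})")
print("extrapolated loss at 1e11 params:", round(a * 1e11 ** (-alpha), 2))
```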

1:42:04.4 RM: So will we find that when you scale language models past certain sizes, they are more prone to displaying behavior that, to give the most caricatural example, would be something like ignoring the task that you're asking them to perform and instead trying to persuade you to augment their capacities by training yet a bigger model, or continuing to train them on even more data, or giving them more computational power, or things like that? That seems very science fictional and far-fetched to me. And in fact, they didn't find anything like that through this competition. That would look like an intrinsic goal to me. If you did find that models would completely ignore the tasks that you've asked them to do and instead try to manipulate you into doing something totally irrelevant and "self-serving" from the perspective of the model, then that would indeed be very alarming and look very much like intrinsic goals. But I don't think we see any evidence of that.

1:43:11.0 SC: One of the lessons that I've learned from doing a lot of podcasts with biologists, computer scientists, neuroscientists, philosophers, is that it really does matter to who we are as people that biological intelligences are embodied. That we live in bodies, that we get hungry, we get bored, we have training from evolution to try to survive or at least propagate our genome and so forth. And these large language models don't have anything like that. They don't get bored. If I turn on the computer and I do not ask ChatGPT a question, it does not get irritated with me. So on the one hand, they don't have that. On the other hand, it doesn't seem that hard to put the model in a robot and let it walk around and punish it if it gets hurt or something like that. And so, I don't know, this is not even a question but something to speculate about. Maybe that's also a kind of human-like understanding or thinking that it wouldn't be that hard to inculcate in an AI model somehow.

1:44:15.6 RM: Yes. That's actually quite interesting, because there is a lot of research into bridging this technology with more embodied forms of intelligent behavior. One of the impressive strides made recently was from Google. They have this project called SayCan, to refer to the idea that these models, these systems they're trying to build, would relate what is said to what they can do. SayCan: relating language to action in the world. And essentially these are little systems that are embedded in these little robots that have a robotic arm and a little camera and are on wheels. And they use these pre-trained language models to parse instructions given by humans, such as "go fetch the apple in the kitchen", and then these models are used to translate this natural language instruction into a format that is more actionable from a robotics perspective, like "drive to kitchen" and then "fetch apple". It can break down the natural language instruction into a series of more specific instructions that can be parsed in terms of specific actions. And it has a camera that is plugged into a vision-language model that can relate specific keywords like kitchen or apple to landmarks in the environment.

1:45:49.5 RM: So bringing all these things together, all these different elements of the puzzle, you can actually build a system like this. However, the key aspect of these systems is that they are not trained end to end by interacting with the world. You are just using a pre-trained model that has been trained completely passively to process language and then plugging it into these other modules. And so there's perhaps a surprisingly deep question here about whether this kind of system is at all useful for thinking about what might be happening in human cognition or animal cognition, with interactions between different domain-specific modules that might not share the same kinds of representational formats, that might have information that is encapsulated and does not really overlap in the same way, where only some information is passed along to other modules for further processing and transformation in a different format, and so on; or whether this kind of modular approach to the mind is not the right one, and whether we need models that don't just stitch these things together but actually learn from the ground up by interacting with the world and having a rich multimodal stream of information. So it's interesting how these developments actually map onto longstanding discussions.
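A highly simplified sketch of the kind of pipeline described above, with placeholder functions rather than the actual SayCan components: a language model proposes candidate low-level steps for an instruction, and a separate perception module scores which steps are feasible given what the robot currently sees.

```python
# Illustrative sketch of a SayCan-style pipeline. `propose_steps` stands in for a
# pretrained language model and `affordance_score` for a vision/robotics module;
# neither is Google's actual API.

def propose_steps(instruction: str) -> list[str]:
    """Placeholder for the language model: map an instruction to candidate low-level steps."""
    if "apple" in instruction and "kitchen" in instruction:
        return ["drive to kitchen", "locate apple", "pick up apple", "return to user"]
    return ["ask for clarification"]

def affordance_score(step: str, camera_observation: dict) -> float:
    """Placeholder for perception: how feasible is this step given what is visible now?
    Steps that mention no required object are treated as trivially feasible."""
    visible = camera_observation.get("visible_objects", [])
    required = [obj for obj in ("kitchen", "apple") if obj in step]
    return 1.0 if all(obj in visible for obj in required) else 0.2

observation = {"visible_objects": ["kitchen", "hallway"]}
plan = propose_steps("go fetch the apple in the kitchen")
ranked = sorted(plan, key=lambda step: affordance_score(step, observation), reverse=True)
print("proposed plan:", plan)
print("steps ranked by current feasibility:", ranked)
```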

1:47:22.5 SC: My last question, which is completely unfair because it'll probably require a lot to answer, Do you foresee a time not too far away that we would want to give rights to AI models, whether ethical rights or legal rights or to at some point say it would be wrong to turn off this model because it's just as human as you or I, or at least it shares some aspects in common?

1:47:50.8 RM: Yeah. I think that's a question that has been on some people's minds lately for a few different reasons. One of them is, we haven't really talked about this, but people have been asking whether we can ascribe any form of sentience or consciousness to large language models or chatbots. The first big story about this was when this engineer from Google, Blake Lemoine, became convinced that their internal chatbot, called LaMDA, was sentient. It turns out it was on the basis of his own religious beliefs that he was led to ascribe sentience to that chatbot, based on the way it was responding to certain questions. And if you read the transcripts, these can be considered very much as leading questions, priming the model to engage in the language game of sentience, as it were. And remember, these models have been trained on a lot of science fiction, which includes sentient AIs, so they're excellent at creating fiction and excellent at playing the role of a certain character in a story, whatever that role may be. So I would take this kind of story with a grain of salt.

1:48:55.2 RM: And more recently there has also been some interest in that question with more recent models. That relates to the rights question and the ethical question, because many people in philosophy think, and I think this maps onto intuitions people have generally, that consciousness, having conscious experiences, is something that's intrinsically valuable. Meaning that a being that has conscious experiences, whether it's an organism or an artificial system, is worthy of moral consideration, special moral consideration, just by virtue of having such experiences, such that, for example, it would be wrong ethically to inflict pain on that system or perhaps to terminate that system, and so on. That's one way in which this relates to morality, but you can also have a view that doesn't even appeal to consciousness and holds that there is a certain notion of personhood or agency that can be valid even for non-conscious systems. And that also relates to morally weighty decisions in a substantive way, where it would be wrong to do certain things to a system that is an agent or a person in that sense.

1:50:20.3 RM: These two things connect to the moral question and to the legal question, of course. Although generally in legal discussions, the details are fleshed out in less fine-grained ways. Do I foresee that we should ascribe rights to deep learning systems in the near-term future? I don't think so, because, well, I worry that doing so would have immediate and potentially very nefarious implications for humans themselves. Because as soon as you ascribe rights to such systems, you might find yourself, from a legal perspective, in cases in which you have to make decisions that, in order to safeguard the rights of these artificial systems, might bring harm to humans. Whether that's imprisonment of humans, or what happens when a human has turned off an artificial system that is deemed worthy of rights. Is that a form of crime? Is that a form of murder or something analogous to it? Do we need to lock up that human? And so on. So that's an extreme example, but there might be many more subtle examples of that.

1:51:48.8 RM: There was a recent op-ed about this in the LA Times by my colleagues Henry Shevlin and Eric Schwitzgebel. It was a point about relating sentience and morality and rights. And the point they were making is that we have a moral imperative not to tread even lightly into territory that could lead to some ambiguity and confusion about that. Meaning we shouldn't build robots or artificial systems that could be serious candidates for sentience or personhood or agency in this morally weighty sense. And when I say serious candidates, I mean that the vast majority of experts would agree that current language models are almost certainly not sentient, because there are various reasons you can invoke about various properties that seem to be missing there, properties that all of the leading scientific theories of consciousness seem to think are important for consciousness. You can't be 100% sure, but I also cannot be 100% sure that a rock doesn't have some degree of sentience; if you're a panpsychist, maybe that's what you think.

1:53:08.6 SC: I know.

[laughter]

1:53:12.6 RM: But that might also be a different kind of sentience, one that might not have the same connections to morality. All of these things are moving pieces that have implications. But yeah, I do get the impetus to try not to build models that will give us serious reasons to think that they might be sentient, because then we will face a conundrum. It's a damned-if-you-do, damned-if-you-don't scenario. If you do give them rights on the off chance that they might be sentient and worthy of moral and legal consideration, then you might end up harming humans; and if you're wrong about the fact that they're sentient, it's a huge moral hazard to take that step. But on the other hand, if you don't ascribe them any rights or consider them worthy of being at least moral patients, and you are wrong about that, then it's also a considerable moral hazard. So perhaps the best situation is just to try not to get into that place in the first place, and try not to build systems that would give us serious pause in this way.

1:54:25.6 SC: It's fascinating to me that famously Alan Turing suggested the Turing test for: Can a computer think or, if you want, be conscious? He tried to be clear that he was not talking about consciousness but about thinking. And he proposed this test where if you could fool a human into not being able to tell whether it was talking to a person or a machine, then the machine counts as thinking. And to me it seems pretty clear that these large language models could easily pass the Turing test. And as soon as that happened, everyone lost interest in the Turing test because they realized that that was not actually a very good criterion for thinking. It's a little bit more subtle than that. Now we're really faced, we have a duty now for lots of reasons, both practical and moral, I think, to confront these slightly philosophical questions. We're entering into uncharted territory.

1:55:15.7 RM: Absolutely. Yeah. So just looking at behavior, in terms of linguistic output for example, is no longer the gold standard it used to be. And this is quite interesting even when thinking about consciousness: for a long time, all of consciousness research with humans has had to rely on verbal reports to some extent. There is no way around that. And of course, these can corroborate the results that you get when you try to establish certain correlations, looking for the neural correlates of consciousness for example, but there's no way around the fact that the ground truth for whether a given individual is experiencing something, and what that individual is experiencing, generally always comes back to some kind of report, verbal or non-verbal, but some kind of introspective report or self-report.

1:56:11.0 RM: Now we have these systems that can give you indefinite reports, as it were, of arbitrary precision and detail, and they can talk at length about their feelings. And we have very good reasons to think that they are intrinsically incapable of feeling anything. And that certainly changes things. It challenges our intuitions about consciousness, how it might relate not just to language but perhaps also more generally to intelligence, but it also turns on its head the kind of methods we've used to investigate consciousness in humans, because here we don't have access to the ground truth and we are stumbling in the dark, trying to make inferences on the basis of certain properties of the systems. I think we still currently have very compelling empirical reasons to deny them sentience but, again, what happens when we don't is an interesting and alarming question.

1:57:17.0 SC: But full employment for philosophers. [laughter] So that's a good situation to be in. Raphaël Millière, thanks so much for being on the Mindscape podcast.

1:57:25.8 RM: Thank you for having me. This was a pleasure.

[music]

8 thoughts on “230 | Raphaël Millière on How Artificial Intelligence Thinks”

  1. I am always surprised at the ease with which humans (mostly male in these fields) are willing to attribute consciousness to algorithms (AI) but deny it to non-human animals!

  2. Pingback: Sean Carroll's Mindscape Podcast: Raphaël Millière on How Artificial Intelligence Thinks - 3 Quarks Daily

  3. The question about whether an AI machine will ever be able to think is, to me, the most important question to be addressed. This question is the hard problem of consciousness. The inner life of humans is a reality that is unexplainable. Self-aware consciousness is what leads to understanding the meaning of experience. Computers do not understand the meaning of anything. It is the human minds that interpret the findings of the algorithms that give them meaning. Computers do not have AHA moments. Computers are very valuable tools that can vastly expand the capabilities and achievements of human beings, but understanding is the purview of self-aware consciousness.

  4. Maria Fátima Pereira

    A topic that has been much talked about and debated lately.
    Raphaël Millière developed it as someone who knows the subject well.
    I am now better informed about so many questions I had been trying to find answers to.
    His prudence and caution in his use of language were also interesting.
    As someone who considers myself a physicalist, I will look further into the more philosophical questions emerging from this topic.
    Sean Carroll has accustomed us to his deep, timely, intelligent questions.
    Thank you.

  5. A similar but somewhat deeper question than “can a computer think?” is, “can a computer be conscious?” Can it be aware of its own existence? And can it really experience emotions such as grief, or fear? The article posted below, ‘Computer Consciousness’ (Donald D. Hoffman, University of California, Irvine), examines these complex questions. The 2 main differing viewpoints can be categorized under the headings “biological naturalist” and “functionalist”. In a nutshell, biological naturalists claim that special properties of brain biology are critical and that any complex system that lacks biology must also lack consciousness. Functionalists, on the other hand, claim that the critical properties required for consciousness are not fundamentally biological, but functional, and a nonbiological computer could be conscious, if it is properly programmed.

    According to Hoffman, it is likely that technology will evolve to the point where computers behave substantially like intelligent, conscious agents. The question of computer consciousness is whether such sophisticated computers really are conscious, or just going through the motions. The answer will be illuminating not just for the nature of computers but also for human nature.

    https://sites.socsci.uci.edu/~ddhoff/HoffmanComputerConsciousness.pdf

  6. One of the best thought experiments about knowledge/learning/consciousness is the so-called “Knowledge Argument” (aka Mary’s Room). In 1982 the Australian philosopher Frank Jackson came up with the provocative story about a brilliant neurophysiologist Mary who is an expert in color vision and knows everything ever discovered about its physics and biology but has never actually seen color. If, one day, she sees color for the first time, does she learn anything new? The answer to that question has profound implications, not only for Artificial Intelligence, but could it be that there are fundamental limits to what we can know about something we can’t experience first-hand? And would this mean there are certain aspects of the Universe that lie permanently beyond our comprehension? Or will science and philosophy allow us to overcome our mind’s limitations?

    https://www.youtube.com/watch?v=mGYmiQkah4o

  7. The analogy (at about 27:40) between human evolution and the evolution of large language models, intended to show that it’s at least possible for LLMs to have or acquire amazing capabilities that they weren’t designed to have, sounds like hand-waving to me (although M. Millière suggests that there’s more to it). The idea is that both kinds of evolution are simply optimizing a fitness function; but we can see that humans are able to do all sorts of things that couldn’t have been selected for, so why not LLMs too? Unfortunately, it’s clear the human brain is a Swiss Army knife, with universal applicability, whereas it’s equally clear LLMs are not. Nor is it clear that LLMs have bootstraps that might boost them to something higher, as happened in the evolutionary history of the human race. You might as well argue that the optimization function producing better and better running shoes may well result in unexpected bootstrapping to higher capabilities.

    The discussion is interesting, but after 1:20 or so the issues sound like science fiction to me.

    So far I’m afraid of the social consequences of LLMs, but that’s all. They’re a statistical party trick with tremendous dangers for us, none of which are discussed here.

  8. Nicholas Reitter

    I found this a somewhat valuable and in-depth discussion, as I’ve come to expect from Sean’s podcast, but would still unfortunately characterize it as “semi-hype” – i.e., at least partially infected by the amazing amount of hype around contemporary AI. I was inspired after listening to check out image-generation software “DALL-E 2” (seems one has to pay, now; I didn’t), and to have my most substantive interaction to date with chatbot “GPT-4.”
    My personal sense runs closer to Ted Chiang’s view (mistakenly cited in the podcast as in the _NYTimes_ – it was actually published on 2/9/23 by _The New Yorker_, and is easily searchable online) that AI-chatbots are just lossy snapshots of the internet. Millière counters this view simply by insisting that novel features emerge from the chatbots’ behavior, but novelty is not enough. Should a randomly-selected sample of internet chatter – however ingeniously influenced to seem to respond to a given query – count as intelligence? If intelligence is the ability to learn, the chatbots’ intelligence should be judged against their ability to learn from their on-going “experience” – and not from just a static set of training-exercises that are used to initialize them.
    There seems to me to be a pretty basic category-mistake going on here, even before we get to nuances about learning, about which reasonable people might well disagree. Though knowledgeable and informative (e.g., about next-token prediction and multi-layered modeling strategies), Millière doesn’t address the underlying point Chiang is making – which is that the real “creativity” that seems to emerge from AI is in the *training data* (i.e., the vast human-centric corpus of data embodied in the contemporary internet), and not so much in the clever algorithms used to simulate human dialog by leveraging this data.
