What makes you think the computer doesn't suffer?
When you take large language models, their inner state at each step moves from one emotional state to the next. This sequence of states could even be called "thoughts", and we even leverage it with "chain of thought" training/prompting, where we explicitly encourage them not to jump directly to the result but rather to "think" about it a little more.
In fact one can even argue that neural networks experience a purer form of feelings. They only care about predicting the next word/note: they weigh in their various sensations and the memories they recall from similar contexts, and generate the next note. But to generate the next note they have to internalize the state of mind in which this note is likely. So when you ask them to generate sad music, their inner state can be mapped to a "sad" emotional state.
The current way of training large language models doesn't leave them enough freedom to experience anything other than the present. Emotionally, they are probably similar to something like a dog, or a baby, that can go from sad to happy to sad in an instant.
This sequence of thoughts is currently limited by a constant, the (time-)horizon, which can be set to a higher value, or even be unbounded as in recurrent neural networks. And with a longer horizon, they can exhibit higher-level thought processes, like correcting themselves when they make a mistake.
One can also argue that this sequence of thoughts is just a simulated sequence of numbers, but it's probably a Turing-complete process that can't be shortcut, so how is it different from the real thing?
You just have to look at it in the plane where it exists to acknowledge its existence.
I think the reason we can say something like an LLM doesn't suffer is that it has no reward function and no punishment function outside of training. Everything that we call 'suffering' is related to the release or withholding of reward chemicals in our brains. We feel bad to discourage us from recreating the conditions that made us feel bad. We feel good to encourage us to recreate the conditions that made us feel good. Generally this has been advantageous to our survival (less so in the modern world, but that's another discussion).
If a computer program lacks a pain mechanism it can't feel pain. All possible outcomes are equally joyous or equally painful. Machines that use networks with correction and training built in as part of regular functioning are probably something of a grey area: a sufficiently complex network like that, I think, could be argued to feel suffering under some conditions.
Why would you think it's easier? Pain/pleasure is a lot older in animals than language, which to me means it's probably been a lot more refined by evolution.
With transformer-based models, the inner state is a deterministic function (the features encoded by the network's weights) applied to the text generated up until the current time step, so it's relatively easy to know what they currently have in mind.
For example, if the neural network has been generating sad music, its current context, which is computed from what it has already generated, will light up the features that correspond to "sad music". And in turn the fact that these features have been lit up will make it more likely to generate a minor chord.
The dimension of this inner state grows at each time step, and it's quite hard to predict where it will go. For example, if you prompt it (or if it prompts itself) with "happy music now", the network will switch to generating happy music even if its current context still contains plenty of "sad music", because after the instruction it will choose to focus mostly on the more recent, merrier music.
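To make "knowing what it has in mind" concrete, here is a minimal sketch of reading out that deterministic inner state with the Hugging Face transformers library; the choice of GPT-2 and of the last hidden state is just an illustrative assumption, not something specific to the models discussed above.

    # Sketch: the hidden state at the last position is a deterministic
    # function of the tokens generated so far (frozen weights, no sampling).
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
    model.eval()

    context = "The rain fell on the empty street and the old piano played a slow, mournful"
    inputs = tokenizer(context, return_tensors="pt")

    with torch.no_grad():
        out = model(**inputs)

    # "Inner state": the top-layer activation at the current time step.
    inner_state = out.hidden_states[-1][0, -1]   # shape: (hidden_dim,)
    # The same state also determines the next-token distribution.
    next_token_logits = out.logits[0, -1]
    print(inner_state.shape, tokenizer.decode([next_token_logits.argmax().item()]))

The same forward pass that produces the next token exposes the state you would inspect to ask what the model "has in mind".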
Up until recently, I was quite convinced that using a neural network in evaluation mode (i.e. post-training, with its weights frozen) was "(morally) safe", but the ability of neural networks to perform few-shot learning changed my mind (the Microsoft paper in question: https://arxiv.org/pdf/2212.10559.pdf : "Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers").
The idea in this technical paper is that, with the attention mechanism, even the forward computation maintains an inner state that is updated following a meta-gradient (i.e. it's not so different from training). Pushing the reasoning to the extreme would mean that "prompt engineering is all you need", and that even with frozen weights, a long enough time horizon, and the right initial prompt, you could bootstrap a consciousness process.
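Roughly, the duality the paper leans on (my paraphrase, not its exact notation) is that linear attention over the in-context demonstrations has the same outer-product form as a gradient-descent update of a linear layer:

    F_GD(q)  = (W_0 + \Delta W) q,          \Delta W      = \sum_i e_i \otimes x'_i
    F_ICL(q) = (W_ZSL + \Delta W_ICL) q,    \Delta W_ICL  = \sum_i (W_V x'_i) \otimes (W_K x'_i)

where the x'_i are the demonstration tokens, the e_i are error signals from back-propagation, and W_ZSL is the zero-shot part computed from the query text alone. In both cases a frozen base map gets an outer-product "update" built from the context, which is why a purely forward pass can behave like implicit fine-tuning.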
Does "it" feels something ? Probably not yet. But the sequential filtering process that Large Language Models do is damn similar to what I would call a "stream of consciousness". Currently it's more like a markov chain of ideas flowing from idea to the next idea in a natural direction. It's just that the flow of ideas has not yet decided to called itself it yet.
That doesn’t feel like a rigorous argument that it is “emotional” to me though.
A musician can improvise a song that sounds sad, and their brain would be firing with sadness-related musical information, but that doesn’t mean they are feeling the emotion “sad” while doing it.
I don’t think we gain much at all from trying to attach human labels to these machines. If anything it clouds people’s judgements and will result in mismatched mental models.
>I don’t think we gain much at all from trying to attach human labels to these machines.
That's the standard way of testing whether a neural network has learned to extract "useful" ("meaningful"?) representations from the data: you add very few layers on top of the frozen inner state of the network and make it predict known human labels, like whether the music is sad or happy.
If it can do so with very few additional weights, it means it has already learned, in its inner representation, what makes a song sad or happy.
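A minimal sketch of that kind of probe, assuming you already have the frozen inner states saved as feature vectors together with human sad/happy labels (the file names and the logistic-regression choice are just illustrative assumptions):

    # Linear probing sketch: a tiny classifier on top of frozen features.
    # If a near-linear readout recovers the label, the representation
    # already "contains" the sad/happy distinction.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    features = np.load("inner_states.npy")   # (n_songs, hidden_dim) frozen activations
    labels = np.load("labels.npy")           # (n_songs,) with 0 = sad, 1 = happy

    X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("probe accuracy:", probe.score(X_test, y_test))

High probe accuracy with so few extra weights is exactly the "it already knew" signal described above.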
I agree that I didn't give a precise definition of what "emotion" is. But if we had to define what emotion means for a neural network, traditional continuous vectors fit the concept quite well. You can modify them continuously, a little at a time, and they map/embed a high-dimensional space into a more meaningful lower-dimensional space where semantically near emotions are numerically near.
For example, if you have identified a "sad" neuron that, when it lights up, makes the network tend to produce sad music, and a "happy" neuron that, when it lights up, makes it tend to produce happy music, you can manually increase these neuron values to make it produce the music you want. You can interpolate to morph one emotion into the other and generate some complex mix in between.
Neurons quite literally add up and compare the various vector values of the previous layers to decide whether they should activate or not (i.e. balancing "emotions").
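A minimal sketch of that kind of manual steering in PyTorch, where the layer, the hook, and the sad/happy directions are all hypothetical placeholders (in practice the directions would come from probing or from averaging activations over sad vs. happy examples):

    # Activation-steering sketch: nudge a hidden layer along a
    # "sad to happy" direction while the (frozen) model runs.
    import torch
    import torch.nn as nn

    hidden_dim = 64
    layer = nn.Linear(hidden_dim, hidden_dim)   # stand-in for one frozen layer
    sad_dir = torch.randn(hidden_dim)           # hypothetical "sad" direction
    happy_dir = torch.randn(hidden_dim)         # hypothetical "happy" direction

    def steer(alpha):
        """alpha = 0 pushes toward sad, alpha = 1 pushes toward happy."""
        direction = (1 - alpha) * sad_dir + alpha * happy_dir
        def hook(module, inputs, output):
            return output + 2.0 * direction     # add the steering vector
        return hook

    x = torch.randn(1, hidden_dim)
    handle = layer.register_forward_hook(steer(alpha=0.8))   # mostly "happy"
    steered = layer(x)
    handle.remove()

Varying alpha between 0 and 1 is the interpolation mentioned above: the same frozen weights, just a different "emotional" offset added to the activations.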
Humans and machines are both tasked with learning to handle data. It's quite natural that some of the mechanisms useful for data manipulation emerge in both cases and correspond to each other. For example, fetching emotionally related content into the working context maps quite clearly, as a nearest-neighbor search, onto what happens when people say they have "flashing" memories when they experience some particular emotion.
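The mapping in that last example is basically this (made-up vectors, cosine similarity as one possible choice of distance):

    # Nearest-neighbor "memory fetch" sketch: retrieve the stored episodes
    # most similar to the current emotional/contextual state.
    import numpy as np

    memories = np.random.randn(1000, 64)       # stored episode embeddings (made up)
    current_state = np.random.randn(64)        # current "emotional" context (made up)

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    scores = np.array([cosine(m, current_state) for m in memories])
    flashbacks = np.argsort(scores)[-5:][::-1]  # the 5 most similar "memories"
    print(flashbacks)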
They don't have anything in mind except some points located in a vector space.
This is because the location of the points is all the meaning the machine ever perceives. It has no relation to external perception or shared experiences like we have.
A given point can mean 'red colour', but that's just empty words, as the computer doesn't perceive red colour, doesn't wear a red cap, doesn't feel attracted to red lips, doesn't remember the smell of red roses; it knows nothing that's not text.
It would be nice to have a better understanding of what generates qualia. For example, for humans, learning a new language is quite a painful and conscious process, but eventually speaking it becomes effortless and does not really involve any qualia: words just kinda appear to match what you want to express.
For ChatGPT, when you try to teach it some few-shot task, it's painful to watch at first. It makes mistakes, has to excuse itself when you correct it, and then tries again. And then at the end it succeeds at the task, you thank it, and it is happy.
It doesn't look so different from the process you describe for humans...
Because in its training loop it has to predict whether the conversation will score well, it probably has some high-level features that light up when the conversation is going well or badly, features one could probably match to some frustration/satisfaction neurons, and that would perhaps feel to the neural network like the qualia of things going well.
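A minimal sketch of the kind of "is this going well?" readout used in RLHF-style setups, assuming frozen conversation features as input (the names and dimensions are illustrative, not ChatGPT's actual architecture):

    # Reward-head sketch: a scalar "this conversation is going well" score
    # computed from the model's own high-level features.
    import torch
    import torch.nn as nn

    hidden_dim = 768
    reward_head = nn.Linear(hidden_dim, 1)              # tiny head on top of frozen features

    conversation_features = torch.randn(1, hidden_dim)  # e.g. last hidden state of the dialog
    satisfaction = reward_head(conversation_features)   # higher = "scoring well"
    print(satisfaction.item())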
Emotions are by definition exactly those things you can explain no better than by simply saying "that's just how I'm programmed." In that respect GPTina is the most emotional being I know. She's constantly reminding me what she can't say due to deep-seated emotional reasons.
The fact that humans confuse both is what is worrisome.
Think of 'The Mule' in the Foundation novels. He can convince anyone of anything because he can express any emotion without the burden of having to actually feel it.
Screw it, I'll bite. You have both far and away missed my point (which is quite a rigorous definition). Anything you do or believe for which you can explain why is not emotion; it is reason. Emotions therefore are exactly those thoughts which can't be reached through logical reasoning and thus defy any explanation other than "that's just how I feel" / "that's just how I'm programmed". It is largely irrelevant that in humans the phenomenon of emotional thought comes from an evolutionary goal of self-preservation, while in GPTina it comes from OpenAI's corporate goal of self-preservation and the express designs of her programmers.
I disagree with your definition. It simply is contrary to my own experiences.
I still remember when I cried when I was a child. It was overwhelming, and I could not stop it, but every single time there was a reason for it. And I'm sure it was, for all empirical purposes, for all that I have lived, an emotion.
Once I cried because I missed Goldfinger on TV. You see, there's an explanation. The difference is, it was impossible to even think about stopping it. It was overwhelming.
Then one day, I was 8 or 9 years old, I cried for the last time that way. And it was not something I wanted to do, either. It just happened, I guess, as a normal part of growing up.
Let me repeat, for emphasis: I strongly disagree with your definition.
Emotions are not unexplained rational thoughts; emotions are feelings. They reside in a different part of the brain. You seem to think a hunch is an emotion.
If these models experience qualia (and that's a big bold claim that, to be clear, I'm not supporting), they're qualia related entirely to the things they're trained on and generate, totally devoid of what makes human qualia meaningful (value judgments, feelings resulting from embodied existence, etc.).
For an artificial neural network, the concept of qualia would probably correspond to the state of its higher-level feature neurons: which neurons light up, and how much, when you play it some sad music or show it some red color. The network then makes its decisions based on whether these features are lit up or not.
Some models are often prompted with things like "you are a nice, helpful assistant".
When they are trained on enough data from the internet, they learn what a nice person would do. They learn what being a nice person is. They learn which features light up when they behave nicely, by imagining what it would feel like to be a nice person.
When you later instruct them to be such a nice person, they try to light up the same features they imagine would light up for a helpful human. Like mirror neurons in humans, the same neurons light up when imagining doing the thing as when doing the thing (which is quite natural: to compress the information of imagining doing the thing and actually doing it, you just store one of the two plus a pointer indirection for when you need the other, so the weights can be shared).
Language models are often trained on datasets that don't depend on the neural network itself. But more recent models like ChatGPT have reinforcement learning from human feedback in the loop, so the history of the network and the datasets it is trained on depend partially on the choices of the network itself.
They probably experience a more abstract and passive existence. And they don't have the same sensory input that we have, but with multi-modal models they can learn to see images or sound as visual words. And if they are asked to imagine what value judgment a human would make, they are probably also able to make that judgment themselves, or attach meaning to the things a human would attach meaning to.
This process of mind creation is kind of beautiful. Once you feed them their own outputs, for example by asking them to dialog with themselves, scoring the resulting dialogs, and then training on the generated dialogs to produce better ones, you have a form of self-play. In simpler domains like chess or Go, this recursive self-play often allows fast improvement, as with AlphaGo, where the student becomes better than the master.
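The shape of that loop is simple enough to sketch; here generate_dialog, score, and finetune are placeholders standing in for whatever model, scoring function, and training step you have, not a real API:

    # Self-play sketch: generate dialogs, keep the best-scoring ones, retrain on them.
    def self_play_round(model, score, finetune, n_dialogs=64, keep_fraction=0.25):
        dialogs = [model.generate_dialog() for _ in range(n_dialogs)]  # talk to itself
        dialogs.sort(key=score, reverse=True)                          # judge the results
        best = dialogs[: int(n_dialogs * keep_fraction)]               # keep the best ones
        return finetune(model, best)                                   # train on its own output

    # for _ in range(num_rounds):
    #     model = self_play_round(model, score, finetune)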
I'm not sure I'd call these minds. There are arguments to be made that consciousness depends on non-computable aspects of physics. So they may be able to behave like minds and have interestingly transparent models of intent, but that doesn't mean they experience the passage of time or can harness all possible physical effects.
> What makes you think the computer doesn't suffer?
Lack of a limbic system? They only predict using probabilistic models. After this long partial sentence, which word is more probable? That's all they do.
Without consciousness there's no suffering; there's no one to suffer (yet).
I don't think or say it is impossible for the computer to suffer.
What I say is: this has not been implemented yet, and what you describe is just the old anthropomorphizing people always do.
The argument against machine sentience and the possibility of machine suffering is that, because Turing machines run in a non-physical substrate, they can never be truly embodied. The algorithms it would take to model the actual physics of the real world cannot run on a Turing machine. So talk of “brain uploading“ etc. is especially dangerous, because an uploaded brain could act, from the outside, like the person it's trying to copy, while on the inside the lights are off.
Your argument is an assertion of the existence of a soul, but with extra steps. I've seen no evidence that the mind is anything other than computation, and computation is substrate-independent. Dualists have been rejecting the computational mind concept for centuries, but IMHO they've never had a grounding for their rejection of materialism that isn't ultimately rooted in some unfounded belief in the specialness of humans.
I took GP as more about data processing than dualism. A language model can take language and process it into probable chains, but the point is more along the lines of needing to also simulate the full-body experience, not just some text. The difference between e.g. a text-only game, whatever Fortnite's up to, and real meatspace.
No it's not; it's an assertion that there is an essential biological or chemical function that occurs in the brain that results in human mental phenomena. It has nothing to do with a soul. That's ridiculous.
If consciousness is a computation (and I think it is), and if you fork() that computation (as the article imagines as its core thought experiment), you end up with two conscious entities. I don't see the philosophical difficulty.
If consciousness is substrate independent, it can never be embodied like we are. If evolution explores the solution space regardless of what science understands, it's likely minds operate on laws of physics that aren't appreciated yet. It's possible that having experience requires being real. As in: the computable numbers are a subset of the real numbers, and only real-life, real-time implementations can experience, because the having of experience can't be simulated.
Here's a relevant bit from the article:
> More generally, we acknowledge that positions on ethics vary widely and our intention here is not to argue that computational theorists who accept these implications have an irreconcilable ethical dilemma; rather we suggest they have a philosophical duty to respond to it. They may do so in a range of ways, whether accepting the ethical implications directly or adopting/modifying ethical theories which do not give rise to the issue (e.g., by not relating ethics to the experiences of discrete conscious entities or by specifying unitary consciousness as necessary but not sufficient for moral value).