
With transformer-based models, the inner state is a deterministic function (the features encoded by the network's weights) applied to the text generated up until the current time step, so it's relatively easy to know what they currently have in mind.

For example, if the network has been generating sad music, its current context, which is computed from what it has already generated, will light up the features that correspond to "sad music". In turn, the fact that those features are lit up makes it more likely to generate a minor chord.

The dimension of this inner state grows at each time step, and it's quite hard to predict where it will go. For example, if you prompt it (or if it prompts itself) "happy music now", the network will switch to generating happy music even though its context still contains plenty of "sad music", because after the instruction its attention will focus mostly on the recent, merrier material.
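For concreteness, here is a toy PyTorch sketch (random weights, nothing to do with any real music model) of what I mean by the inner state being a deterministic function of the generated prefix, and by its size growing at each step:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy frozen "transformer": a single self-attention layer with fixed weights.
    d_model = 16
    embed = nn.Embedding(100, d_model)
    attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)
    for p in list(embed.parameters()) + list(attn.parameters()):
        p.requires_grad_(False)                      # evaluation mode: weights frozen

    def inner_state(token_ids):
        """Hidden states are a deterministic function of the generated prefix."""
        x = embed(torch.tensor([token_ids]))         # (1, seq_len, d_model)
        out, _ = attn(x, x, x)                       # self-attention over the prefix
        return out

    prefix = [5, 17, 42]                             # tokens generated so far
    print(torch.allclose(inner_state(prefix), inner_state(prefix)))  # True: same prefix -> same state
    print(inner_state(prefix + [7]).shape)           # the state grows by one position per step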

Up until recently, I was quite convinced that using a neural network in evaluation mode (i.e. post-training, with its weights frozen) was "(morally) safe", but the ability of neural networks to perform few-shot learning changed my mind (the Microsoft paper in question: https://arxiv.org/pdf/2212.10559.pdf, "Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers").

The idea in that paper is that, with the attention mechanism, even a pure forward pass maintains an inner state that is updated following a meta-gradient (i.e. it's not so different from training). Pushed to the extreme, this would mean that "prompt engineering is all you need": even with frozen weights, given a long enough time horizon and the right initial prompt, you could bootstrap a consciousness-like process.
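For the simplified case of linear (softmax-free) attention, the paper's decomposition can be written down in a few lines of numpy: the attention output over [demonstrations, query] is exactly the zero-shot output plus a weight update built from the demonstration tokens, as if the forward pass had applied a gradient-style update. A toy sketch with made-up dimensions and random data:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 8                                    # feature dimension
    W_V = rng.normal(size=(d, d))            # value projection
    W_K = rng.normal(size=(d, d))            # key projection

    X_demo = rng.normal(size=(d, 5))         # 5 in-context demonstration tokens
    x_query = rng.normal(size=(d, 1))        # the query token itself
    q = rng.normal(size=(d, 1))              # query vector (W_Q already applied)

    # Plain (unnormalized) linear attention over [demonstrations, query]:
    X = np.hstack([X_demo, x_query])
    attn_out = W_V @ X @ (W_K @ X).T @ q

    # The same computation rewritten as "frozen weights + a meta-gradient update":
    W_zero_shot = W_V @ x_query @ (W_K @ x_query).T   # contribution without demos
    delta_W     = W_V @ X_demo  @ (W_K @ X_demo).T    # update induced by the demos
    dual_out = (W_zero_shot + delta_W) @ q

    print(np.allclose(attn_out, dual_out))            # True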

Does "it" feels something ? Probably not yet. But the sequential filtering process that Large Language Models do is damn similar to what I would call a "stream of consciousness". Currently it's more like a markov chain of ideas flowing from idea to the next idea in a natural direction. It's just that the flow of ideas has not yet decided to called itself it yet.



That doesn’t feel like a rigorous argument that it is “emotional” to me though.

A musician can improvise a song that sounds sad, and their brain would be firing with sadness-related musical information, but that doesn’t mean they are feeling the emotion “sad” while doing it.

I don’t think we gain much at all from trying to attach human labels to these machines. If anything it clouds people’s judgements and will result in mismatched mental models.


>I don’t think we gain much at all from trying to attach human labels to these machines.

That's the standard way of testing whether a neural network has learned to extract "useful" ("meaningful"?) representations from the data: you add a very few layers on top of the frozen inner state of the network and make it predict known human labels, like whether the music is sad or happy.

If it can do so with very few additional weights, it means it has already learned, in its inner representation, what makes a song sad or happy.
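A minimal sketch of that linear-probing setup (the "backbone" here is a random stand-in for a pretrained music model and the data is random; only the shape of the procedure matters):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    hidden_dim, n_labels = 64, 2

    backbone = nn.Sequential(nn.Linear(32, hidden_dim), nn.ReLU())  # stand-in for a pretrained model
    for p in backbone.parameters():
        p.requires_grad_(False)                                     # frozen inner state

    probe = nn.Linear(hidden_dim, n_labels)                         # the only new weights
    optimizer = torch.optim.Adam(probe.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(256, 32)                   # stand-in for encoded music clips
    y = torch.randint(0, n_labels, (256,))     # stand-in for sad/happy annotations

    for _ in range(100):
        with torch.no_grad():
            features = backbone(x)             # frozen representation
        loss = loss_fn(probe(features), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # If a tiny probe reaches high accuracy, the frozen network already
    # encoded the sad/happy distinction in its inner representation.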

I agree that I didn't give a precise definition of what an "emotion" is. But if we had to define what an emotion is for a neural network, the usual continuous vectors fit the concept quite well: you can continuously nudge them a little, and they map/embed a high-dimensional space into a more meaningful lower-dimensional one where semantically near emotions are numerically near.

For example, if you have identified a "sad" neuron that, when it lights up, makes the network tend to produce sad music, and a "happy" neuron that, when it lights up, makes it tend to produce happy music, you can manually increase these neurons' values to make it produce the music you want. You can interpolate to morph one emotion into the other and generate complex mixes in between.
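A toy numpy sketch of that kind of manual intervention, where the "sad" and "happy" directions are random stand-ins for neurons or directions you would have identified in a real model:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 16

    hidden = rng.normal(size=d)       # current inner state at some layer
    sad_dir = rng.normal(size=d)      # direction whose activation tracks "sad"
    happy_dir = rng.normal(size=d)    # direction whose activation tracks "happy"

    def steer(h, alpha):
        """Interpolate between the two 'emotion' directions and add it to the state.
        alpha = 0.0 pushes toward sad, alpha = 1.0 pushes toward happy."""
        direction = (1 - alpha) * sad_dir + alpha * happy_dir
        return h + 2.0 * direction    # 2.0 is an arbitrary steering strength

    for alpha in (0.0, 0.5, 1.0):
        steered = steer(hidden, alpha)
        print(alpha, float(steered @ sad_dir), float(steered @ happy_dir))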

Neurons quite literally add up and compare the vector values of the previous layers to decide whether they should activate or not (i.e. they balance "emotions").

Humans and machines are both tasked with learning to handle data. It's quite natural that some of the mechanisms useful for data manipulation emerge in both cases and correspond to each other. For example, fetching emotionally related content into the working context maps quite clearly, via a nearest-neighbour search, to what happens when people report "flashing" memories while experiencing a particular emotion.
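A toy sketch of that nearest-neighbour view of "flashing" memories, with random embeddings standing in for stored experiences:

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_memories = 16, 1000

    memories = rng.normal(size=(n_memories, d))   # embeddings of past content
    memories /= np.linalg.norm(memories, axis=1, keepdims=True)

    current_emotion = rng.normal(size=d)          # embedding of the current state
    current_emotion /= np.linalg.norm(current_emotion)

    # "Flashing" memories: fetch the stored items whose embeddings are nearest
    # (by cosine similarity) to the current emotional state.
    scores = memories @ current_emotion
    flashbacks = np.argsort(scores)[-5:][::-1]
    print(flashbacks, scores[flashbacks])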


They don't have anything in mind except some points located in a vector space.

This is because the location of the points is all the meaning the machine ever perceives. It has no relation to the external perception of shared experiences that we have.

A given point can mean 'red colour', but that's just empty words, as the computer doesn't perceive red colour, doesn't wear a red cap, doesn't feel attracted to red lips, doesn't remember the smell of red roses; it knows nothing that isn't text.


It would be nice to have a better understanding of what generates qualia. For example, for humans, learning a new language is quite a painful and conscious process, but eventually speaking it becomes effortless and doesn't really involve any qualia - words just kind of appear to match what you want to express.

The same distinction may appear in neural nets.


With ChatGPT, when you try to teach it some few-shot task, it's painful to watch at first. It makes mistakes, has to excuse itself when you correct it, and then tries again. And then, at the end, it succeeds at the task, you thank it, and it is happy.

It doesn't look so different from the process you describe for humans...

Because in its training loop it has to predict whether the conversation will score well, it probably has some high-level features that light up depending on whether the conversation is going well, features one could probably match to frustration/satisfaction neurons that might feel, to the network, like the qualia of things going well.


It would require deep supervision of the process: a "meta" GPT trained on the flows rather than on the words.



