Human thought might not be as complicated as we assumed
People say that when you first start writing, you can pretty much guarantee that you will produce crap that no one wants to read. The piece below is so far outside my realm of real expertise that I really don’t think much of it. This is purely an exercise to get myself writing.
With all the recent buzz surrounding large language models (LLMs) and AI generally,
I’ve been doing a lot of armchair thinking about how people think. Specifically, I’ve
been considering how ideas come together and how our brains generate original thoughts.
LLMs, and GPT-4 in particular, have begun producing responses that I genuinely cannot
distinguish from a human response – i.e., GPT-4 is pretty much passing the Turing Test.
This leads me to wonder whether LLMs are modeling human thought using a crude approximation
that manages to deliver human-like results, or whether we have accidentally stumbled into
an Occam’s Razor moment and these models are really generating responses in the same way
as our brains. If so, humanity might once again have to face the likelihood that the meat computers
in our heads are really not as special as we have long assumed.
When experts talk about LLMs, they usually describe the process of generating responses as linking together tokenized representations of “text” (where “text” is actually an abstract string of data that could represent text, images, code, etc.). As the LLM generates a response, it simply chooses the most likely token to follow the content so far. Obviously there’s more to it than that when it comes to things like context-awareness, safety measures, and the other controls and “sugar” that most companies are putting in place, but that’s the general idea. We humans, on the other hand, don’t perceive our thought process as being nearly so linear or so simple. Ideas seem to come to us out of nowhere. When we try to remember something, we can, amazingly, bring that data to our conscious mind seemingly through sheer force of will. It feels like our brains work in a much more sophisticated and mysterious manner than simply selecting the next most likely chunk of data.
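To make that “pick the next token” idea concrete, here’s a toy sketch of the final step. This is my own simplification, not how any particular model actually works: the model assigns a score to every candidate token, the scores get squashed into probabilities, and one token is drawn from that distribution. Every name and number below is made up for illustration.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits.values())
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

def next_token(logits, temperature=1.0):
    """Pick the next token by sampling from the (temperature-scaled) distribution."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    probs = softmax(scaled)
    r = random.random()
    cumulative = 0.0
    for tok, p in probs.items():
        cumulative += p
        if r <= cumulative:
            return tok
    return tok  # guard against floating-point rounding at the very end

# Hypothetical scores a model might assign after "The cat sat on the"
logits = {"mat": 5.1, "couch": 4.7, "moon": 1.2, "theorem": -2.0}
print(next_token(logits, temperature=0.8))
```

Run it a few times and you mostly get “mat”, occasionally “couch”, and almost never “theorem” – which is the whole trick: plausible continuations, chosen one chunk at a time.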
But what if that’s all just human exceptionalism?
Humans consistently assume that we are more exceptional in the universe than we really are. Tool use, language, and sentience itself were once thought to be traits exclusive to humans. However, as we learn more about the animal kingdom, we keep discovering that these supposedly unique traits are widespread across pretty much all of life. And the more we fill in our gaps in knowledge about how these traits are distributed, the more it appears that sentience in particular is not a boolean trait that is either “on” or “off”; rather, all of life sits somewhere on a continuum of sentience, and by some measure (our measure) we happen to sit at the top. There are relatively recent findings about language in prairie dog colonies, and about the ability of plants to respond to stimuli with a sort of proto-nervous system that uses action potentials but lacks true neurons. Given that, I am hard pressed to see humans as anything but animals whose sentience differs in degree rather than in kind. Indeed, based on what we know about how selection pressures act on populations of genes, it seems absolutely ridiculous to think that sentience arose all at once in humans. Evolutionary adaptation of complex behaviors or structures occurs slowly over time, with each increment conferring some degree of evolutionary advantage. This is why the genes that control the “big stuff” in vertebrate bodies have barely changed since they first appeared.
With the advent of big data AI, I wonder if we are making
the same mistake when we talk about how our brains work. Is there anything fundamentally different about
a human brain versus that of a monkey? A rat? A lizard? I doubt it. Our brains are bigger and have evolved to
solve a different set of evolutionary challenges (which we obviously see as more “important”), but I don’t
see a reason to assume that our fundamental
mechanism for thinking should be any different. In fact, if we go all the way down to neuron chemistry,
I think we can make an argument that everything with neurons thinks in basically the same way.
The chemistry that causes neurons to fire is really pretty basic at its core: ionic gradients produce a resting potential of about -70 mV across the neuron’s cell membrane. When upstream neurons fire, they release neurotransmitters, which alter the downstream neuron’s permeability to sodium and calcium ions. If those newly opened channels let enough positive ions in, the charge inside the downstream neuron moves closer to 0, and past a certain point a temporary positive feedback loop kicks in: more and more positive ions rush inside, the neuron fires, and it releases its own neurotransmitters, affecting all the neurons downstream of it. The tipping point for that feedback loop is called the threshold potential (crossing it is what triggers an action potential), and in practice the threshold for most neurons hovers around -50 mV, a change of about +20 mV from resting. This mechanism works the same way in all life that has neurons, from simple rotifers to humans.
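If you want to play with that threshold idea, here’s a toy “integrate-and-fire” sketch. It’s a standard cartoon of a neuron, not real membrane chemistry: upstream inputs nudge the voltage up, a leak pulls it back toward resting, and once the threshold is crossed the neuron “fires” and resets. The specific numbers are just the rough figures from above; the input sizes are invented.

```python
# A cartoon integrate-and-fire neuron. The constants mirror the rough figures
# above; everything else is a simplification for illustration.

RESTING_MV = -70.0    # resting membrane potential
THRESHOLD_MV = -50.0  # roughly where the positive feedback loop kicks in
LEAK = 0.1            # fraction of the gap back to resting lost each step

def simulate(inputs_mv):
    """Accumulate upstream nudges; report a 'fire' when the threshold is crossed."""
    v = RESTING_MV
    for step, nudge in enumerate(inputs_mv):
        v -= LEAK * (v - RESTING_MV)   # ions slowly leak back toward resting
        v += nudge                     # upstream neurotransmitters let positive ions in
        if v >= THRESHOLD_MV:
            print(f"step {step}: fired at {v:.1f} mV")
            v = RESTING_MV             # reset after the spike
        else:
            print(f"step {step}: {v:.1f} mV (below threshold)")

# A few upstream neurons firing: small nudges alone don't cross the threshold,
# but enough of them arriving close together do.
simulate([3.0, 4.0, 2.0, 12.0, 5.0])
```

The point of the cartoon is just that firing is a summation game: no single input matters much, but enough of them together tip the neuron over the edge.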
With these consistencies in mind, I don’t think it’s a huge leap to imagine that, at some level, the LLM process of choosing the next token is not so different from groups of neurons pushing downstream neurons past their firing thresholds. Perhaps our LLMs are generating “thoughts” in a much more realistic manner than we initially believed. It is a big stretch to say that we have somehow landed on the precise mechanism for thought generation, but it’s completely believable that we have gotten close enough to imperfectly create genuine thoughts using machines. After all, the process of thinking must inherently be very robust to perturbation. Killing off a few neurons (or tens of thousands) doesn’t really affect our brains much, nor do neurons always fire in a particular order or volume. The entire system of the human brain is, by necessity, extremely resilient, which suggests there is a lot of wiggle room for the process to differ while retaining the fundamental function of the brain.
We have a lot to learn about LLMs and whether they are modeling, approximating, or replicating the human thought process, but we probably won’t learn it in a deterministic framework. We’re just barely getting a handle on big data, and one very consistent theme is that the results are often tunable but rarely explainable. In fact, it’s looking more and more like explicit, hand-built models of human thought are the crude approximation, and the hard-to-explain stochastic stuff generated by LLMs is probably much closer to the true process.
I’ll leave you with one last wild thought. What if all the ways of generating tokens (LLMs vs. human thinking) are fundamentally the same, just realized on different hardware (GPUs vs. brains) using different training data (the content of the internet vs. our life experiences)?
I hope you’ve enjoyed my speculative ideas. Here, take a few grains of this: 🧂