Andy Clark in Time Magazine: Generative AI—think DALL-E, ChatGPT-4, and many more—is all the rage. Its remarkable successes, and occasional catastrophic failures, have kick-started important debates about both the scope and dangers of advanced forms of artificial intelligence. But what, if anything, does this work reveal about natural intelligences such as our own? I’m a philosopher and cognitive scientist who has spent their entire career trying to understand how the human mind works. Drawing on research spanning psychology, neuroscience, and artificial intelligence, my search has drawn me towards a picture of how natural minds work that is both interestingly similar to, yet also deeply different from, the core operating principles of the generative AIs. Examining this contrast may help us better understand them both.
The AIs learn a generative model (hence their name) that enables them to predict patterns in various kinds of data or signal. What “generative” means here is that they learn enough about the deep regularities in some data-set to be able to create plausible new versions of that kind of data for themselves. In the case of ChatGPT the data is text. Knowing about all the many faint and strong patterns in a huge library of texts allows ChatGPT to produce plausible new text of its own, sculpted by user prompts—for example, a user might request a story about a black cat written in the style of Ernest Hemingway. But there are also AIs specializing in other kinds of data, such as images, enabling them to create new paintings in the style of, say, Picasso.
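To make the “learn the regularities, then generate plausible new data” idea concrete, here is a deliberately tiny sketch in Python. It is only an illustration: a bigram model trained on an invented toy corpus, nothing like the scale or architecture behind ChatGPT, but it shows the same basic move of learning which items tend to follow which and then sampling new sequences from those learned statistics.

```python
# Toy illustration of the "generative" idea: learn statistical regularities
# from a small corpus, then sample plausible new sequences from them.
# This is a simple bigram model, not how ChatGPT actually works; the corpus
# is invented purely for illustration.
import random
from collections import defaultdict

corpus = ("the black cat sat on the mat . the black cat saw the dog . "
          "the dog ran away .").split()

# Learn which word tends to follow which (the regularities in the data).
followers = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    followers[current_word].append(next_word)

def generate(start="the", length=8):
    """Sample a plausible new word sequence from the learned statistics."""
    words = [start]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate())  # e.g. "the black cat saw the dog ran away ."
```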
What does this have to do with the human mind? According to much contemporary theorizing, the human brain has learnt a model to predict certain kinds of data, too. But in this case the data to be predicted are the various barrages of sensory information registered by sensors in our eyes, ears, and other perceptual organs. Now comes the crucial difference. Natural brains must learn to predict those sensory flows in a very special kind of context—the context of using the sensory information to select actions that help us survive and thrive in our worlds. This means that among the many things our brains learn to predict, a core subset concerns the ways our own actions on the world will alter what we subsequently sense. For example, my brain has learnt that if I accidentally tread on the tail of my cat, the next sensory stimulations I get will often include the sound of wailing, the sight of squirming, and occasionally a feeling of pain from a well-deserved retaliatory scratch.
This kind of learning has special virtues. It helps us separate cause from simple correlation. Seeing my cat is strongly correlated with seeing the furniture in my apartment. But neither one of these causes the other to occur. Treading on my cat’s tail, by contrast, causes the subsequent wailing and scratching. Knowing the difference is crucial if you are a creature that needs to act on its world to bring about desired (or to avoid undesired) effects. In other words, the generative model that issues natural predictions is constrained by a familiar and biologically critical goal—the selection of the right actions to perform at the right times. That means knowing how things currently are and (crucially) how things will change if we act and intervene on the world in certain ways.
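Here is an equally toy sketch of why intervention matters, in the same illustrative Python style (the mini-world, its probabilities, and all variable names are invented for this example). Observed passively, seeing the cat and seeing the furniture go together almost perfectly, yet neither causes the other; only by forcing the tail-treading action on or off, and watching what follows, does the genuine causal link to the wailing show up.

```python
# Toy sketch of separating cause from correlation by acting on the world.
# The "world" is invented: seeing the cat and seeing the furniture are merely
# correlated (both depend on being at home), while treading on the tail
# actually causes the wailing.
import random

def world(tread_on_tail=None):
    """Simulate one moment. If tread_on_tail is None it happens (rarely) by
    chance; otherwise the agent has intervened and set it directly."""
    at_home = random.random() < 0.5
    see_furniture = at_home
    see_cat = at_home and random.random() < 0.9
    if tread_on_tail is None:
        tread_on_tail = see_cat and random.random() < 0.1
    wailing = tread_on_tail          # the one genuine causal link
    return {"see_furniture": see_furniture, "see_cat": see_cat,
            "tread_on_tail": tread_on_tail, "wailing": wailing}

def conditional_rate(samples, outcome, given):
    """Estimate P(outcome | given) from a list of simulated moments."""
    relevant = [s for s in samples if s[given]]
    return sum(s[outcome] for s in relevant) / max(len(relevant), 1)

# Passive observation: cat and furniture look tightly linked (correlation).
observed = [world() for _ in range(10_000)]
print("P(see_cat | see_furniture) ~",
      conditional_rate(observed, "see_cat", "see_furniture"))

# Intervention: force the action and watch what changes (causation).
trodden = [world(tread_on_tail=True) for _ in range(10_000)]
spared = [world(tread_on_tail=False) for _ in range(10_000)]
print("P(wailing | do(tread))     ~", sum(s["wailing"] for s in trodden) / len(trodden))
print("P(wailing | do(not tread)) ~", sum(s["wailing"] for s in spared) / len(spared))
```

Run it a few times and the contrast is stable: the correlation between cat and furniture tells you nothing about what treading on the tail will do, while the intervention does.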
How do ChatGPT and the other contemporary AIs look when compared with this understanding of human brains and human minds? Most obviously, current AIs tend to specialize in predicting rather specific kinds of data—sequences of words, in the case of ChatGPT. At first sight, this suggests that ChatGPT might more properly be seen as a model of our textual outputs rather than (like biological brains) a model of the world we live in. That would be a very significant difference indeed. But that move is arguably a little too swift. Words, as the wealth of great and not-so-great literature attests, already depict patterns of every kind—patterns among looks and tastes and sounds, for example. This gives the generative AIs a real window onto our world. Still missing, however, is that crucial ingredient—action. At best, text-predictive AIs get a kind of verbal fossil trail of the effects of our actions upon the world. That trail is made up of verbal descriptions of actions (“Andy trod on his cat’s tail”) along with verbally couched information about their typical effects and consequences. Despite this, the AIs have no practical ability to intervene on the world—and so no way to test, evaluate, and improve their own world-model, the one making the predictions. More here.