Steve Nadis in Quanta: Imagine you had a friend who gave different answers to the same question, depending on how you asked it. “What’s the capital of Peru?” would get one answer, and “Is Lima the capital of Peru?” would get another. You’d probably be a little worried about your friend’s mental faculties, and you’d almost certainly find it hard to trust any answer they gave.
That’s exactly what’s happening with many large language models (LLMs), the ultra-powerful machine learning tools that power ChatGPT and other marvels of artificial intelligence. A generative question, which is open-ended, yields one answer, while a discriminative question, which requires choosing among given options, often yields a different one. “There is a disconnect when the same question is phrased differently,” said Athul Paul Jacob, a doctoral student at the Massachusetts Institute of Technology.
To make a language model’s answers more consistent — and make the model more reliable overall — Jacob and his colleagues devised a game where the model’s two modes are driven toward finding an answer they can agree on. Dubbed the consensus game, this simple procedure pits an LLM against itself, using the tools of game theory to improve the model’s accuracy and internal consistency.
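To make the idea concrete, here is a minimal toy sketch of that consensus dynamic, not the MIT team's actual algorithm (which plays a signaling game over a language model's own generative and discriminative scores and trains both players with no-regret learning). In this illustration, the candidate answers, the initial probabilities, and the regularization constant are all made-up values: two "players" start from their own distributions over a fixed set of answers and repeatedly take KL-regularized best responses, each rewarded for agreeing with the other while being anchored to its initial beliefs.

```python
import numpy as np

def softmax(x):
    z = np.asarray(x, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical candidate answers to "What is the capital of Peru?"
candidates = ["Lima", "Cusco", "Arequipa"]

# Assumed initial distributions from the two modes of the same model:
# the generator's open-ended preference and the discriminator's yes/no
# judgment for each candidate. These numbers are invented for illustration.
gen_init = softmax([2.0, 1.5, 0.2])     # generator is fairly unsure
disc_init = softmax([3.0, -1.0, -1.5])  # discriminator strongly favors "Lima"

gen_policy, disc_policy = gen_init.copy(), disc_init.copy()
lam = 0.2  # KL penalty: how tightly each player sticks to its initial beliefs

# Regularized best-response dynamics: each player maximizes its chance of
# agreeing with the other, minus a KL penalty toward its own prior.
# The closed-form update is softmax(log(prior) + payoff / lam).
for _ in range(100):
    gen_policy = softmax(np.log(gen_init) + disc_policy / lam)
    disc_policy = softmax(np.log(disc_init) + gen_policy / lam)

consensus = 0.5 * (gen_policy + disc_policy)
for answer, p in zip(candidates, consensus):
    print(f"{answer}: {p:.3f}")
# With these toy numbers, both modes converge on "Lima".
```

The point of the sketch is only the structure of the game: agreement is the shared payoff, and the regularization keeps each mode from simply capitulating to the other, so the equilibrium blends what both already believed.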
“Research exploring self-consistency within these models has been very limited,” said Shayegan Omidshafiei, chief scientific officer of the robotics company Field AI. “This paper is one of the first that tackles this, in a clever and systematic way, by creating a game for the language model to play with itself.”
“It’s really exciting work,” added Ahmad Beirami, a research scientist at Google Research. For decades, he said, language models have generated responses to prompts in the same way. “With their novel idea of bringing a game into this process, the MIT researchers have introduced a totally different paradigm, which can potentially lead to a flurry of new applications.”
More here.