How MIT Researchers Leveraged Game Theory for Better AI Accuracy


Game theory is a branch of mathematics with many real-world applications well beyond mathematics itself. Essentially, it describes how to play a strategic game: which moves are available to the players, in what order they move, and what each player knows at each decision point. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now leveraged game theory principles to develop the “Consensus Game”, which aims to enhance AI’s text comprehension and generation capabilities.

Before going further into the Consensus Game approach and what it means, let’s first understand the idea with an example. Assume you and a friend are playing a game in which the objective is to send each other hidden messages using only cryptic language. Your friend’s task is to guess the covert meaning of your sentences. Sometimes you provide direct clues; at other times, your friend must ask yes-or-no questions to deduce the meaning. The difficulty lies in confirming that you both understand one another and agree on the hidden message.

The researchers at MIT found they could significantly increase the AI’s capacity to provide accurate and coherent responses by viewing this interaction as a game in which two components of the AI cooperate under predetermined rules to agree on the right message. This game-like method improved the AI’s performance across a range of tasks, including reading comprehension, math problem solving, and dialogue.


One Approach

Large language models, or LLMs, can appear to reason much as humans do. They learn the patterns and relationships between words and phrases by applying statistical models to vast amounts of data. Usually, a large language model is queried in one of two ways. The first is generative querying, which generates an answer directly from the model. The second is discriminative querying, which uses the model to score a set of predefined answers. The two can produce inconsistent and occasionally contradictory results. Asked “Who is the president of the United States?”, the generative method might give a simple response like “Joe Biden”. The discriminative method, when assessing a candidate response such as “Barack Obama”, could wrongly contradict that answer.
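To make the mismatch concrete, here is a minimal Python sketch. The scores are invented for illustration and do not come from any real model; the names `generative_scores`, `discriminative_scores`, and `best_answer` are hypothetical. It only shows how the two ways of querying can end up preferring different answers to the same question.

```python
# Toy, made-up scores for the question "Who is the president of the United States?"
# A real system would obtain these numbers from a language model.

# Generative querying: how likely the model is to produce each answer directly.
generative_scores = {
    "Joe Biden": 0.55,
    "Barack Obama": 0.35,
    "George Washington": 0.10,
}

# Discriminative querying: how highly the model rates each predefined answer
# when asked to judge it.
discriminative_scores = {
    "Joe Biden": 0.42,
    "Barack Obama": 0.48,
    "George Washington": 0.10,
}


def best_answer(scores):
    """Return the candidate with the highest score."""
    return max(scores, key=scores.get)


generative_pick = best_answer(generative_scores)
discriminative_pick = best_answer(discriminative_scores)
methods_agree = generative_pick == discriminative_pick

print("Generative pick:    ", generative_pick)
print("Discriminative pick:", discriminative_pick)
print("Do the methods agree?", methods_agree)
```

With these assumed numbers the generative side prefers one answer and the discriminative side another, which is the kind of contradiction the Consensus Game is meant to resolve.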

So, how can researchers obtain sensible predictions when the two scoring methods do not agree with each other?

What is Unique in the Consensus Game Approach?

The Consensus Game system reaches equilibrium when the two parts agree, which is meant to keep the answer both accurate and faithful to the model’s original beliefs. To do this, the method repeatedly adjusts how the generative and discriminative parts interact until they settle on an answer that matches their starting beliefs and reflects reality. In this way, the method bridges the gap between the two querying approaches.
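The idea of two parts nudging each other toward agreement while staying close to their starting beliefs can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' actual algorithm: the update rule (a weighted geometric blend of each side's initial belief with the other side's current belief), the function names, the `anchor_weight` parameter, and all the scores are assumptions made up for this example.

```python
import math


def normalize(weights):
    """Rescale positive weights so they sum to 1."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def blend(anchor, other, anchor_weight):
    """Weighted geometric mean of `anchor` and `other`, renormalized.

    `anchor` is a side's own starting belief; `other` is the opposite
    side's current belief. All scores must be strictly positive.
    """
    return normalize({
        key: math.exp(
            anchor_weight * math.log(anchor[key])
            + (1 - anchor_weight) * math.log(other[key])
        )
        for key in anchor
    })


def toy_consensus(gen_init, disc_init, anchor_weight=0.5, rounds=50):
    """Repeatedly pull each side toward the other, anchored to its start."""
    gen, disc = dict(gen_init), dict(disc_init)
    for _ in range(rounds):
        # Both sides update at once, each looking at the other's last belief.
        gen, disc = (
            blend(gen_init, disc, anchor_weight),
            blend(disc_init, gen, anchor_weight),
        )
    return gen, disc


# Made-up starting beliefs in which the two sides disagree.
gen_init = {"Joe Biden": 0.55, "Barack Obama": 0.35, "George Washington": 0.10}
disc_init = {"Joe Biden": 0.42, "Barack Obama": 0.48, "George Washington": 0.10}

gen_final, disc_final = toy_consensus(gen_init, disc_init)
gen_choice = max(gen_final, key=gen_final.get)
disc_choice = max(disc_final, key=disc_final.get)
print(gen_choice, disc_choice)
```

In this toy setup the two sides start out preferring different answers and end up preferring the same one. A larger `anchor_weight` keeps each side closer to its starting belief, while a smaller one lets it move further toward the other side.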

Applying the Consensus Game method to language model querying in practice takes a lot of work, especially for question-answering tasks. For instance, on datasets like MMLU, which contain thousands of questions with multiple-choice answers, the model has to apply the method to every query. For each question and its possible answers, the generative and discriminative parts must then reach agreement. The system also struggled with math word problems of the kind set for school students. It could not come up with wrong answers, which matters for understanding how to arrive at the right one.


The Consensus Game approach has been tested on multiple tasks, including reading comprehension, mathematical problem-solving, and conversational dialogue. The tests demonstrated significant performance improvements. Notably, when paired with the LLaMA-7B model, the approach’s equilibrium-ranking (ER) algorithm surpassed the performance of larger models, showcasing the effectiveness of the Consensus Game methodology.

Implementing the Consensus Game, particularly for question-answering tasks, presents computational challenges. The primary challenge is that the generative and discriminative components must go through repeated consensus-building for every query. This requirement increases computational load and complexity.

The Road Ahead

The inspiration for this game theory-based approach partially originates from the AI agent “Cicero”, developed for the strategic board game Diplomacy. This game is all about negotiation and strategic planning using natural language, and the Consensus Game’s approach to AI interactions mirrors these very elements. Looking ahead, researchers plan to integrate this methodology with foundational AI models to achieve more factual and consistent outputs. This integration could enhance overall AI performance, making systems such as ChatGPT more reliable and accurate.

Furthermore, the Consensus Game design could significantly change how we decode language models. This game-theoretic method could open up new uses and make AI systems more reliable by encouraging more accurate and consistent content creation. Such progress could allow AI to interact with humans in more complex and reliable ways in many areas.

NOTE: The views expressed in this article are those of the author and not of Emeritus.

About the Author

Senior Researcher and Author, INDIAai Portal
With over 10 years of experience in research writing alongside a full-time Ph.D. in information technology and computer science, Dr. Nivash is a bit of a unicorn: a scientist who loves to write. His articles reflect not just his expertise in artificial intelligence but also his passion for technology and all the ethical questions it poses. Having worked with renowned publications like Analytics India Magazine and INDIAai, he is one of the leading voices in the fast-evolving universe of AI. When he is not neck-deep in research, Nivash is either road-tripping to the next destination or taking a shot at acting on stage, his one unrealized dream.

