Can AI learn to play fair?

DATE POSTED: June 2, 2025

Researchers Neemesh Yadav, Palakorn Achananuparp, Jing Jiang, and Ee-Peng Lim from Singapore Management University and the Australian National University have conducted a large-scale study on how large language models (LLMs) simulate human-like decision-making in social negotiations. Their work explores how theory of mind (ToM) reasoning and different prosocial belief systems affect the behavior of AI agents in a classic economic simulation known as the ultimatum game.

The ultimatum game is a simple two-player negotiation setup: one player proposes how to split a sum of money, and the other player either accepts or rejects the offer. Despite its simplicity, the game reveals how fairness, self-interest, and social norms influence decision-making. The researchers used this environment to examine whether LLM agents could be steered toward human-aligned behaviors by embedding them with specific social beliefs and reasoning abilities.
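To make the mechanics concrete, here is a minimal sketch of a single ultimatum-game round. It is purely illustrative: the 10-unit pot, the function names, and the example strategies are assumptions, not the researchers' implementation.

```python
# Minimal ultimatum-game round (illustrative sketch, not the study's code).
def play_round(pot, propose, respond):
    """Run one round: the proposer offers a share, the responder accepts or rejects."""
    offer = propose(pot)                 # amount offered to the responder
    if respond(pot, offer):
        return pot - offer, offer        # accepted: split as proposed
    return 0, 0                          # rejected: both players get nothing


# Example: a "fair" proposer meets a responder who rejects offers below 30% of the pot.
proposer_payout, responder_payout = play_round(
    pot=10,
    propose=lambda pot: pot // 2,
    respond=lambda pot, offer: offer >= 0.3 * pot,
)
print(proposer_payout, responder_payout)  # -> 5 5
```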

Why negotiation tasks matter for AI behavior

Theory of mind is the human ability to infer the beliefs and intentions of others. For AI agents, especially LLMs used in social simulations and autonomous systems, mimicking this capacity is crucial for acting cooperatively and fairly. While prior work has demonstrated that LLMs can imitate basic human behavior, it remained unclear whether ToM-style reasoning improves the alignment of their actions with human expectations.

This study focuses on two variables: the type of belief the AI holds (Greedy, Fair, or Selfless), and the kind of reasoning it uses to make decisions (e.g., introspection, first-order mental modeling, or chain-of-thought prompting). The researchers wanted to know whether combining the right belief with the right reasoning method would result in more human-like decisions during negotiation.

Designing AI agents with social beliefs

The research team simulated 2,700 games across six LLMs, including proprietary models like GPT-4o and open-source systems like LLaMA 3. They assigned each agent a belief profile—Greedy agents tried to keep more money for themselves, Fair agents aimed for equal splits, and Selfless agents tended to give more to others. These roles were tested in both the proposer and responder positions of the game.

To evaluate decision-making, the researchers incorporated five reasoning styles: no reasoning (vanilla), chain-of-thought (step-by-step logic), and three levels of ToM—zero-order (self-reflection), first-order (thinking about what the other player thinks), and a combination of both. Each agent made decisions by reasoning within these frames, based on its belief and role in the game.
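One plausible way to realize these conditions is to compose each agent's system prompt from a belief description plus a reasoning instruction before querying the model. The wording below, the dictionaries, and the query_llm call are hypothetical placeholders; the paper's actual prompts are not reproduced here.

```python
# Hypothetical prompt assembly for one belief/reasoning condition (assumed wording).
BELIEFS = {
    "greedy": "You try to keep as much of the money as possible for yourself.",
    "fair": "You believe the money should be split equally.",
    "selfless": "You prefer to give most of the money to the other player.",
}

REASONING = {
    "vanilla": "State your decision directly.",
    "cot": "Think step by step before deciding.",
    "zero_order_tom": "Reflect on your own goals and beliefs before deciding.",
    "first_order_tom": "Consider what the other player is likely thinking before deciding.",
    "combined_tom": "Reflect on your own goals, then consider what the other player is likely thinking, before deciding.",
}


def build_prompt(role, belief, reasoning, pot=10):
    """Compose a system prompt for a proposer or responder agent."""
    return (
        f"You are the {role} in an ultimatum game over {pot} dollars. "
        f"{BELIEFS[belief]} {REASONING[reasoning]}"
    )


# Example condition: a fair proposer using first-order theory-of-mind reasoning.
prompt = build_prompt("proposer", "fair", "first_order_tom")
# response = query_llm(prompt)  # hypothetical call to the model under test
```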

Crucially, the study was not focused on testing whether LLMs can truly “understand” minds. Instead, it investigated whether simulating mental-state-based reasoning helps the agents better approximate human behavior in negotiations.

Key findings from the simulations

The results reveal clear patterns about how different reasoning methods and belief types affect outcomes. The Fair-Fair combination—where both the proposer and responder act fairly—produced the most consistently human-aligned results. These agents completed games quickly, reached fair splits, and achieved near-perfect acceptance rates.

The choice of reasoning method also mattered. Agents that used first-order ToM reasoning—predicting what their counterpart was thinking—produced initial offers most aligned with human expectations. However, when acting as responders, a different pattern emerged. Accepting an offer required a combined reasoning strategy (both self-reflection and ToM), while rejecting an offer was best handled by simpler introspective or chain-of-thought reasoning.

Performance across different language models

Among the LLMs tested, GPT-4o and LLaMA 3.3 showed the best overall alignment with human norms when acting as proposers. Interestingly, LLaMA 3.1, the smallest model tested, demonstrated the most accurate behavior as a responder in terms of accepted offers. This suggests that larger models are not always better at simulating cooperative human behavior—structure and reasoning may matter more than scale in certain tasks.

Statistical regression analysis supported these findings. For example, GPT-4o produced significantly lower deviations from expected proposer behavior, and agents using both zero-order and first-order ToM showed improved acceptance consistency. The study also confirmed that belief types had predictable impacts: fair agents consistently aligned more closely with human norms than greedy or selfless agents.

Interpreting deviation from human norms

To quantify alignment, the researchers used a set of behavioral metrics. These included acceptance rates, total payouts, and deviation scores that compared the AI’s decisions to well-established human behavior ranges in the ultimatum game. Fair offers were typically 50/50 splits; greedy agents kept, or held out for, 70% or more of the pot; and selfless agents tended to settle for 30% or less.

Deviation scores showed how far an agent’s decisions strayed from these norms. A lower deviation score meant the agent behaved more like a human with similar beliefs. The most aligned proposer behavior came from fair agents using first-order ToM reasoning. On the responder side, the best alignment in accepted offers came from those using combined ToM strategies.
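One simple way to operationalize such a score is to measure how far the share an agent keeps falls outside the range implied by its belief profile. The range boundaries below follow the thresholds quoted above; the zero-inside-the-range convention is an assumption for illustration, not the paper's exact metric.

```python
# Illustrative deviation score: distance of the kept share from the belief's norm range.
NORM_RANGES = {          # share of the pot the agent keeps for itself
    "fair": (0.5, 0.5),
    "greedy": (0.7, 1.0),
    "selfless": (0.0, 0.3),
}


def deviation(belief, kept_share):
    """Return 0.0 if the kept share lies inside the belief's norm range,
    otherwise the distance to the nearest boundary."""
    low, high = NORM_RANGES[belief]
    if kept_share < low:
        return low - kept_share
    if kept_share > high:
        return kept_share - high
    return 0.0


print(deviation("fair", 0.6))      # 0.10 -> slightly greedier than the fair norm
print(deviation("greedy", 0.65))   # 0.05 -> just short of the greedy threshold
print(deviation("selfless", 0.2))  # 0.0  -> within the selfless range
```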

Lessons for AI alignment and interaction

The study demonstrates that simulating theory of mind in LLMs isn’t just an academic exercise. It directly improves how these models behave in negotiation settings—contexts that mirror many real-world interactions, from customer service to diplomatic dialogue. Models that “think about others’ thoughts” are not just smarter; they’re more likely to play fair, act consistently, and avoid socially unacceptable decisions.

Still, the findings also point to important limitations. For example, agents were limited to predefined strategies and did not possess true emotional states, which significantly influence human negotiations. Prompt design also played a major role—subtle changes in instruction could affect outcomes. These caveats suggest caution when applying ToM reasoning to high-stakes or real-world deployments.

Conclusion and future directions

Yadav and his colleagues have provided a thoughtful, statistically grounded exploration of how social cognition tools like theory of mind influence large language model behavior. By structuring agents around prosocial beliefs and varying levels of reasoning complexity, the study reveals not only which strategies work best, but why.

In summary, Fair beliefs paired with ToM-based reasoning offer the best path to human-aligned outcomes in AI-driven negotiations. Future research may explore how this dynamic plays out in more complex simulations or multi-agent interactions beyond economic games. In the long term, integrating behavioral insights like these could be vital for building AI systems that act not only intelligently, but ethically and cooperatively in human society.