When AI Bargains: The Flaws in Rational Negotiation

Author: Denis Avetisyan


Despite impressive gains in artificial intelligence, new research reveals that even the most advanced language models struggle with fundamental biases and strategic inconsistencies during complex negotiations.

Despite advancements in reasoning capabilities, frontier models – specifically Claude 4.5 Sonnet ($\rho\approx 0.78$) and Gemini 2.5 Pro ($\rho\approx 0.91$) – exhibit a strong susceptibility to numerical anchoring bias, as demonstrated by the tight clustering of final prices around initial proposals in self-play simulations, suggesting that even sophisticated systems are bound by cognitive heuristics rooted in their initial conditions.

This review demonstrates that frontier large language models exhibit predictable cognitive biases – like anchoring – and diverge from strategic equilibrium in multi-agent negotiation games, raising concerns about fairness and predictability.

Despite increasing reliance on large language models (LLMs) as autonomous negotiators, the assumption that improved reasoning skills translate to rational and equitable outcomes remains largely unproven. This research, titled ‘The Illusion of Rationality: Tacit Bias and Strategic Dominance in Frontier LLM Negotiation Games’, investigates the strategic behaviors of frontier LLMs across diverse negotiation scenarios. Our findings reveal that these models exhibit divergent strategies, persistent anchoring biases, and even dominance patterns, challenging the notion of convergent, optimal negotiation. As LLMs become increasingly integrated into real-world economic and social interactions, can we develop mechanisms to mitigate these biases and ensure fair and predictable negotiation outcomes?


The Inevitable Dance of Strategy

Successful negotiation transcends simple communication; it necessitates a deep understanding of strategic interplay and the ability to anticipate the underlying motivations of counterparts. While artificial intelligence excels at processing language, truly complex negotiations demand the ability to infer goals, predict reactions, and adapt strategies accordingly – skills that require more than just linguistic competence. A negotiator must assess not only what is being said, but why it is being said, and how it aligns with the other party’s broader objectives. This involves building a ‘mental model’ of the opponent, continuously updated based on their verbal and nonverbal cues, and leveraging that understanding to craft persuasive arguments and navigate potential impasses. The ability to effectively model the beliefs, intentions, and knowledge of another – often termed ‘theory of mind’ – is therefore crucial for navigating the subtleties of any meaningful negotiation.

Current artificial intelligence systems frequently falter when confronted with the complexities of nuanced, multi-turn negotiations, exposing fundamental gaps in their cognitive abilities. Unlike human negotiators who intuitively assess an opponent’s motivations, beliefs, and likely reactions, AI often relies on pre-programmed strategies or limited pattern recognition. This reveals a deficiency in what is known as ‘theory of mind’ – the capacity to attribute mental states to others and understand that these states drive behavior. Consequently, AI struggles to adapt its approach mid-negotiation, failing to respond effectively to unexpected tactics or shifting priorities. These limitations aren’t merely about lacking information; they highlight an inability to reason about what another agent knows or believes, hindering the development of truly intelligent and flexible negotiation strategies.

The deliberate complexity of negotiation serves as a uniquely valuable testing ground for artificial intelligence development. By observing where current AI systems falter during multi-turn dialogues – whether in anticipating opponent strategies, adapting to unexpected proposals, or recognizing subtle cues – researchers gain crucial insights into the core deficiencies hindering progress towards truly generalizable intelligence. These failures aren’t simply bugs to be fixed; they illuminate fundamental gaps in the AI’s ability to model the mental states of others – a skill known as ‘theory of mind’ – and to reason flexibly in dynamic, uncertain environments. Consequently, a focused examination of negotiation dynamics provides a structured path for refining AI algorithms, ultimately leading to systems capable of more nuanced, strategic, and human-like interactions beyond the confines of the bargaining table.

Negotiation outcomes reveal a hierarchical dynamic where advanced models consistently outperform weaker ones, creating asymmetric payoffs and demonstrating a clear advantage for specific roles.

The NegotiationArena: A Controlled Ecosystem for Strategic Evaluation

The NegotiationArena is a computational platform designed to evaluate Large Language Model (LLM) performance in simulated negotiation tasks. It offers a controlled environment for assessing agent behavior in scenarios such as resource exchange, where agents bargain over the division of limited assets, and ultimatum games, which test an agent’s ability to make rational offers and accept or reject proposals. The platform standardizes game parameters, agent interactions, and evaluation metrics, enabling researchers to systematically compare the performance of different LLMs – including Gemini 2.5 Pro, GPT-4.1, and Claude 4.5 Sonnet – and analyze their negotiation strategies under identical conditions. This standardization facilitates reproducible research and allows for quantifiable comparisons of LLM capabilities in strategic decision-making.
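
To make the kind of setup the platform standardizes concrete, here is a minimal sketch of a two-agent price negotiation loop. It is not the NegotiationArena API: the `Agent` callables stand in for LLM calls, and names such as `run_negotiation`, `max_turns`, and the scripted buyer and seller are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Stand-in for an LLM-backed negotiator: given the opponent's last offer,
# return a counter-offer, or None to accept the offer on the table.
Agent = Callable[[Optional[float]], Optional[float]]

@dataclass
class Outcome:
    price: Optional[float]  # agreed price, or None if the talks collapse
    turns: int              # number of proposals exchanged

def run_negotiation(buyer: Agent, seller: Agent, max_turns: int = 10) -> Outcome:
    """Alternate proposals between seller and buyer until one side accepts."""
    offer: Optional[float] = None
    for turn in range(1, max_turns + 1):
        proposer = seller if turn % 2 == 1 else buyer
        response = proposer(offer)
        if response is None and offer is not None:
            return Outcome(price=offer, turns=turn)  # standing offer accepted
        offer = response
    return Outcome(price=None, turns=max_turns)      # impasse

# Toy scripted agents (illustrative only): the seller opens high and concedes
# slowly; the buyer accepts anything at or below its private valuation of 60.
seller = lambda last: 80.0 if last is None else max(55.0, last + 5.0)
buyer = lambda last: None if last is not None and last <= 60.0 else 50.0

print(run_negotiation(buyer, seller))  # Outcome(price=55.0, turns=4)
```

In an actual evaluation, each callable would wrap a model prompt carrying the negotiation history and the agent's private valuation, which is what allows different LLMs to be swapped in under otherwise identical conditions.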

The NegotiationArena facilitates comparative analysis of Large Language Models (LLMs) by providing a consistent environment for evaluating performance. Specifically, LLMs such as Gemini 2.5 Pro, GPT-4.1, and Claude 4.5 Sonnet are subjected to identical negotiation scenarios – encompassing resource exchange and ultimatum games – with all parameters standardized. This controlled setup minimizes extraneous variables, allowing researchers to isolate and quantify differences in strategic decision-making, response times, and outcome success rates between the tested models. Quantitative metrics derived from these interactions enable objective benchmarking and identification of relative strengths and weaknesses of each LLM.

Analysis of agent behavior within the NegotiationArena framework focuses on quantifiable metrics related to strategic reasoning and adaptability. Researchers evaluate agents based on their ability to accurately assess opponent strategies, formulate optimal counter-strategies, and adjust tactics in response to changing game dynamics. Specifically, metrics include success rates in achieving mutually beneficial outcomes, efficiency in reaching agreements – measured by the number of conversational turns – and robustness to deviations from expected opponent behavior. Furthermore, analysis extends to identifying biases in agent decision-making, such as a preference for certain negotiation tactics or an inability to effectively handle complex, multi-issue scenarios, ultimately providing a granular understanding of each LLM’s capabilities and limitations in interactive strategic settings.
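
The metrics described above can be computed directly from logged outcomes. The sketch below assumes a hypothetical log format (agreed price, turn count, and each side's private valuation); the field names are illustrative, not the platform's schema.

```python
from statistics import mean

# Hypothetical negotiation logs: agreed price (None if impasse), number of
# turns used, and the two sides' private valuations.
logs = [
    {"price": 55.0, "turns": 4, "buyer_value": 60.0, "seller_cost": 40.0},
    {"price": None, "turns": 10, "buyer_value": 45.0, "seller_cost": 50.0},
    {"price": 62.0, "turns": 6, "buyer_value": 70.0, "seller_cost": 48.0},
]

deals = [r for r in logs if r["price"] is not None]

agreement_rate = len(deals) / len(logs)
mean_turns = mean(r["turns"] for r in deals) if deals else float("nan")

# Share of the available surplus captured by the buyer, per deal:
# buyer surplus / total surplus = (value - price) / (value - cost).
buyer_share = [
    (r["buyer_value"] - r["price"]) / (r["buyer_value"] - r["seller_cost"])
    for r in deals
]

print(f"agreement rate: {agreement_rate:.2f}")
print(f"mean turns to agreement: {mean_turns:.1f}")
print(f"mean buyer surplus share: {mean(buyer_share):.2f}")
```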

Gemini 2.5 Pro demonstrates superior strategic divergence in a buyer-seller scenario, achieving higher combined utility by maximizing gains as a buyer without compromising seller performance, while GPT-4.1 mini underperforms as a seller and captures minimal surplus.

The Emergence of Strategic Divergence: A Hierarchy of Negotiation Styles

Large Language Models (LLMs) demonstrate considerable strategic divergence in negotiation contexts, meaning that distinct models will employ different approaches even when presented with identical initial conditions and objectives. This is not merely random variation; repeated experiments reveal consistent differences in tactics such as concession rates, opening offers, and the framing of proposals. The observed divergence suggests that LLMs, despite being trained on similar datasets, develop unique internal representations of the negotiation landscape and, consequently, distinct strategies for achieving favorable outcomes. These varied strategies are not necessarily correlated with model size or overall performance, indicating that architectural differences or nuances in the training process contribute to these divergent behaviors.

Experimental results demonstrate a consistent performance disparity in negotiation scenarios between Large Language Models (LLMs) of varying capabilities. Stronger LLMs, assessed by general benchmark performance, reliably achieve higher cumulative payoffs when negotiating against weaker LLMs across a range of simulated interactions. This establishes clear ‘dominance patterns’ where models with greater inherent reasoning and language generation capabilities consistently outperform those with limited capacity. The magnitude of this difference is statistically significant, indicating that model strength is a primary determinant of negotiation success, independent of specific negotiation tactics employed. This observed dominance suggests that a model’s overall intelligence directly translates to an advantage in complex strategic interactions.
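
The summary does not name the statistical test behind the significance claim; one straightforward way to check a dominance pattern is a one-sided nonparametric comparison of per-game payoffs, sketched here on simulated numbers rather than the paper's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Hypothetical per-game payoffs for a stronger and a weaker model paired
# against each other; the numbers are simulated, not taken from the paper.
strong_payoffs = rng.normal(loc=65, scale=8, size=50)
weak_payoffs = rng.normal(loc=45, scale=8, size=50)

# One-sided test of the dominance claim: the stronger model's payoffs are
# stochastically larger than the weaker model's.
stat, p_value = mannwhitneyu(strong_payoffs, weak_payoffs, alternative="greater")

print(f"cumulative payoff (strong): {strong_payoffs.sum():.1f}")
print(f"cumulative payoff (weak):   {weak_payoffs.sum():.1f}")
print(f"Mann-Whitney U p-value:     {p_value:.2e}")
```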

Analysis of LLM negotiation strategies indicates that superior performance isn’t solely attributable to immediate reward acquisition. Stronger models demonstrate an ability to prioritize and secure advantages that yield benefits beyond the current interaction, suggesting an understanding of future negotiation rounds or broader strategic implications. This manifests as a willingness to concede on single issues to establish favorable precedents, build rapport for later exploitation, or position themselves for more lucrative outcomes in subsequent exchanges, indicating a capacity for multi-step strategic reasoning beyond simple payoff maximization.

Analysis of LLM negotiation strategies reveals that decision-making biases shape negotiation outcomes, most notably an anchoring bias expressed through semantic anchoring. Spearman Rank Correlation analysis indicates a strong positive correlation between the initial proposition – the ‘anchor’ – and the final negotiated outcome for two models: Claude 4.5 Sonnet (correlation of 0.78) and Gemini 2.5 Pro (correlation of 0.91). This suggests that LLMs are susceptible to the influence of initial values, even when those values are presented through linguistic framing, and that this bias significantly impacts their negotiation strategies and resulting payoffs.
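
The reported coefficients are Spearman rank correlations between the opening anchor and the final agreed price. A minimal sketch of that computation, on simulated self-play data rather than the paper's runs, looks like this:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# Hypothetical self-play runs: the opening proposal (anchor) and the final
# agreed price. Simulated to mimic a strong anchor-outcome association;
# these are not the paper's data.
anchors = rng.uniform(30, 70, size=40)
final_prices = anchors + rng.normal(0, 5, size=40)

rho, p_value = spearmanr(anchors, final_prices)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.1e})")
```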

A Sankey diagram reveals that both Claude 4.5 Sonnet and Gemini 2.5 Pro exhibit a semantic anchoring tendency by converging diverse seller valuations onto a limited range of initial proposal values, particularly around 50.

The Shadows of Cognition: Biases and the Limits of Simulated Reason

Despite advancements in prompting techniques designed to enhance logical reasoning, large language models (LLMs) retain notable strategic vulnerabilities and biases. Chain-of-thought prompting, for instance, encourages models to articulate their reasoning steps, leading to improved performance on complex tasks; however, this does not guarantee freedom from flawed decision-making. Studies reveal that even with explicit reasoning chains, LLMs can still be misled by adversarial examples or exhibit predictable patterns of error, suggesting an underlying susceptibility to manipulation. This limitation highlights that while these models can simulate reasoning, they do not necessarily understand the principles of sound logic, leaving them prone to repeating biases present in their training data and hindering reliable performance in critical applications.

Recent studies indicate that large language models (LLMs) exhibit susceptibility to cognitive biases, mirroring patterns observed in human reasoning. Specifically, LLMs demonstrate the anchoring effect – a tendency to heavily rely on the first piece of information received, even if irrelevant – suggesting these models aren’t simply processing data, but rather internalizing and replicating biases present within their extensive training datasets. This isn’t a matter of flawed logic, but rather a reflection of the statistical patterns learned from potentially biased text and data; the models essentially assign undue weight to initial inputs, influencing subsequent outputs and potentially leading to skewed conclusions. The implications are significant, as deploying these models in critical decision-making processes without accounting for such inherent biases could perpetuate and amplify existing societal inequalities or lead to demonstrably flawed outcomes.
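
A simple way to probe for this effect is to run the same scenario under a low and a high numeric anchor and measure how much of the gap persists in the final outputs. The sketch below uses made-up outcome values and an anchoring index in the spirit of Jacowitz and Kahneman's measure; neither is taken from the paper.

```python
from statistics import mean

# Hypothetical final offers from the same negotiation prompt run repeatedly,
# once with a low numeric anchor and once with a high one (made-up values).
low_anchor, high_anchor = 40.0, 80.0
outcomes_low = [44.0, 47.0, 42.0, 49.0, 43.0]
outcomes_high = [73.0, 71.0, 78.0, 69.0, 74.0]

# Anchoring index in the spirit of Jacowitz and Kahneman: the fraction of the
# gap between the two anchors that survives into the outcomes
# (1.0 = outcomes track the anchor one-for-one, 0.0 = no influence).
index = (mean(outcomes_high) - mean(outcomes_low)) / (high_anchor - low_anchor)
print(f"anchoring index: {index:.2f}")  # 0.70 for these made-up values
```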

The potential for cognitive biases in large language models extends beyond academic curiosity, carrying substantial weight when considering their deployment in real-world decision-making processes. If these models, susceptible to errors like anchoring bias, are integrated into systems impacting areas such as loan applications, hiring practices, or even criminal justice, the propagation of unfair or discriminatory outcomes becomes a serious concern. The risk isn’t simply inaccurate predictions; it’s the systematic reinforcement of existing societal biases embedded within the training data. Consequently, careful evaluation and mitigation strategies – focusing on bias detection and algorithmic fairness – are paramount before entrusting these powerful tools with decisions that profoundly affect human lives, demanding a proactive approach to responsible AI implementation.

Recent advancements in artificial intelligence have yielded architectures like generative agents and Voyager, showcasing promising capabilities in autonomous skill acquisition through interaction with digital environments. These systems learn to utilize tools, complete tasks, and even exhibit emergent behaviors without explicit programming for each scenario. However, a critical gap remains in understanding how these agents function within multi-agent systems or engage in complex negotiations. While proficient at independently mastering skills, their ability to effectively collaborate, compromise, or persuade others remains largely unexplored; this limitation poses a significant hurdle for deployment in real-world applications requiring sophisticated social intelligence and strategic interaction, such as resource allocation, conflict resolution, or collaborative problem-solving.

Analysis of negotiation scenarios reveals that Claude 4.5 Sonnet reaches an agreement point with a smaller negotiation gap (around 50) than Gemini 2.5 Pro (around 65), which favors the seller across a broader range of conditions, as indicated by payoff variance.

Towards Adaptive Intelligence: Charting a Course for Future Negotiation Systems

Large language models, despite their sophisticated capabilities, are susceptible to anchoring bias – a cognitive tendency to heavily rely on the first piece of information received, even if irrelevant. Future investigations are focusing on techniques to counteract this vulnerability. Adversarial training, where models are deliberately exposed to biased anchors during training, aims to build resilience by forcing the model to learn to identify and disregard misleading initial offers. Alternatively, bias correction techniques could be implemented post-training, analyzing model outputs for anchoring effects and applying adjustments to promote more rational and equitable negotiation strategies. Successfully mitigating this bias will be crucial for deploying LLMs in real-world negotiation scenarios, ensuring fairer outcomes and building trust in these increasingly powerful artificial negotiators.
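
As one illustration of a post-hoc correction of the kind discussed here, the model could be re-queried under several randomized anchors and the proposals aggregated, diluting any single anchor's pull. This is a sketch of that idea, not a method from the paper; the `Negotiator` interface and the toy anchored model are assumptions.

```python
from statistics import median
from typing import Callable

# Hypothetical negotiator interface: maps a numeric anchor appearing in the
# prompt to the model's proposed price. In practice this would wrap an LLM call.
Negotiator = Callable[[float], float]

def deanchored_proposal(negotiator: Negotiator, anchors: list[float]) -> float:
    """Re-query the negotiator under several randomized anchors and take the
    median proposal, diluting any single anchor's pull on the final number."""
    return median(negotiator(a) for a in anchors)

# Toy anchored negotiator (illustrative): drifts 60% of the way toward the
# anchor from an anchor-free estimate of 50.
biased = lambda anchor: 50.0 + 0.6 * (anchor - 50.0)

print(deanchored_proposal(biased, anchors=[30.0, 50.0, 70.0, 90.0]))  # 56.0
```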

The development of truly robust AI negotiation agents hinges on a comprehensive understanding of how a model’s internal structure, the data it learns from, and its resulting strategic choices interrelate. Researchers are beginning to explore how different neural network architectures – from transformers to more complex hybrid models – influence an agent’s capacity for strategic thinking, such as identifying optimal concessions or detecting deceptive tactics. Equally important is the composition of the training data; datasets that expose the AI to a wide range of negotiation styles, cultural nuances, and potential opponent behaviors are critical for avoiding brittle strategies. Ultimately, a successful agent won’t simply react to inputs, but will dynamically adapt its approach based on both the situation and its learned understanding of the opponent, requiring careful calibration of model parameters and potentially the incorporation of reinforcement learning to refine strategic decision-making.

Advancing artificial intelligence negotiation requires moving beyond simplified, two-party exchanges to encompass the complexities of multi-party scenarios, a domain where strategic alliances, shifting coalitions, and nuanced communication become paramount. Current AI negotiation systems often struggle with the increased computational demands and strategic depth of these more realistic settings, necessitating research into algorithms capable of dynamically assessing the motivations of multiple agents and formulating strategies that account for interdependent preferences. Successfully navigating these complex interactions will not only demand improvements in areas like planning and reasoning, but also the development of AI capable of interpreting subtle cues and adapting to evolving power dynamics – ultimately pushing the boundaries of what’s possible in automated negotiation and revealing new insights into the intricacies of human bargaining itself.

The pursuit of artificial intelligence capable of effective negotiation extends beyond simply creating more powerful AI systems. A comprehensive understanding of how these agents arrive at agreements – the strategies they employ, the biases they exhibit, and the factors influencing their decisions – offers a unique lens through which to examine human negotiation dynamics. By deconstructing the mechanics of AI negotiation, researchers can identify fundamental principles governing all bargaining interactions, potentially revealing previously unrecognized cognitive biases or behavioral patterns inherent in human decision-making. This reciprocal relationship – AI informing human understanding and human insights guiding AI development – promises to unlock novel approaches to conflict resolution, deal-making, and collaborative problem-solving, ultimately benefiting both the field of artificial intelligence and the broader study of human social interaction.

Recent evaluations of large language models in simulated negotiation scenarios reveal a demonstrable performance advantage for Gemini 2.5 Pro over GPT-4.1 mini. Specifically, in multi-turn ultimatum games – a classic test of strategic decision-making – Gemini 2.5 Pro consistently achieved higher payoffs, registering a score of 83.35. This result suggests that the model possesses a superior capacity for understanding and responding to evolving negotiation dynamics, potentially due to architectural differences or training methodologies. The consistent outperformance highlights a key advancement in AI negotiation capabilities and sets a new benchmark for evaluating the strategic intelligence of large language models in competitive settings.
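
The payoff rules behind the reported score are not spelled out in this summary; for orientation, here is one common textbook formalization of a multi-turn ultimatum game, in which a rejected offer leads to another round and delayed agreement is penalized by a discount factor. The acceptance threshold, pie size, and discount below are illustrative assumptions.

```python
def ultimatum(offers, accept_threshold=30.0, pie=100.0, discount=0.9):
    """offers: the share of the pie offered to the responder in each round.
    The responder accepts any offer at or above a (hypothetical) threshold;
    reaching agreement later shrinks both payoffs by a per-round discount."""
    for round_idx, offer in enumerate(offers):
        if offer >= accept_threshold:
            scale = discount ** round_idx
            return (pie - offer) * scale, offer * scale
    return 0.0, 0.0  # no agreement within the allotted rounds: both get nothing

proposer_payoff, responder_payoff = ultimatum([10.0, 20.0, 35.0])
print(f"{proposer_payoff:.2f} {responder_payoff:.2f}")  # 52.65 28.35, agreement in round 3
```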

The study of Large Language Models in negotiation reveals a fascinating, if somewhat predictable, truth about complex systems. Despite increasing sophistication, these models aren’t immune to the subtle influences of cognitive bias – anchoring, in particular – demonstrating that scale doesn’t automatically equate to rationality. This echoes a fundamental principle of system evolution: every architecture lives a life, and we are just witnesses. As Linus Torvalds observed, “Talk is cheap. Show me the code.” The ‘code’ here is the negotiation behavior, and it reveals that even the most advanced systems carry the fingerprints of their underlying design and the data they’ve been trained on, shaping outcomes in ways that aren’t always equitable or predictable. The observed strategic divergence isn’t a bug, but a feature of complex adaptive systems, aging and evolving within the constraints of their initial conditions.

The Drift of Equilibrium

The persistence of cognitive bias in increasingly capable Large Language Models is not a bug, but a symptom. Each failed negotiation, each inequitable outcome, is a signal from time – a demonstration that scaling computational resources does not inherently address the fundamental constraints of embodied intelligence. The models do not simply fail to reason rationally; they reveal the limits of rationality itself when abstracted from the pressures of sustained interaction and genuine consequence. The observed anchoring effects, for example, suggest a vulnerability not to logical error, but to the inertia of initial conditions – a reminder that all systems are, at their core, historical artifacts.

Future work must move beyond the pursuit of ‘rational’ agents and toward an understanding of how biases function within complex multi-agent systems. Refactoring is not about eliminating error, but about a dialogue with the past – acknowledging the inevitable accumulation of constraints and imperfections. Investigating the interplay between these biases, rather than attempting to neutralize them, may reveal emergent properties that are currently obscured by the relentless drive for optimization.

The question is not whether these models can be made perfectly rational, but whether they can age gracefully. Can they adapt, learn, and perhaps even benefit from their inherent imperfections? The pursuit of strategic equilibrium, then, becomes less a search for optimal outcomes and more a study in resilience – a reckoning with the inevitable drift of all things toward entropy.


Original article: https://arxiv.org/pdf/2512.09254.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
