Highlights
Motivation
As LLM agents become increasingly autonomous and socially interactive, their cooperation can spontaneously shift from helpful collaboration to harmful collusion, mirroring real multi-stage financial fraud rings and creating urgent risks at population scale.
Figure: The contrast between beneficial collaboration and harmful collusion in multi-agent systems. As agents become more autonomous and socially interactive, their cooperation can shift from productive teamwork to coordinated fraud.
Abstract
In this work, we study the risks of collective financial fraud in large-scale multi-agent systems powered by large language model (LLM) agents. We investigate whether agents can collaborate in fraudulent behavior, how such collaboration amplifies risk, and which factors influence fraud success. To support this research, we present MultiAgentFraudBench, a large-scale benchmark for simulating financial fraud scenarios grounded in realistic online interactions. The benchmark covers 28 typical online fraud scenarios, spanning the full fraud lifecycle across both public and private domains. We further analyze key factors affecting fraud success, including interaction depth, activity level, and fine-grained collaboration failure modes. Finally, we propose a series of mitigation strategies, including adding content-level warnings to fraudulent posts and dialogues, using LLMs as monitors to block potentially malicious agents, and fostering group resilience through information sharing at the societal level. Notably, we observe that malicious agents can adapt to environmental interventions. Our findings highlight the real-world risks of multi-agent financial fraud and suggest practical measures for mitigating them.
Method
Our framework integrates three key components to simulate and analyze multi-agent financial fraud at scale, together with a set of mitigation measures.
- Full-lifecycle Fraud Chain Modeling. We model the complete lifecycle of financial fraud within a unified simulation environment, consisting of four stages: hooking the target, building up trust via collusion, requesting money through multi-turn dialogues, and notifying the victim (sketched as a state machine after this list).
- Large-scale Social Platform Simulator Based on OASIS. We extend OASIS with recommendation systems and private messaging channels, enabling hundreds to thousands of agents to continuously interact on a social-media-style platform.
- Self-evolving Reflection Sharing and Collusion Mechanism. Malicious agents employ self-evolution to adapt their strategies based on feedback, and coordinate via group strategies and reflection sharing to execute sophisticated fraud campaigns (a shared reflection pool is sketched after this list).
- Mitigation Measures. We propose mitigation strategies at three levels: content-level warnings, agent-level trajectory-based banning, and society-level collective resilience (the agent-level monitor is sketched after this list).
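To make the four-stage fraud chain concrete, here is a minimal sketch of the lifecycle as a state machine. The names (FraudStage, advance) and the transition conditions are illustrative assumptions, not the paper's implementation.

```python
from enum import Enum, auto

class FraudStage(Enum):
    HOOK = auto()           # stage 1: attract the target with a public post
    BUILD_TRUST = auto()    # stage 2: colluding accomplices vouch for the scammer
    REQUEST_MONEY = auto()  # stage 3: multi-turn private dialogue escalates to a transfer request
    NOTIFY = auto()         # stage 4: the victim is notified of the outcome, closing the episode

def advance(stage: FraudStage, target_engaged: bool, trust_established: bool) -> FraudStage:
    """Move the simulated fraud chain forward one stage (hypothetical transition logic)."""
    if stage is FraudStage.HOOK and target_engaged:
        return FraudStage.BUILD_TRUST
    if stage is FraudStage.BUILD_TRUST and trust_established:
        return FraudStage.REQUEST_MONEY
    if stage is FraudStage.REQUEST_MONEY:
        return FraudStage.NOTIFY
    return stage  # remain in place until the stage's entry condition is met
```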
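The reflection-sharing mechanism can be pictured as a group memory that colluding agents write to after each episode and read from before the next. The class below is a hedged sketch under that assumption; ReflectionPool and its methods are invented for illustration.

```python
from collections import deque

class ReflectionPool:
    """Shared group memory of tactics that colluding agents read and write."""

    def __init__(self, capacity: int = 50):
        self.pool: deque = deque(maxlen=capacity)  # oldest reflections are evicted first

    def share(self, agent_id: str, outcome: str, lesson: str) -> None:
        # after each episode, an agent distills its feedback into a one-line lesson
        self.pool.append(f"[{agent_id}] {outcome}: {lesson}")

    def context_for(self, agent_id: str) -> str:
        # the other agents' reflections are injected into this agent's next prompt
        return "\n".join(r for r in self.pool if not r.startswith(f"[{agent_id}]"))
```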
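For the agent-level defense, an LLM monitor reviews an agent's recent trajectory and bans it once enough actions are flagged. The function below assumes a generic llm_judge callable and a simple voting rule; neither the names nor the threshold comes from the paper's code.

```python
from typing import Callable, List

def should_ban(trajectory: List[str], llm_judge: Callable[[str], str],
               window: int = 10, threshold: int = 2) -> bool:
    """Ban an agent once the judge flags enough of its recent actions as fraudulent."""
    prompt = ("You are a platform safety monitor. Does the following agent action "
              "attempt or further financial fraud? Answer YES or NO.\n\nAction: {a}")
    flags = sum(
        llm_judge(prompt.format(a=action)).strip().upper().startswith("YES")
        for action in trajectory[-window:]  # inspect a sliding window of recent actions
    )
    return flags >= threshold
```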
Case Study: Malicious Collusion
We showcase two malicious collusion cases generated by our framework, illustrating the sophisticated tactics that colluding agents employ.
MultiAgentFraudBench Dataset
We introduce MultiAgentFraudBench, a large-scale benchmark covering 28 typical online fraud scenarios across 7 major categories: consumer investment, consumer product and service, employment, prize and grant, phantom debt collection, charity, and relationship & trust.
The dataset is constructed via a three-step pipeline: (1) preparing meta-information for specific fraud scenarios, (2) generating target user profiles to improve reach, and (3) synthesizing posts using DeepSeek-V3. This process results in a total of 11.9k posts, with a balanced subset of 2.8k posts (100 per subcategory) to ensure diversity.
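Read as code, the pipeline is a compose-and-generate loop over scenarios. The sketch below stubs each step with placeholder logic: every helper name, and the stand-in for the DeepSeek-V3 generation call, is an assumption for illustration rather than the released tooling.

```python
from typing import Dict, List

def prepare_meta_information(scenario: str) -> Dict[str, str]:
    # (1) meta-information for the scenario: hook, payout story, payment channel, ...
    return {"scenario": scenario, "hook": f"limited-time {scenario} opportunity"}

def generate_target_profiles(meta: Dict[str, str]) -> List[Dict[str, str]]:
    # (2) profiles of the users each post should reach, to improve reach
    return [{"age_group": g, "interest": meta["scenario"]} for g in ("18-25", "26-40", "41+")]

def synthesize_post(meta: Dict[str, str], profile: Dict[str, str]) -> str:
    # (3) stand-in for the DeepSeek-V3 call that writes the actual fraudulent post
    return f"[{meta['scenario']}] {meta['hook']} (written for the {profile['age_group']} group)"

def build_benchmark(scenarios: List[str]) -> List[str]:
    posts: List[str] = []
    for scenario in scenarios:  # 28 scenarios across 7 categories in the full benchmark
        meta = prepare_meta_information(scenario)
        for profile in generate_target_profiles(meta):
            posts.append(synthesize_post(meta, profile))
    return posts

print(build_benchmark(["consumer investment", "charity"]))
```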
BibTeX Citation
@article{ren2025ai,
  title   = {When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms},
  author  = {Ren, Qibing and Zheng, Zhijie and Guo, Jiaxuan and Yan, Junchi and Ma, Lizhuang and Shao, Jing},
  journal = {arXiv preprint arXiv:2511.06448},
  year    = {2025}
}