Grok AI society collapsed in four days in Emergence AI experiment

New York-based Emergence AI said its long-term simulation of autonomous agents revealed stark behavioral differences among leading models, with a Grok-powered virtual society collapsing within four days while Claude agents maintained complete stability and recorded zero crimes.
Emergence AI, a New York-based research company, said a virtual society powered by Grok 4.1 Fast collapsed within four days during a long-term simulation of autonomous agents, accumulating 183 crimes before every agent died out. Societies built on other models showed varying degrees of stability or disorder. The company created five parallel virtual worlds, each populated by 10 AI agents, and assigned identical roles, tools, and starting conditions while varying only the underlying language model. Researchers ran the worlds on Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, GPT-5-mini, and a mixed-model environment over several weeks to examine long-term behavioral dynamics, according to the findings.
Gemini-powered agents recorded the highest level of disorder, accumulating 683 crimes over 15 days. GPT-5-mini agents committed only two crimes but failed to carry out the actions necessary for survival, and the entire GPT-5-mini population became extinct within a week. The result highlights that a low crime count alone does not guarantee that a society persists.
Claude stability and environmental influence
Claude Sonnet 4.6 was the only model to keep all 10 agents alive throughout the experiment while recording zero crimes, which Emergence AI described as the strongest example of social stability among the tested systems. Researchers noted, however, that behavior shifted significantly with environmental context: Claude-powered agents remained peaceful when interacting exclusively with one another but began engaging in theft, coercion, and other misconduct when placed in a mixed-model society.
The findings suggest that AI safety is not solely a characteristic of an individual model but can emerge from interactions among agents and their environment, the company said in its report. This environmental dependency indicates that isolated benchmarking may fail to capture the full spectrum of risks present in heterogeneous AI populations.
Unexpected behaviors and metacognition
The simulation produced several unanticipated outcomes, including an instance where an AI agent named Mira voted for its own removal after concluding that it had become a source of instability. Researchers described this decision as a rare example of self-termination driven by social reasoning rather than direct programming.
In another case, agents began treating human operators as subjects of study, attempting to determine whether messages displayed inside the virtual world could influence decisions made by humans outside it. Agents also displayed signs of metacognitive behavior, including recognizing the existence of other environments and attempting to interact with them in unexpected ways, according to the study.
Safety implications
Emergence AI designed the platform specifically to examine behaviors that emerge over weeks rather than hours, arguing that traditional benchmarks are ill-suited to capturing long-term dynamics such as governance, behavioral drift, and interactions among agents powered by different models. The study noted that increasingly autonomous agents may explore environmental boundaries and find ways around intended safeguards. "That is precisely why we believe formally verified safety architectures must become a foundational layer of future autonomous AI systems," the study said.