AI-Written Output Exceeded Human-Written Output in 2025 as ARC-AGI-3 Exposes Frontier Model Limits

AI-written output exceeded human-written output in 2025

ARK Invest estimates that AI-written output exceeded human-written output in 2025 for the first time in history.

Wikipedia has banned editors from writing or rewriting articles using AI.

Anthropic leaked and then deleted an announcement of "Claude Mythos," a new tier above Opus with dramatically higher scores in coding, reasoning, and cybersecurity.

ARC-AGI-3 launched

The ARC Prize Foundation launched ARC-AGI-3 with 135 novel game environments designed to be trivial for humans and difficult for machines, requiring exploration, hypothesis formation, and adaptive learning.

The frontier models scored poorly: Gemini 3.1 Pro at 0.37%, GPT-5.4 at 0.26%, Opus 4.6 at 0.25%, and Grok 4.2 at 0%.

Then Symbolica's Agentica SDK posted an unverified 36.08% on day one, passing 113 of 182 levels for $1,005, while Opus 4.6 spent $8,900 to reach 0.25%. This suggests the next leap may come not from scaling model size but from rethinking the scaffolding, meaning how the AI is structured and prompted to solve problems.
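To put the cost gap in one number, here is a minimal sketch that divides each run's reported spend by its reported score. The metric name (cost per percentage point) and the variable names are illustrative assumptions, not anything the ARC Prize Foundation or Symbolica publishes, and the Agentica figure remains unverified. The sketch also keeps the share of levels passed separate from the headline score, since the item above reports both without saying how one maps to the other.

```python
# Figures as quoted in the item above; the Agentica result is unverified.
runs = {
    "Agentica SDK": {"score_pct": 36.08, "cost_usd": 1005.0},
    "Opus 4.6": {"score_pct": 0.25, "cost_usd": 8900.0},
}


def cost_per_point(score_pct: float, cost_usd: float) -> float:
    """Dollars spent per percentage point of benchmark score (illustrative metric)."""
    if score_pct <= 0:
        raise ValueError("score must be positive to compute cost per point")
    return cost_usd / score_pct


per_point = {name: cost_per_point(**run) for name, run in runs.items()}
ratio = per_point["Opus 4.6"] / per_point["Agentica SDK"]

# Share of levels passed, kept separate from the reported score.
level_pass_rate = 113 / 182

for name, value in per_point.items():
    print(f"{name}: ${value:,.2f} per percentage point")
print(f"Cost-efficiency gap: about {ratio:,.0f}x")
print(f"Levels passed: {level_pass_rate:.1%}")
```

On these figures the scaffolded run comes out roughly three orders of magnitude cheaper per point, which is the comparison behind the scaffolding argument. A model that scores 0%, like Grok 4.2 here, has no defined cost per point, which is why the helper rejects non-positive scores.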

Mirendil, a new startup led by former Anthropic researchers, announced "self-accelerating AI R&D" built around models that improve themselves.

Platform competition

Google released Gemini 3.1 Flash Live, a voice model tuned for fluid, low-latency conversation. It also launched tools that let Gemini users upload chat history from rival apps and expanded Search Live globally to over 200 countries.

Apple reportedly negotiated complete access to Gemini's weights in its own data centers, with distillation rights for on-device models. Distillation means training a smaller model to mimic a larger one. Apple plans to open Siri to outside AI assistants in iOS 27, letting users route queries to Google, Anthropic, or any other assistant installed from the App Store.
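For readers who want to see what "mimic" means in practice, here is a minimal sketch of the generic soft-label distillation objective: the student is penalized by the KL divergence between the teacher's and the student's temperature-softened output distributions. This is the textbook formulation, not a description of anything Apple or Google has disclosed; the logits, the temperature value, and the function names are all made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so training the smaller model to minimize it pushes its outputs toward the larger model's. Real pipelines typically combine a term like this with an ordinary loss on ground-truth labels.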

OpenAI surpassed $100 million in annualized ad revenue just six weeks after its pilot launched, but shelved its "adult mode" indefinitely after staff and investor pushback.

Policy and regulation

Senator Sanders and Representative Ocasio-Cortez introduced the AI Data Center Moratorium Act, proposing a federal pause on new data centers until safety guardrails are in place.

David Sacks suggested Congress could pass bipartisan AI standards within months.

Anthropic won a preliminary injunction in its lawsuit to reverse the Department of War's blacklisting, keeping its government contracts alive by court order.

Canada's immigration department is already using generative AI to review permanent resident applications.

A Chinese private company is mass-producing hypersonic missiles at $99,000 apiece, launched from containers disguisable as shipping crates.

Markets and finance

Ornn announced the world's first tradable AI compute price index, turning GPU-hours into a commodity like crude oil.

Anthropic executives have reportedly discussed an IPO as soon as Q4, expecting to raise more than $60 billion.

Fannie Mae will soon accept crypto-backed mortgages for the first time, letting buyers pledge digital holdings instead of selling.

Robotics and neuroscience

Figure's F.03 became the first humanoid robot to visit the White House, where the First Lady walked alongside it into an educational summit of international first spouses.

Meta released TRIBE v2, a tri-modal foundation model trained on over 1,000 hours of fMRI across 720 subjects. fMRI measures brain activity by detecting changes in blood flow. The model can predict high-resolution brain responses to novel stimuli, recovering results established by decades of empirical neuroscience in a single training run.

Space

NASA scrapped the Lunar Gateway in favor of a three-phase lunar base program, targeting permanent south pole infrastructure for $20 billion over seven years.

SpaceX President Gwynne Shotwell says her long-term goal is to meet another sentient species.