Sunday, April 27, 2025

The Man Out to Prove How Dumb AI Still Is - The Atlantic

Excerpt: “the ARC Prize team released an updated test, called ARC-AGI-2, and it appears to have sent the AIs back to the drawing board. The full o3 model has not yet been tested, but a version of o1 dropped from 32 percent on the original puzzles to just 3 percent on the new version, and a ‘mini’ version of o3 currently available to the public dropped from roughly 30 percent to below 2 percent. (An OpenAI spokesperson declined to say whether the company plans to run the benchmark with o3.) Other flagship models from OpenAI, Anthropic, and Google have achieved roughly 1 percent, if not lower. Human testers average about 60 percent.”

Source: theatlantic.com