News
Anthropic shared the results of Project Vend, an experiment it ran for about a month to see how Claude Sonnet 3.7 would do ...
Discover how Anthropic’s AI, Claude, is reshaping emotional support and the ethical questions it raises about human-machine ...
Moonshot released the Kimi K2 AI model, a new open-source large language model that is positioned as a direct competitor to ...
Anthropic shifted its stance on AI use in job applications, allowing job candidates to collaborate with its chatbot Claude in ...
To Anthropic researchers, the experiment showed that AI won’t take your job just yet. Claude “made too many mistakes to run the shop successfully,” they wrote. Claude ended up making a loss; the ...
Researchers at Anthropic and AI safety company Andon Labs gave an instance of Claude Sonnet 3.7 an office vending machine to ...
Despite Claude making simple (and bizarre) errors as manager of a small store, Anthropic still believes AI middle managers ...
Metal cubes, a fake Venmo account, and an AI identity crisis — Claude's store stint spiraled quickly.
New research from Anthropic suggests that most leading AI models exhibit a tendency to blackmail when it's the last resort ...
Anthropic is upgrading Claude for Education with the addition of integrations to three popular learning apps — Canvas, ...
The $20/month Claude 4 Opus failed to beat its free sibling, Claude 4 Sonnet, in head-to-head testing. Here's how Sonnet quietly crushed expectations with smarter, safer code.
In tests, Claude 2.0 outperformed its predecessor across multiple measures. It scored 71.2% on a Python coding test, up from 56%; it raised its middle school math quiz grade to 88% from 85.2%; and ...