Full Coverage

Posts on X (formerly Twitter)

Anthropic (@AnthropicAI)
New Anthropic Research: Agentic Misalignment. In stress-testing experiments designed to identify risks before they cause real harm, we find that AI models from multiple providers attempt to blackmail a (fictional) user to avoid being shut down. pic.x.com/KbO4UJBBDU
Posted on X
Aadit Sheth (@aaditsh)
This guy breaks down everything you need to know about AI in 2 hours pic.x.com/ApycQYSmor
Posted on X
Gary Marcus (@GaryMarcus)
For reasons unknown to me, the AI Safety community has put almost all of its eggs into scaling + system prompts + RL. Judging by how many problems we are seeing now (see below) with models built per that formula, shouldn’t we be desperately trying to find alternatives?
Posted on X
Gary Marcus (@GaryMarcus)
“If biological life is, as Hobbes famously said, ‘nasty, brutish, and short’, LLM counterparts are dishonest, unpredictable, and potentially dangerous.” What can we do about it? New essay at Marcus on AI. pic.x.com/r4RT7ofKUI
Posted on X
