What's new in Article Factory and the latest in the world of generative AI

Anthropic Finds LLMs’ Self‑Introspection Highly Unreliable

Anthropic’s recent tests show that even its most advanced language models, Opus 4 and Opus 4.1, struggle to reliably identify internally injected concepts. The models correctly recognized the injected “thought” only about 20 percent of the time, and performance improved modestly to 42 percent in a follow-up query. Results varied sharply depending on the internal layer at which the concept was introduced, and the introspective ability proved brittle across repeated trials. While the researchers note that the models display some functional awareness of their internal states, they emphasize that the capability is far from dependable and remains poorly understood.

Anthropic Unveils Sonnet 4.5, Its Safest and Most Capable AI Model Yet

Anthropic announced Sonnet 4.5, positioning it as the company’s safest and most advanced AI system. The new model outperforms its predecessor, Sonnet 4, and the larger Opus 4.1 on coding and agentic benchmarks, and it also surpasses rival offerings such as Google’s Gemini 2.5 Pro and OpenAI’s GPT-5. Safety training reduces tendencies toward sycophancy, deception, and power-seeking, and the model now includes Level-3 safety filters that block hazardous content. Alongside Sonnet 4.5, Anthropic refreshed its Claude Code interface with checkpoint and file-creation features, while keeping API pricing unchanged.