What's new in Article Factory and the latest in the world of generative AI

AI Models Fall Short on New Professional Benchmark, Researchers Find

A new benchmark called APEX-Agents, designed to test AI performance on real-world professional tasks in consulting, investment banking, and law, shows that current AI models struggle to meet the demands of knowledge work. Researchers from Mercor report that even the top-performing models answer only about a quarter of the questions correctly, with multi-domain reasoning and information retrieval across tools like Slack and Google Drive proving especially difficult. The findings suggest that AI is still far from replacing skilled professionals in high-value roles.