What's new on Article Factory and the latest in the generative AI world

Jan 23, 2026

AI Models Fall Short on New Professional Benchmark, Researchers Find

A new benchmark called APEX-Agents, designed to test AI performance on real-world professional tasks in consulting, investment banking, and law, reveals that current AI models struggle to meet the demands of knowledge work. Researchers from Mercor report that even top-performing models answer only about a quarter of the questions correctly, highlighting challenges in multi-domain reasoning and information retrieval across tools like Slack and Google Drive. The findings suggest that AI is still far from replacing skilled professionals in high‑value roles.

Jan 7, 2026

LMArena Raises $150 Million to Scale Human‑Centred AI Evaluation Platform

LMArena, a crowdsourced AI comparison platform, secured $150 million in a Series A round, valuing the company at $1.7 billion. Backed by Felicis, UC Investments and leading venture firms, the funding will expand its commercial AI Evaluation service, which provides enterprises with real‑world, human‑anchored model rankings. By letting users compare anonymized responses and vote for the better answer, LMArena offers a dynamic alternative to static benchmarks. The approach has attracted both praise for delivering trust signals and criticism over potential bias and manipulation, highlighting the growing demand for richer AI assessment tools as models proliferate.

Jul 14, 2025

Kimi K2: The Open-Source AI Model That Outperforms ChatGPT and Claude

China's Moonshot AI has released Kimi K2, a brand-new, open-source AI model that outperforms OpenAI's GPT-4.1 and Anthropic's Claude Opus 4 on core coding tasks, at a significantly lower cost.