What is new on Article Factory and latest in generative AI world

AI Coding Agents Recreate Classic Minesweeper with Mixed Results

AI Coding Agents Recreate Classic Minesweeper with Mixed Results
A test of four AI coding agents tasked with rebuilding the classic Minesweeper game revealed a blend of successes and shortcomings. While the agents successfully implemented core gameplay mechanics such as chording and flagging, they varied in UI polish, sound options, and development speed. OpenAI's Codex, for instance, took roughly twice as long to produce a functional version compared to Claude Code. The evaluation highlights both the promise of AI-driven development and the current limits that developers must still address. Read more →

OpenAI and Anthropic Share Mutual AI Safety Evaluation Results

OpenAI and Anthropic Share Mutual AI Safety Evaluation Results
OpenAI and Anthropic announced that they each evaluated the alignment of the other's publicly available AI systems and released the findings. Anthropic examined OpenAI models for issues such as sycophancy, whistleblowing, self‑preservation, and potential misuse, noting concerns with GPT‑4o and GPT‑4.1 while finding overall alignment comparable to its own models. OpenAI tested Anthropic’s Claude models for instruction hierarchy, jailbreaking, hallucinations and scheming, reporting strong performance on instruction hierarchy and a high refusal rate on hallucination prompts. The joint effort highlights a growing focus on safety collaboration amid broader industry scrutiny. Read more →