What's new on Article Factory and the latest in the generative AI world

Google Gemini Beats ChatGPT in Audio Transcription with Speaker Labels

A user struggled with transcriptions from the iPhone Notes app that lacked speaker labels. After the user exported the audio file and fed it to Google Gemini 3 Pro, the AI produced a full transcript that correctly identified each speaker. An attempt to achieve the same result with ChatGPT 5.1, even on a Plus account, failed because the model could not access the audio file. The experience highlights Gemini’s strength in handling raw audio and identifying speakers, and exposes limitations in ChatGPT’s current audio‑processing capabilities.

ChatGPT, Gemini, and Claude Compete in Multimodal Image Understanding

A side‑by‑side evaluation examined how three leading AI chat models (ChatGPT, Gemini, and Claude) interpret complex images. The test used a bustling Times Square scene, Michelangelo’s densely populated "Last Judgment," and a cluttered indoor room to gauge each system’s ability to identify objects, read text, and describe spatial relationships. ChatGPT delivered careful, structured inventories, Gemini produced highly detailed, context‑rich descriptions, and Claude offered more narrative‑style overviews with occasional imaginative leaps. The findings highlight Gemini’s precision, ChatGPT’s reliability, and Claude’s creative flair, giving users a basis for choosing a model by the visual task at hand.

Grok 4.1 vs ChatGPT 5.1: A Head‑to‑Head Look at Personality, Reliability and Speed

A direct comparison of xAI's Grok 4.1 and OpenAI's ChatGPT 5.1 examines how each model handles emotional nuance, factual accuracy, and personality style. Grok 4.1 emphasizes witty, slang‑laden responses and touts its speed, while ChatGPT 5.1 offers clearer, more human‑like language. Both models avoided hallucinations in a health‑summary test, though Grok misreported its word count. In personality prompts, Grok leaned into meme‑culture phrasing, whereas ChatGPT delivered a smoother, more conventional answer. The review highlights strengths and trade‑offs without declaring a clear winner.

Gemini 3 Outperforms ChatGPT 5.1 and Claude Sonnet 4.5 in Thumb Wars Game Development

A hands‑on comparison tasked three leading generative AI models, Google's Gemini 3 Pro, OpenAI's ChatGPT 5.1, and Anthropic's Claude Sonnet 4.5, with building a web‑based Thumb Wars game. Gemini 3 Pro delivered the most complete and responsive code, automatically adding desktop keyboard controls and a 3‑D ring environment. ChatGPT 5.1 produced a functional prototype but required additional prompting for desktop support and lacked depth. Claude Sonnet 4.5 generated a playable demo with customization options but failed to implement the keyboard controls it had promised. Overall, Gemini 3 Pro proved the fastest and most adaptable coder.