What's new on Article Factory and the latest from the generative AI world

Researchers Argue Bad Evaluation Incentives Drive AI Hallucinations
A new paper from OpenAI examines why large language models such as GPT‑5 and ChatGPT continue to produce plausible but false statements, known as hallucinations. The authors explain that pretraining encourages models to predict the next word without distinguishing truth from falsehood, leading to errors on low‑frequency facts. They also argue that current evaluation methods reward correct answers regardless of confidence, prompting models to guess rather than express uncertainty. The paper proposes redesigning scoring systems to penalize confident mistakes, reward appropriate uncertainty, and discourage blind guessing, aiming to reduce hallucinations in future AI systems. Read more →
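To make the incentive argument concrete, here is a minimal Python sketch (not taken from the paper; the penalty value and function names are illustrative assumptions) comparing the expected score of guessing under a classic accuracy-style benchmark versus a scheme that penalizes confident mistakes and gives zero for abstaining.

```python
# Illustrative sketch: expected score for a model that is only p-confident
# in its best guess, under two grading schemes. Values are assumptions.

def expected_score_binary(p: float) -> float:
    """Accuracy-style grading: 1 if right, 0 if wrong or abstaining.
    Guessing always has non-negative expected value, so guessing dominates."""
    return p * 1.0 + (1 - p) * 0.0

def expected_score_penalized(p: float, wrong_penalty: float = 1.0) -> float:
    """Grading that penalizes confident mistakes: 1 if right,
    -wrong_penalty if wrong, 0 for explicitly saying "I don't know"."""
    return p * 1.0 + (1 - p) * (-wrong_penalty)

for p in (0.9, 0.5, 0.2):
    print(f"confidence={p:.1f}  "
          f"binary: guess={expected_score_binary(p):+.2f}  "
          f"penalized: guess={expected_score_penalized(p):+.2f} vs abstain=+0.00")
```

Under the penalized scheme, guessing only pays off when the model's confidence exceeds wrong_penalty / (1 + wrong_penalty) (0.5 with a penalty of 1), so an uncertain model is pushed toward expressing uncertainty instead of producing a confident hallucination, which is the behavioral shift the authors argue benchmark redesign should reward.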