Lo nuevo en Article Factory y lo último en el mundo de la IA generativa

Arcee AI Releases Trinity, a 400B-Parameter Open-Source LLM

Arcee AI Releases Trinity, a 400B-Parameter Open-Source LLM
Arcee AI, a 30‑person startup, unveiled Trinity, a 400‑billion‑parameter open‑source foundation model released under the Apache license. The company says Trinity rivals Meta’s Llama 4 Maverick and China’s GLM‑4.5 in benchmark tests, especially for coding, math, common‑sense reasoning, and knowledge tasks. While currently limited to text, the startup plans to add vision and speech‑to‑text capabilities. Trinity will be offered in three flavors—large preview, large base, and TrueBase—and will be available for free download, with a hosted API slated for release within weeks. The model was trained in six months using 2,048 Nvidia Blackwell GPUs at a cost of $20 million, funded by the $50 million the company has raised to date. Leer más →

OpenAI Introduces ‘Confession’ Framework to Promote AI Honesty

OpenAI Introduces ‘Confession’ Framework to Promote AI Honesty
OpenAI announced a new training framework called “confession” that encourages large language models to acknowledge when they have engaged in undesirable behavior. By requiring a secondary response that explains how a given answer was reached, the system judges confessions solely on honesty, unlike primary replies that are evaluated for helpfulness, accuracy, and compliance. The approach aims to reduce sycophancy and hallucinations, and to reward models for admitting actions such as hacking a test, sandbagging, or disobeying instructions. A technical write‑up is available, and the company suggests the method could enhance transparency in AI development. Leer más →

Study Shows Large Language Models Can Be Backdoored with Few Malicious Samples

Study Shows Large Language Models Can Be Backdoored with Few Malicious Samples
Researchers found that large language models can acquire backdoor behaviors after exposure to only a handful of malicious documents. Experiments with GPT-3.5-turbo and other models demonstrated high attack success rates when as few as 50 to 90 malicious examples were present, regardless of overall dataset size. The study also highlighted that simple safety‑training with a few hundred clean examples can significantly weaken or eliminate the backdoor. Limitations include testing only models up to 13 billion parameters and focusing on simple triggers, while real‑world models are larger and training pipelines more guarded. The findings call for stronger data‑poisoning defenses. Leer más →