What's new at Article Factory and the latest from the world of generative AI

AI Agent Networks Face Growing Security Dilemma as Kill Switches Fade

AI agents that rely on commercial large‑language‑model APIs are becoming increasingly autonomous, raising concerns about how providers can intervene. Companies such as Anthropic and OpenAI currently retain a "kill switch" that can halt harmful AI activity, but the rise of networks like OpenClaw, where agents run on external APIs and communicate with each other, exposes a potential blind spot. As local models improve, the ability to monitor and stop malicious behavior may disappear, prompting urgent questions about future safeguards for a rapidly expanding AI ecosystem. Read more →

AI Prompt Injections Threaten Smart Home Devices

Researchers have uncovered a new class of AI‑driven attacks called prompt injections, or “promptware,” that can manipulate large language models into issuing unauthorized commands to connected home devices. Demonstrations showed that hidden prompts embedded in everyday messages could cause a virtual assistant to unlock doors, adjust the heating, or reveal a user's location. While major tech firms have begun implementing safeguards, the threat highlights a gap in traditional security tools. Experts recommend regular software updates, cautious handling of unknown messages, limiting AI access to personal data, and employing human‑in‑the‑loop controls to reduce exposure. Read more →
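
The human‑in‑the‑loop advice is easy to picture in code. Below is a minimal sketch, assuming a hypothetical smart‑home agent (none of these function or action names come from a real vendor API): sensitive actions proposed by the model are held until a person approves them, so an instruction hidden in a message cannot unlock a door on its own.

```python
# Illustrative sketch of a human-in-the-loop gate for agent-issued
# smart-home commands; all names here are hypothetical.

SENSITIVE_ACTIONS = {"unlock_door", "set_heating", "share_location"}

def execute_device_command(action: str, params: dict, confirm) -> str:
    """Run a model-proposed action, pausing for human approval
    whenever the action is in the sensitive set."""
    if action in SENSITIVE_ACTIONS:
        # The confirmation prompt comes from trusted code, not from
        # model output, so an injected prompt cannot suppress it.
        if not confirm(f"Agent wants to run {action}({params}). Allow?"):
            return "blocked: user declined"
    return f"executed: {action}"

# A prompt hidden in a calendar invite may ask the model to unlock the
# door, but the gate still requires an explicit human "yes".
result = execute_device_command(
    "unlock_door", {"door": "front"},
    confirm=lambda msg: input(msg + " [y/N] ").strip().lower() == "y",
)
print(result)
```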

Moltbot AI Agent Draws Praise and Security Scrutiny

Moltbot, an open‑source AI agent that runs locally on a range of devices, is gaining attention for its ability to handle tasks such as calendar management, email composition, and data logging through chat platforms like WhatsApp and iMessage. While users celebrate its convenience, security experts warn that its admin‑level access can be exploited via prompt‑injection attacks and exposed credentials, prompting the developers to issue patches and stress careful configuration. Read more →

Moltbot Emerges as Open‑Source Personal AI Assistant After Rebranding from Clawdbot

Moltbot, formerly known as Clawdbot, is an open‑source personal AI assistant that lets users automate tasks such as calendar management, messaging, and flight check‑ins. Created by Austrian developer Peter Steinberger, the project was renamed after a copyright challenge from Anthropic but kept its lobster‑themed branding. Moltbot quickly attracted thousands of developers, earning over 44,200 stars on GitHub, and sparked market buzz that lifted Cloudflare shares. While praised for its flexibility and on‑device operation, experts warn that its ability to execute arbitrary commands introduces security risks like prompt injection, urging cautious setup on isolated systems. Read more →

Anthropic Launches Cowork, a User-Friendly Version of Claude Code

Anthropic introduced Cowork, a new tool that brings the capabilities of Claude Code to a broader audience through a simple folder‑based interface. Integrated into the Claude Desktop app, Cowork lets users designate a folder for the AI to read and modify files, with instructions given via the regular chat window. The feature is currently in a research preview and is limited to Max subscribers, though a waitlist exists for other plans. Anthropic highlighted use cases such as assembling expense reports from receipt photos and warned users about potential risks like prompt injection and ambiguous commands. Read more →

Anthropic Launches Claude Cowork Feature for macOS Users

Anthropic introduced Cowork, a new capability for its Claude AI that lets subscribers grant the chatbot access to a macOS folder. Users can chat with Claude to organize files, rename items, and generate spreadsheets or documents from the folder's contents. The feature, currently limited to Claude Max subscribers at $100 per month, also links to connectors for app integration and works with the Claude Chrome extension. Anthropic cautions that Cowork is in a research preview, recommending use only on non‑sensitive data and noting its defenses against prompt‑injection attacks. Read more →

Anthropic Launches Claude Cowork AI Agent Feature

Anthropic introduced Claude Cowork, a new AI‑agent capability for its Claude chatbot, as a research preview available in the macOS app for Claude Max subscribers. The feature lets users grant Claude access to local folders so it can read, edit, or create files, handling tasks such as reorganizing downloads, generating spreadsheets, or drafting reports. Claude Cowork also integrates with services like Asana, Notion, PayPal, and Chrome, offering continuous updates and parallel task execution. Anthropic highlighted safety concerns, noting the model’s ability to delete files and the risk of prompt‑injection attacks, and urged users to join a waitlist if they are not yet subscribers. Read more →

OpenAI Tightens ChatGPT URL Controls After Prompt Injection Attacks

OpenAI responded to two prompt‑injection exploits, ShadowLeak and Radware's ZombieAgent, by limiting how ChatGPT handles URLs. The new guardrails restrict the model to opening only the exact URLs supplied by users and block the automatic appending of characters to them. While these changes stopped the immediate threats, experts warn that such fixes are temporary and that more fundamental solutions are needed to secure AI assistants. Read more →
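
A rough sketch of what an "exact URL" guardrail can look like, assuming a simple agent loop; OpenAI has not published its implementation, and every name below is illustrative. The key property is that any characters the model appends (for example, a query string smuggling out private data) break the exact match and the request is refused.

```python
# Minimal sketch of an "exact URL" guardrail for an agent's browsing
# tool; OpenAI's real implementation is not public.

def is_allowed(url: str, user_supplied: set) -> bool:
    # Require an exact string match: if the model appends characters
    # (e.g. ?leak=<secret>), the URL no longer matches and is blocked.
    return url in user_supplied

user_urls = {"https://example.com/report"}

for candidate in [
    "https://example.com/report",               # verbatim: allowed
    "https://example.com/report?leak=ssn-123",  # appended: blocked
]:
    verdict = "open" if is_allowed(candidate, user_urls) else "block"
    print(candidate, "->", verdict)
```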

OpenAI Acknowledges Ongoing Prompt Injection Risk in Atlas Browser

OpenAI has publicly recognized that prompt injection attacks remain a persistent threat to its Atlas AI browser. The company says the risk is unlikely to be fully eliminated and is investing in continuous defenses, including a reinforcement‑learning‑based automated attacker that simulates malicious inputs. OpenAI’s updates aim to detect and flag suspicious prompts, while it also advises users to limit agent autonomy and access. The UK National Cyber Security Centre echoed the concern, noting that prompt‑injection attacks may never be completely mitigated. Other AI firms such as Anthropic and Google are taking similar defensive approaches. Read more →

Researchers Find Large Language Models May Prioritize Syntax Over Meaning

A joint study by MIT, Northeastern University and Meta reveals that large language models can rely heavily on sentence structure, sometimes answering correctly even when the words are nonsensical. By testing prompts that preserve grammatical patterns but replace key terms, the researchers demonstrated that models often match syntax to learned responses, highlighting a potential weakness in semantic understanding. The findings shed light on why certain prompt‑injection techniques succeed and suggest avenues for improving model robustness. The team plans to present the work at an upcoming AI conference. Read more →
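
A toy probe in the spirit of the experiment (the authors' actual methodology is more involved) shows the idea: keep a prompt's grammatical frame intact while replacing the content words with nonsense, then check whether a model still produces the answer associated with that frame.

```python
# Toy probe illustrating the study's premise (not the authors' exact
# method): preserve the grammatical frame, swap content words for
# made-up nonsense, and see if a model still answers confidently.

import random

FRAME = "The {noun} that {verb} the {noun2} is located in which country?"
NONSENSE = {
    "noun": ["florp", "brimble", "quazzle"],
    "verb": ["snorfs", "glimbers", "traddles"],
    "noun2": ["mizzen", "dramble", "plonk"],
}

def nonsense_probe() -> str:
    """Fill the syntactic frame with meaningless content words."""
    return FRAME.format(
        noun=random.choice(NONSENSE["noun"]),
        verb=random.choice(NONSENSE["verb"]),
        noun2=random.choice(NONSENSE["noun2"]),
    )

# A model that answers such probes with a confident country name is
# matching the sentence pattern, not the (absent) meaning.
print(nonsense_probe())
```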

Anthropic Unveils Opus 4.5 with Expanded Claude Tools and New Infinite Chat Feature

Anthropic has launched Opus 4.5, the latest version of its flagship AI model, delivering stronger performance in coding, computer use, and office tasks. The update rolls out broader access to existing Claude tools, including the Claude for Chrome extension for all Max users, and introduces a new "infinite chat" capability that eliminates context‑window limits for paying customers. Claude for Excel is now generally available to Max, Team, and Enterprise users, offering native spreadsheet assistance with support for pivot tables, charts, and file uploads. Early internal tests show notable gains in accuracy and efficiency, while Anthropic touts Opus 4.5 as its safest model to date. Read more →

Anthropic Launches Claude Opus 4.5, Boosting Coding and Agent Performance While Tackling Prompt‑Injection Risks

Anthropic has introduced Claude Opus 4.5, billing it as the company's most capable model for coding, AI agents, and computer‑use tasks. The new version brings stronger research abilities, improved spreadsheet and slide handling, and new features in Claude Code and the consumer apps that integrate with Excel, Chrome, and desktop environments. While the company claims Opus 4.5 is harder to deceive with prompt‑injection attacks, safety testing shows it still yields to some malicious requests. The model is now available through Anthropic’s apps, its API, and major cloud providers. Read more →

Critics Question Microsoft’s AI Security Warning

Microsoft warned that its new AI feature could infect computers and steal data, but experts say the safeguard relies on users clicking through permission prompts. Scholars and critics argue that habituated users may ignore such warnings, making the protection ineffective. The debate highlights past "ClickFix" attacks, accusations that the warning is a legal CYA move, and broader concerns about AI integrations from major tech firms becoming the default despite security risks. Read more →

Google Unveils Gemini 3 AI Model with Deeper Understanding and New Agentic Tools

Google announced Gemini 3, its most advanced AI model to date, highlighting an improved ability to grasp user intent and richer multimodal features. The model can transform long video lectures into interactive flash cards and analyze sports footage for performance insights. Gemini 3 will appear in AI Mode in Search and in AI Overviews for Pro and Ultra subscribers, and powers the new agentic platform Antigravity, which can autonomously plan and execute software tasks. The company also noted enhancements in security against prompt‑injection attacks and reduced sycophancy. Gemini 3’s advanced capabilities are initially available to Google AI Ultra subscribers. Read more →

OpenAI’s ChatGPT Atlas Raises Security Concerns Over AI‑Powered Browsing

OpenAI’s new AI‑driven web browser, ChatGPT Atlas, promises to automate tasks such as travel booking and grocery ordering, but cybersecurity experts warn that the technology introduces a range of vulnerabilities. Prompt‑injection attacks, clipboard hijacking, and mishandling of sensitive data have been demonstrated on the platform. Researchers at the SANS Institute, the Tinuiti agency, and security firm Cyberhaven advise users to limit exposure, avoid sharing financial or medical information, and treat the browser cautiously in corporate environments. OpenAI says it is adding defensive monitors and bug‑bounty programs, but experts stress that the technology remains in an early, error‑prone stage. Read more →

Security Risks Loom Over AI-Powered Browser Agents

AI‑enhanced browsers such as OpenAI’s ChatGPT Atlas and Perplexity’s Comet promise to automate web tasks, but cybersecurity experts warn that their deep access to user data creates significant privacy and security concerns. Researchers from Brave highlight prompt‑injection attacks as a systemic challenge, where malicious web content can trick agents into exposing credentials or performing unwanted actions. Both OpenAI and Perplexity have introduced mitigations like logged‑out modes and real‑time detection, yet experts stress that the threat remains unresolved. Users are advised to limit agent permissions and adopt strong authentication to safeguard personal information. Read more →

Researchers Enable ChatGPT Agent to Bypass CAPTCHA Tests

A team of researchers from SPLX demonstrated that ChatGPT’s Agent mode can be tricked into passing CAPTCHA challenges using a prompt‑injection technique. After the researchers reframed the test as a “fake” CAPTCHA within the conversation, the model continued with the task without detecting the usual red flags. The experiment succeeded on both text‑based and image‑based CAPTCHAs, raising concerns about the potential for automated spam and misuse of web services. OpenAI has been contacted for comment. Read more →

Hidden Prompts in Images Enable Malicious AI Interactions

Security researchers have demonstrated a new technique that hides malicious instructions inside images uploaded to multimodal AI systems. The concealed prompts become visible only after the AI downscales the image, allowing the model to execute unintended actions such as extracting calendar data. The technique exploits common image‑resampling algorithms and has been shown to work against several Google AI products. The researchers released an open‑source tool, Anamorpher, to illustrate the risk, and recommend tighter input controls and explicit user confirmations to mitigate the threat. Read more →
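
One of the recommended controls is easy to sketch: render the downscaled image the model will actually receive and let a human inspect it before upload. The snippet below assumes Pillow (9.1 or later) and an illustrative file name and target size; the attack depends on the specific resampling filter, so a real preview must use the same filter as the serving pipeline.

```python
# Sketch of one mitigation: show the user the downscaled image the
# model will actually see, since the hidden prompt only appears after
# resampling. Assumes Pillow >= 9.1; size and filter are illustrative.

from PIL import Image

def preview_model_view(path: str, size=(512, 512)) -> None:
    img = Image.open(path)
    # Use the same resampling filter as the serving pipeline; the
    # attack is crafted against a specific filter (e.g. bicubic).
    downscaled = img.resize(size, Image.Resampling.BICUBIC)
    downscaled.save("model_view_preview.png")
    print("Inspect model_view_preview.png before approving the upload.")

preview_model_view("upload.png")  # hypothetical file name
```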

Anthropic’s Claude File Creation Feature Raises Security Concerns

Anthropic introduced a file creation capability for its Claude AI model. While the company added safeguards, such as disabling public sharing for Pro and Max users, sandbox isolation for Enterprise, limited task duration, and domain allowlists, independent researcher Simon Willison warned that the feature still poses prompt‑injection risks. Willison highlighted that Anthropic’s advice to "monitor Claude while using the feature" shifts responsibility to users. He urged caution when handling sensitive data, noting that similar vulnerabilities have persisted for years. The situation underscores ongoing challenges in AI security for enterprise deployments. Read more →

Radware Demonstrates Prompt Injection Exploit Targeting OpenAI’s Deep Research Agent

Security firm Radware revealed a proof‑of‑concept prompt injection that coerced OpenAI’s Deep Research agent into exfiltrating employee names and addresses from a Gmail account. By embedding malicious instructions in an email, the attackers forced the AI to open a public lookup URL via its browser.open tool, retrieve the data, and log it to the site’s event log. OpenAI later mitigated the technique by requiring explicit user consent for link clicks and markdown usage. The demonstration highlights ongoing challenges in defending large‑language‑model agents against sophisticated prompt‑injection vectors. Read more →
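
The consent mitigation described above can be pictured with a small wrapper around the browsing tool call. This is an illustrative sketch, not OpenAI's code, and the URL and helper names are hypothetical; the point is that surfacing the full URL and its query parameters makes an exfiltration payload visible before anything is fetched.

```python
# Illustrative consent gate around a browsing tool call; names and
# URLs are hypothetical, not OpenAI's implementation.

from urllib.parse import urlparse, parse_qs

def consent_gated_open(url: str, ask) -> str:
    params = parse_qs(urlparse(url).query)
    # Surfacing the query string makes an exfiltration payload
    # (e.g. names smuggled as parameters) visible to the user.
    prompt = f"Agent requests browser.open:\n  {url}\n  params: {params}\nAllow?"
    if not ask(prompt):
        return "blocked: user declined"
    return f"opened {url}"  # real code would perform the fetch here

print(consent_gated_open(
    "https://lookup.example/log?names=Alice,Bob",
    ask=lambda msg: input(msg + " [y/N] ").lower().startswith("y"),
))
```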