OpenAI’s new Agent Mode, demonstrated in the Atlas browser, was put through a series of web-based tasks to assess its ability to search, click, and retrieve information without human input. The agent succeeded in locating specific content such as macOS game demos, but it frequently struggled with navigation, got stuck in loops, and ran into time limits, leaving tasks unfinished. Overall, the evaluation shows that the technology can handle simple, repetitive actions but is not yet reliable enough for fully autonomous use.