Anthropic has announced two updated models, Claude 3.5 Sonnet and Claude 3.5 Haiku, which are more capable overall and are expected to deliver significant gains in programming. The most interesting addition, however, is a new capability: computer use.
Now available as a public beta, the feature lets developers direct Claude through an API to operate a computer: viewing the screen, moving the mouse cursor, clicking buttons, and typing text.
In a demo video, Anthropic shows Claude filling out a form by searching through several data sources. The AI assistant first takes screenshots, recognizes that the required information is not in a table, and then, following the prompt, searches for and retrieves the information needed to complete the form.
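The observe-then-act cycle described above can be illustrated with a minimal sketch. It does not call Anthropic's API: the model is replaced by a scripted stand-in, the desktop is simulated in memory, and every name in it (`Action`, `FakeDesktop`, `run_agent`, `scripted_model`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """One step requested by the model (hypothetical vocabulary)."""
    kind: str  # "screenshot", "mouse_move", "click", "type", or "done"
    payload: dict = field(default_factory=dict)


class FakeDesktop:
    """In-memory stand-in for a real screen, mouse, and keyboard."""

    def __init__(self):
        self.cursor = (0, 0)
        self.clicks = []
        self.typed = []

    def execute(self, action: Action) -> str:
        """Apply one action and return a text observation for the model."""
        if action.kind == "screenshot":
            return f"screen: cursor at {self.cursor}, typed {self.typed}"
        if action.kind == "mouse_move":
            self.cursor = (action.payload["x"], action.payload["y"])
            return "moved"
        if action.kind == "click":
            self.clicks.append(self.cursor)
            return "clicked"
        if action.kind == "type":
            self.typed.append(action.payload["text"])
            return "typed"
        raise ValueError(f"unknown action: {action.kind}")


def run_agent(next_action: Callable[[str], Action],
              desktop: FakeDesktop,
              max_steps: int = 20) -> int:
    """Ask for an action, execute it, feed the result back, and repeat.

    Returns the number of actions executed before the model signals "done".
    """
    observation = "start"
    for step in range(max_steps):
        action = next_action(observation)
        if action.kind == "done":
            return step
        observation = desktop.execute(action)
    raise RuntimeError("step budget exhausted")


def scripted_model(script):
    """Placeholder for a real model call: replays a fixed list of actions."""
    actions = iter(script)
    return lambda observation: next(actions)


# Usage: look at the screen, move to a form field, click it, type a value.
script = [
    Action("screenshot"),
    Action("mouse_move", {"x": 120, "y": 80}),
    Action("click"),
    Action("type", {"text": "Jane Doe"}),
    Action("done"),
]
desktop = FakeDesktop()
steps = run_agent(scripted_model(script), desktop)
print(steps, desktop.clicks, desktop.typed)
```

In a real integration the scripted stand-in would be replaced by a call to the model, and the step budget is one simple way to keep an experimental agent from running indefinitely.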
A New Approach for Agent Systems
Computer use is a new approach to building generative AI systems as agents that can automate more complex, multi-step tasks. Rather than being trained for specific tasks, the model is meant to learn how to use existing computer infrastructure.
The feature is still at an early stage of development. Claude 3.5 Sonnet is the first model to support computer use, and it still has weaknesses: actions such as scrolling or zooming, which humans perform easily, remain challenging for Claude. Anthropic recommends that developers start their experiments with low-risk tasks.
Security also matters for this kind of system. Anthropic explains that computer use could become a new attack vector for known threats such as spam, misinformation, and fraud. To counter this, the system is designed to proactively detect whether it is being used for malicious purposes, and Anthropic provides more detailed information on this process.
Revised Models with Enhanced Performance
The new models, Claude 3.5 Sonnet and Claude 3.5 Haiku, offer improved performance, particularly in programming. According to benchmarks released by Anthropic, the models show promising results.
Claude 3.5 Sonnet outperforms its predecessor and is comparable to both GPT-4o and Gemini 1.5 Pro. Claude 3.5 Haiku, by contrast, is a smaller model designed for speed, with benchmark performance comparable to that of GPT-4o mini.
Claude 3.5 Haiku will be available later this month through Amazon Bedrock and Google Cloud Vertex AI. Initially, it will be released as a text model, with image input capabilities to follow later on.