Anthropic’s Claude 3.5 Sonnet Outperforms GPT-4o
Anthropic has introduced Claude 3.5 Sonnet, a mid-tier model that outperforms rival models such as OpenAI's GPT-4o on several benchmarks and, in some evaluations, even surpasses the company's own top-tier Claude 3 Opus.
Claude 3.5 Sonnet is now available for free on Claude.ai and the Claude iOS app, with higher rate limits for Claude Pro and Team plan subscribers. It is also offered through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. The model has a 200K-token context window and costs $3 per million input tokens and $15 per million output tokens.
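As a rough illustration, a call to the model through the official Anthropic Python SDK might look like the sketch below. The model identifier and prompt are assumptions for the example, and the cost arithmetic simply applies the per-million-token prices quoted above to the usage metadata the API returns.

```python
# Minimal sketch of calling Claude 3.5 Sonnet via the Anthropic Python SDK.
# The model identifier and prompt are illustrative assumptions; check the
# official documentation for the current model name.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed identifier for Claude 3.5 Sonnet
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize the key points of this release note."}],
)

# Estimate the cost of the call from the returned usage counts, using the
# quoted prices: $3 per million input tokens, $15 per million output tokens.
input_cost = response.usage.input_tokens * 3 / 1_000_000
output_cost = response.usage.output_tokens * 15 / 1_000_000

print(response.content[0].text)
print(f"Approximate cost: ${input_cost + output_cost:.6f}")
```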
Anthropic says Claude 3.5 Sonnet "sets new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval)." The model shows an improved grasp of nuance, humor, and complex instructions, and excels at producing high-quality content with a natural tone.
Operating at twice the speed of Claude 3 Opus, Claude 3.5 Sonnet excels at complex tasks such as orchestrating multi-step workflows and providing context-sensitive customer support. In an internal agentic coding evaluation, it solved 64% of the problems, well ahead of Claude 3 Opus's 38%.
The model also demonstrates stronger vision capabilities, outperforming Claude 3 Opus on standard vision benchmarks. The improvement is most apparent in tasks that require visual reasoning, such as interpreting charts and graphs. Claude 3.5 Sonnet can also accurately transcribe text from imperfect or blurry photos, a useful capability for retail, logistics, and financial services companies.
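To show how an image such as a scanned chart can be passed to the model, the sketch below uses the Anthropic Python SDK's image content blocks. The file name, prompt, and model identifier are assumptions for the example.

```python
# Sketch: asking Claude 3.5 Sonnet to read a chart from an image file.
# The file name "quarterly_revenue.png" and the prompt are illustrative assumptions.
import base64
import anthropic

client = anthropic.Anthropic()

with open("quarterly_revenue.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed identifier for Claude 3.5 Sonnet
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "What trend does this chart show?"},
            ],
        }
    ],
)

print(response.content[0].text)
```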
Alongside the model launch, Anthropic unveiled Artifacts on Claude.ai, a new feature that changes how users interact with the AI. Artifacts lets users view, edit, and build on content generated by Claude in real time, creating a more collaborative workspace.
Despite the significant step up in intelligence, Claude 3.5 Sonnet maintains Anthropic's commitment to safety and privacy. "Our models have been trained to reduce misuse and are subjected to rigorous testing," the company says.
External experts, including the UK's AI Safety Institute (UK AISI) and child safety specialists at Thorn, have tested and helped refine the model's safety mechanisms.
Anthropic also highlights its commitment to user privacy, stating: "We do not train our generative models on user-submitted data unless a user gives us explicit permission to do so. To date, we have not used any user- or customer-submitted data to train our generative models."
To round out the Claude 3.5 model family, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later this year. The company is also developing new modalities and features to support more enterprise use cases, including integrations with enterprise applications and a memory capability that would enable more personalized user experiences.