Anthropic has introduced Claude 3.7 Sonnet, positioning it as the first hybrid AI reasoning model that combines quick responses with extended thinking capabilities. The model allows users to choose between immediate answers and more thorough analysis, with API users having precise control over the model’s thinking time up to 128,000 tokens.
Key Features and Capabilities
The new model maintains Anthropic’s existing pricing structure of $3 per million input tokens and $15 per million output tokens, including thinking tokens. It’s available across all Claude plans, though extended thinking features are restricted to paid tiers. The model has been integrated into various platforms including Amazon Bedrock and Google Cloud’s Vertex AI.
According to Anthropic’s internal testing and external evaluations, Claude 3.7 Sonnet shows significant improvements in coding and software development tasks. The company reports that partners like Cursor, Cognition, and Vercel have found the model particularly effective for real-world coding challenges, including complex codebase management and full-stack updates.
Technical Performance and Practical Applications
Benchmark results indicate strong performance in several areas:
- 78.2% accuracy on graduate-level reasoning tasks
- 81.2% accuracy on retail-focused tool use
- 93.2% improvement in instruction-following
- 62.3% accuracy on SWE-Bench coding tasks, compared to OpenAI’s 49.3%
The model represents a philosophical shift in AI development, with Anthropic arguing that reasoning capabilities should be integrated into the core model rather than offered as a separate service. This approach differs from competitors like OpenAI and DeepSeek, who maintain separate models for different capabilities.
The company reports improvements in the model’s ability to distinguish between harmful and benign requests, claiming a 45% reduction in unnecessary refusals compared to its predecessor.
Alongside Claude 3.7 Sonnet, Anthropic has launched Claude Code, a command-line tool for developers. Currently available as a limited research preview, it enables developers to delegate engineering tasks directly to Claude while maintaining human oversight for code changes.
Sources: Anthropic, VentureBeat, TechCrunch