Claude 3.7 Thinking Sonnet

Claude 3.7 Thinking Sonnet exposes its full chain of thought during problem-solving, including backtracking from errors and exploring alternative solutions. It scores 86.1% on the GPQA Diamond benchmark for expert-level Q&A.

Conversation, Reasoning, Analysis, Summarization
Provider: Anthropic
Release Date: February 26, 2025
Size: LARGE
Parameters: Not disclosed

Benchmark Performance

Performance metrics on industry-standard AI benchmarks that measure capabilities across reasoning, knowledge, and specialized tasks.

MMLU: 77.1%
GPQA Diamond: 84.8%
MATH: 96.2%
AIME: 80.0%
HellaSwag (10-shot): 89.0%
