AI Models Showcase
Explore a comprehensive collection of cutting-edge AI models from leading providers.
Anthropic Models (5)
Claude 3.7 Sonnet
Claude 3.7 Sonnet offers extended thinking that boosts SWE-bench coding accuracy from 62.3% to 70.3% and scores 81.2% on retail automation tasks, outperforming Claude Sonnet 3.6 (2024-10-22) by 13.6%.
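A minimal sketch of enabling extended thinking through the Anthropic Messages API; the model ID, token budget, and prompt are illustrative assumptions, not values taken from this catalog:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Extended thinking is opted into per request with a token budget for the model's
# internal reasoning; a larger budget generally trades latency and cost for accuracy.
response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed model ID; check the current model list
    max_tokens=2048,                      # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[{"role": "user", "content": "Find and fix the off-by-one error in this loop: ..."}],
)

# The response interleaves "thinking" blocks with the final "text" answer.
for block in response.content:
    if block.type == "text":
        print(block.text)
```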
Claude 3.7 Thinking Sonnet
Claude 3.7 Thinking Sonnet exposes the full chain-of-thought process during problem-solving, including error backtracking and alternative solution exploration. Scores 86.1% on the GPQA Diamond benchmark for expert-level Q&A.
Claude Sonnet 3.6 (2024-10-22)
Claude Sonnet 3.6, the upgraded Claude 3.5 Sonnet release, offers a cost-efficient API ($3/million input tokens vs. $5 for GPT-4o) and uses embedded alignment techniques that reduce harmful outputs by 34% compared to Claude 2.1.
Claude 3 Haiku
Claude 3 Haiku is Anthropic's fastest model, with 21 ms response time for real-time applications and 98.7% accuracy on JLPT N1 benchmarks for Japanese language specialization.
Claude 3 Opus
Claude 3 Opus is Anthropic's most powerful model with versatile capabilities ranging from complex reasoning to advanced problem-solving.
DeepSeek Models (2)
DeepSeek R1
DeepSeek R1 is a reasoning model trained primarily via large-scale reinforcement learning, offering cost efficiency at $0.14/million tokens vs. OpenAI o1's $15, and reducing Python runtime errors by 71% via static analysis integration.
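R1 is typically called through DeepSeek's OpenAI-compatible endpoint; the base URL and model name below match DeepSeek's published API, while the key handling and prompt are placeholders:

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible API; "deepseek-reasoner" maps to R1.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_DEEPSEEK_KEY")

completion = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Why does `for i in range(len(xs)): xs.pop(i)` skip elements?"}],
)

# The final answer is in message.content; DeepSeek also returns the chain of
# thought in a separate reasoning_content field on the message.
print(completion.choices[0].message.content)
```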
DeepSeek V3 (March 2025)
DeepSeek V3 (March 2025) shows significant improvements in reasoning capabilities with enhanced MMLU-Pro (81.2%), GPQA (68.4%), AIME (59.4%), and LiveCodeBench (49.2%) scores. Features improved front-end web development, Chinese writing proficiency, and function calling accuracy.
Google Models (6)
Gemini 1.5 Pro
Gemini 1.5 Pro, built on a Mixture-of-Experts architecture, handles extremely long contexts (up to 1 million tokens) with 99% retrieval accuracy at 750K tokens and generates chapter summaries for 2-hour videos with 92% accuracy.
Gemini 2.5 Pro Experimental
Gemini 2.5 Pro Experimental is Google's advanced model with improved multimodal reasoning, long-context understanding up to 1 million tokens, and specialized video comprehension.
Gemini 2.0 Pro Experimental
Gemini 2.0 Pro builds interactive 3D environments from text descriptions and offers hypothetical reasoning for scientific simulations.
Gemini 2.0 Flash Thinking
Gemini 2.0 Flash Thinking offers subsecond reasoning with 840 ms median response time for financial forecasting and an energy-efficient architecture using 0.8 kWh per million tokens (40% less than Gemini 1.5).
Gemini 2.5 Flash Preview
Gemini 2.5 Flash Preview is Google's state-of-the-art workhorse model, designed for advanced reasoning, coding, mathematics, and scientific tasks. It features hybrid reasoning (thinking on/off) with configurable budgets, balancing quality, cost, and latency.
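A minimal sketch of setting the thinking budget with the google-genai SDK; the preview model name, budget value, and prompt are assumptions, and a budget of 0 turns thinking off entirely:

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # assumed preview ID; check the current model list
    contents="Outline an experiment comparing quicksort and mergesort on nearly sorted data.",
    config=types.GenerateContentConfig(
        # thinking_budget bounds how many tokens the model may spend reasoning;
        # set it to 0 to disable thinking and minimize latency and cost.
        thinking_config=types.ThinkingConfig(thinking_budget=1024)
    ),
)
print(response.text)
```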
Gemini 2.5 Flash Preview (thinking)
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.
Meta Models (6)
Llama 3 70B
Llama 3 70B is a large language model from Meta with strong performance and efficiency for real-time interactions.
Llama 3.1 70B
Llama 3.1 70B offers a dramatically expanded 128K-token context window and improved performance on mathematical reasoning and general knowledge tasks.
Llama 3.1 405B
Llama 3.1 405B is Meta's most powerful open-source model, outperforming even proprietary models on various benchmarks.
Llama 4 Maverick
Llama 4 Maverick is Meta's multimodal expert model with 17B active parameters and 128 experts (400B total parameters). It outperforms GPT-4o and Gemini 2.0 Flash across various benchmarks, achieving an ELO of 1417 on LMArena. Designed for sophisticated AI applications with excellent image understanding and creative writing.
Llama 4 Scout
Llama 4 Scout is Meta's compact yet powerful multimodal model with 17B active parameters and 16 experts (109B total parameters). It fits on a single H100 GPU with Int4 quantization and offers an industry-leading 10M token context window, outperforming Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across various benchmarks.
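The single-H100 claim can be sanity-checked with back-of-the-envelope arithmetic; the sketch below counts weights only (no activations or KV cache) and reuses the figures quoted above as assumptions:

```python
# Rough weights-only memory estimate for Llama 4 Scout under Int4 quantization.
total_params = 109e9        # total parameters across all 16 experts
bytes_per_param = 0.5       # 4-bit weights = half a byte per parameter
weights_gb = total_params * bytes_per_param / 1e9

h100_hbm_gb = 80            # HBM on a single H100
print(f"~{weights_gb:.1f} GB of weights vs. {h100_hbm_gb} GB of HBM on one H100")
# -> 54.5 GB of weights, leaving headroom for activations and part of the KV cache
```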
Llama 4 Behemoth
Llama 4 Behemoth is Meta's most powerful model yet with 288B active parameters and 16 experts (nearly 2T total parameters), outperforming GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks.
Midjourney Models (7)
Midjourney v1
The first public release of Midjourney, introducing AI image generation to a wider audience through its Discord-based interface.
Midjourney v2
Midjourney v2 improved on the original model with better coherence, detail, and more consistent style application.
Midjourney v3
Midjourney v3 introduced significantly improved artistic capabilities with better understanding of prompt nuances and artistic styles.
Midjourney v4
Midjourney v4 marked a major leap forward with dramatically improved photorealism, coherence, and prompt understanding, trained on Google TPUs for the first time.
Midjourney v5
Midjourney v5 delivered a major step up in photorealism, with more natural lighting and textures, better hand and anatomy rendering, more literal prompt interpretation, and support for a wider range of aspect ratios.
Midjourney v6
Midjourney v6 added much more accurate following of longer prompts, improved coherence and world knowledge, and the ability to render legible text within images.
Midjourney v6.1
Midjourney v6.1 introduced a native web interface alongside Discord, with improved detail rendering, better text handling, and enhanced image coherence.
Mistral Models (2)
Mistral Large
Mistral Large is a powerful model with strong multilingual capabilities and reasoning, featuring a 32K token context window.
Mistral Large 2
Mistral Large 2 features a 128K context window with enhanced code generation, mathematics, reasoning, and multilingual support.
OpenAI Models (16)
OpenAI o3
OpenAI's most powerful reasoning model, pushing the frontier across coding, math, science, and visual perception. Trained to think longer before responding and agentically use tools (web search, code execution, image generation) to solve complex problems. Sets new SOTA on benchmarks like Codeforces and MMMU.
OpenAI o4-mini
A smaller, cost-efficient reasoning model from OpenAI optimized for speed. Achieves remarkable performance for its size, particularly in math, coding, and visual tasks. Supports significantly higher usage limits than o3 and can agentically use tools.
OpenAI o4 Mini High
OpenAI o4-mini-high is the same model as o4-mini but defaults to a high reasoning effort setting. It's a compact reasoning model optimized for speed and cost-efficiency, retaining strong multimodal and agentic capabilities, especially in math, coding, and visual tasks.
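A minimal sketch of what the "high" preset corresponds to when calling the model directly; `reasoning_effort` is the standard parameter for OpenAI's o-series models, while the prompt is illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.chat.completions.create(
    model="o4-mini",
    reasoning_effort="high",  # the default "o4-mini-high" selects; "low"/"medium" trade accuracy for speed
    messages=[{"role": "user", "content": "Prove that the product of two consecutive integers is even."}],
)
print(completion.choices[0].message.content)
```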
DALL-E 3
DALL-E 3 auto-improves user inputs via ChatGPT integration and blocks prohibited content with 99.9% precision using multimodal classifiers.
GPT-4o (Omni)
GPT-4o processes text, images, and audio through a unified transformer architecture and offers real-time translation for 154 languages with an 89.2% BLEU score on low-resource languages.
GPT-4.1
GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.
o3 Mini
o3 Mini is a smaller, more efficient version of the o3 model, optimized for faster response times and lower computational costs while maintaining high-quality outputs.
o1
o1 achieves 86% accuracy on Mathematics Olympiad benchmarks (vs. GPT-4o's 13%), offers PhD-level STEM proficiency, and maintains a 0.17% deceptive response rate in synthetic testing.
GPT-4.5
GPT-4.5 is a step forward in scaling up pre-training and post-training. With broader knowledge, improved intent understanding, and greater 'EQ', it excels at natural conversations, writing, programming, and practical problem solving with reduced hallucinations. GPT-4.5 achieved 62.5% accuracy on SimpleQA with a 37.1% hallucination rate, both significantly better than GPT-4o and other models.
ChatGPT-4o (March 2025)
An updated version of GPT-4o that feels more intuitive, creative, and collaborative. Follows instructions more accurately, handles coding tasks more smoothly, and communicates in a clearer, more natural way with more concise responses and fewer markdown levels.
GPT-4o mini
GPT-4o mini is OpenAI's small model released after GPT-4o (Omni), supporting both text and image inputs with text outputs. As their most advanced small model, it is an order of magnitude more affordable than other recent frontier models and more than 60% cheaper than GPT-3.5 Turbo, while maintaining state-of-the-art intelligence for its size.
GPT-4.1 Nano
For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It's ideal for tasks like classification or autocompletion.
GPT-4.1 Mini
GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider's polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.
GPT-3.5 Turbo
GPT-3.5 Turbo is one of OpenAI's fastest models. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.
GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning capabilities. Training data: up to Sep 2021.
GPT-2
A direct scale-up of GPT-1 with 1.5 billion parameters, trained on 8 million web pages. Known for generating coherent text that was sometimes indistinguishable from human writing, though it could be repetitive.
OpenRouter Models (3)
Quasar Alpha
This is a cloaked model provided to the community to gather feedback. It's a powerful, all-purpose model supporting long-context tasks, including code generation. All prompts and completions for this model are logged by the provider as well as OpenRouter.
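Cloaked models are called through OpenRouter's OpenAI-compatible endpoint like any other listed model; the model slug below is an assumption and should be checked against the live model list:

```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API behind a single base URL.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_KEY")

completion = client.chat.completions.create(
    model="openrouter/quasar-alpha",  # assumed slug for the cloaked model
    messages=[{"role": "user", "content": "Refactor this function to avoid repeated dictionary lookups: ..."}],
)
print(completion.choices[0].message.content)
```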
Optimus Alpha
Optimus Alpha is a stealth, powerful, all-purpose model supporting long-context tasks, including code generation, provided to the community to gather feedback.
QwQ 32B
QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.
xAI Models (4)
Grok 3
Grok 3 is a cutting-edge AI model from xAI with Big Brain Mode for complex problems, Colossus Supercomputer integration, and Reinforcement Learning optimization. Achieves 1402 Elo on LMArena benchmarks and 93.3% on the AIME 2025 mathematics competition.
Grok 3 Thinking
Grok 3 Thinking exposes the full chain-of-thought process during problem-solving, including error backtracking and alternative solution exploration. Scores 84.6% on the GPQA Diamond benchmark for expert-level Q&A.
Grok 3 Mini Beta
Grok 3 Mini is a lightweight, smaller thinking model ideal for reasoning-heavy tasks that don't demand extensive domain knowledge. It shines in math-specific and quantitative use cases, and its 'thinking' traces are transparent and accessible.
Grok 3 Beta
Grok 3 Beta is xAI's flagship model excelling at enterprise use cases like data extraction, coding, and text summarization. It possesses deep domain knowledge in finance, healthcare, law, and science, and outperforms Grok 3 Mini on reasoning-heavy tasks.