DeepSeek R1 Distill Qwen 32B
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

Reported benchmark results include:

- AIME 2024 pass@1: 72.6
- MATH-500 pass@1: 94.3
- CodeForces rating: 1691

Distillation from DeepSeek R1's outputs gives the model performance competitive with much larger frontier models.
- Input / 1M tokens
- $0.290
- Output / 1M tokens
- $0.290
- Context window
- 33K tokens
- Provider
- DeepSeek
- Knowledge cutoff
- 2024-07-31
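Given the listed rates of $0.290 per 1M tokens for both input and output, per-request cost is straightforward to estimate. A minimal sketch (the function name and example token counts are illustrative, not part of any API):

```python
# Hypothetical cost estimate using the listed rates:
# $0.29 per 1M input tokens and $0.29 per 1M output tokens.
INPUT_PRICE_PER_M = 0.29   # USD per 1M input tokens (from the listing)
OUTPUT_PRICE_PER_M = 0.29  # USD per 1M output tokens (from the listing)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10K-token prompt with a 2K-token completion
print(f"${estimate_cost(10_000, 2_000):.6f}")  # → $0.003480
```

Because input and output share the same rate here, cost scales linearly with total tokens; note the 33K-token context window caps how large any single request can be.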
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec
- 43 t/s
- Time to first token
- 0.46s
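The two medians above give a rough end-to-end latency model for a streamed response. A back-of-envelope sketch (assumes throughput stays constant over the whole completion, which real deployments may not):

```python
# Rough latency estimate from the listed medians: 0.46 s time to
# first token plus 43 output tokens/sec streaming throughput.
TTFT_S = 0.46        # median time to first token (from the listing)
TOKENS_PER_S = 43.0  # median streaming throughput (from the listing)

def estimate_latency(output_tokens: int) -> float:
    """Approximate seconds to stream a completion of the given length."""
    return TTFT_S + output_tokens / TOKENS_PER_S

print(f"{estimate_latency(500):.1f} s")  # ~12.1 s for a 500-token reply
```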
Benchmarks
Intelligence, coding, and math indexes plus the underlying evaluation scores.
- Intelligence Index
- 17
- Coding Index
- —
- Math Index
- 63
- MMLU-Pro
- 73.9%
- GPQA
- 61.5%
- HLE
- 5.5%
- LiveCodeBench
- 27.0%
- SciCode
- 37.6%
- MATH-500
- 94.1%
- AIME
- 68.7%
Benchmarks via Artificial Analysis