X Ai

Grok 4.1 Fast

Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

Input / 1M tokens: $0.200
Output / 1M tokens: $0.500
Context window: 2M tokens
Provider: X Ai
Cached input / 1M: $0.050

Performance

Median streaming throughput and first-token latency measured by Artificial Analysis.

Output tokens / sec: 140 t/s
Time to first token: 0.40s

Benchmarks

Intelligence, coding, and math indexes plus the underlying evaluation scores.

Intelligence Index: 24
Coding Index: 20
Math Index: 34
MMLU-Pro: 74.3%
GPQA: 63.7%
HLE: 5.0%
LiveCodeBench: 39.9%
SciCode: 29.6%
MATH-500: —
AIME: —

Benchmarks via Artificial Analysis