OpenAI
GPT-5.1 Chat
GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.1 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.
- Input / 1M tokens
- $1.25
- Output / 1M tokens
- $10.00
- Context window
- 128K tokens
- Provider
- OpenAI
- Cached input / 1M
- $0.125
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec
- —
- Time to first token
- —