Z AI
GLM 4.7 Flash
GLM-4.7-Flash is a 30B-class state-of-the-art model that balances performance and efficiency. It is further optimized for agentic coding use cases, with strengthened coding capability, long-horizon task planning, and tool collaboration, and it achieves leading results among open-source models of its size on several current public benchmark leaderboards.
- Input / 1M tokens: $0.060
- Output / 1M tokens: $0.400
- Cached input / 1M tokens: $0.010
- Context window: 203K tokens
- Provider: Z AI
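The per-token rates above can be combined into a simple per-request cost estimate. The sketch below is illustrative only (not an official SDK helper); it assumes cached input tokens are billed at the cached rate and the rest of the input at the standard rate.

```python
# Illustrative cost estimate for GLM-4.7-Flash using the listed rates.
# Prices are USD per 1M tokens: input $0.060, cached input $0.010, output $0.400.

PRICE_INPUT = 0.060 / 1_000_000
PRICE_CACHED = 0.010 / 1_000_000
PRICE_OUTPUT = 0.400 / 1_000_000

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return the estimated request cost in USD.

    `cached_tokens` is the portion of the input billed at the cached rate;
    the remainder is billed at the standard input rate.
    """
    uncached = input_tokens - cached_tokens
    return (uncached * PRICE_INPUT
            + cached_tokens * PRICE_CACHED
            + output_tokens * PRICE_OUTPUT)

# Example: 100K input tokens (half of them cached) plus 2K output tokens.
cost = estimate_cost(100_000, 2_000, cached_tokens=50_000)
print(f"${cost:.4f}")  # → $0.0043
```

For instance, a 100K-token prompt with half its tokens cached plus a 2K-token response comes to roughly $0.0043, most of it from the uncached input.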
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec: 91 t/s
- Time to first token: 0.87 s
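The two figures above give a rough wall-clock estimate for a streamed response: first-token latency plus generation time at the median throughput. A minimal sketch, assuming generation proceeds steadily at the median rate:

```python
# Rough wall-clock estimate for a streamed response, using the median
# figures above: 0.87 s to first token, then a steady 91 tokens/sec.

TTFT_S = 0.87          # median time to first token, seconds
THROUGHPUT_TPS = 91.0  # median output tokens per second

def estimate_latency(output_tokens: int) -> float:
    """Estimated total seconds to stream `output_tokens` tokens."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

print(f"{estimate_latency(500):.1f} s")  # → 6.4 s for a 500-token reply
```

Real responses vary around these medians, so this is a ballpark figure rather than a guarantee.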
Benchmarks
Intelligence, coding, and math indexes plus the underlying evaluation scores.
- Intelligence Index: 30
- Coding Index: 26
- Math Index: —
- MMLU-Pro: —
- GPQA: 58.1%
- HLE: 7.1%
- LiveCodeBench: —
- SciCode: 33.7%
- MATH-500: —
- AIME: —
Benchmarks via Artificial Analysis