Z AI
GLM 4.7 Flash
GLM-4.7-Flash is a 30B-class state-of-the-art model that balances performance and efficiency. It is further optimized for agentic coding use cases, with strengthened coding capability, long-horizon task planning, and tool collaboration, and it achieves leading results among open-source models of its size on several current public benchmark leaderboards.
- Input / 1M tokens: $0.060
- Output / 1M tokens: $0.400
- Cached input / 1M tokens: $0.010
- Context window: 203K tokens
- Provider: Z AI
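The per-token rates above can be combined into a simple per-request cost estimate. The sketch below is illustrative only (not an official SDK helper); it assumes cached input tokens are billed at the cached rate and the rest of the input at the standard rate.

```python
# Illustrative cost estimate for GLM-4.7-Flash using the listed rates.
# Prices are USD per 1M tokens: input $0.060, cached input $0.010, output $0.400.

PRICE_INPUT = 0.060 / 1_000_000
PRICE_CACHED = 0.010 / 1_000_000
PRICE_OUTPUT = 0.400 / 1_000_000

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return the estimated request cost in USD.

    `cached_tokens` is the portion of the input billed at the cached rate;
    the remainder is billed at the standard input rate.
    """
    uncached = input_tokens - cached_tokens
    return (uncached * PRICE_INPUT
            + cached_tokens * PRICE_CACHED
            + output_tokens * PRICE_OUTPUT)

# Example: 100K input tokens (half of them cached) plus 2K output tokens.
cost = estimate_cost(100_000, 2_000, cached_tokens=50_000)
print(f"${cost:.4f}")  # → $0.0043
```

For instance, a 100K-token prompt with half its tokens cached plus a 2K-token response comes to roughly $0.0043, most of it from the uncached input.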
Performance
Median streaming throughput and first-token latency measured by Artificial Analysis.
- Output tokens / sec: 91 t/s
- Time to first token: 0.87 s
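The two figures above give a rough wall-clock estimate for a streamed response: first-token latency plus generation time at the median throughput. A minimal sketch, assuming generation proceeds steadily at the median rate:

```python
# Rough wall-clock estimate for a streamed response, using the median
# figures above: 0.87 s to first token, then a steady 91 tokens/sec.

TTFT_S = 0.87          # median time to first token, seconds
THROUGHPUT_TPS = 91.0  # median output tokens per second

def estimate_latency(output_tokens: int) -> float:
    """Estimated total seconds to stream `output_tokens` tokens."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

print(f"{estimate_latency(500):.1f} s")  # → 6.4 s for a 500-token reply
```

Real responses vary around these medians, so this is a ballpark figure rather than a guarantee.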
Benchmarks
Intelligence, coding, and math indexes plus the underlying evaluation scores.
- Intelligence Index: 30
- Coding Index: 26
- Math Index: —
- MMLU-Pro: —
- GPQA: 58.1%
- HLE: 7.1%
- LiveCodeBench: —
- SciCode: 33.7%
- MATH-500: —
- AIME: —
Benchmarks via Artificial Analysis