Google

Gemma 4 26B A4B (free)

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Of its 25.2B total parameters, only 3.8B activate per token during inference, delivering near-31B quality at...
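
To make the total-versus-active figures concrete, here is a minimal sketch of the arithmetic, using only the two parameter counts quoted in the description. The function name `active_fraction` is hypothetical and chosen for illustration. The resulting ratio is a rough proxy for per-token compute, not a measured speedup, and all weights typically still need to be held in memory.

```python
# Parameter counts taken from the description above, in billions.
TOTAL_PARAMS_B = 25.2
ACTIVE_PARAMS_B = 3.8


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used for a single token.

    Hypothetical helper for illustration; not part of any Gemma API.
    """
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 15.1%"
```

In other words, roughly one parameter in seven participates in any given token, which is where the "A4B" (about 4B active) in the model name comes from.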

Input / 1M tokens: Free
Output / 1M tokens: Free
Context window: 262K tokens
Provider: Google

Performance

Median streaming throughput and first-token latency measured by Artificial Analysis.

Output tokens / sec: —
Time to first token: —