DeepSeek: DeepSeek V4 Flash

deepseek/deepseek-v4-flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads, while maintaining strong reasoning and coding performance.

The model includes hybrid attention for efficient long-context processing. Reasoning efforts high and xhigh are supported; xhigh maps to max reasoning. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.

Modalities

Input Price

$0.14per 1M

Output Price

$0.28per 1M

Context

1M

Weekly Tokens

1.27T

Released

Apr 24, 2026

DeepSeek: DeepSeek V4 Flash