Chinese AI startup DeepSeek has made the 75% price cut on its flagship V4-Pro AI model permanent, a move widely seen as one of the most disruptive pricing decisions in the AI industry so far. The company said that when its promotional pricing period ends on May 31, the lower rates will stay in place rather than revert to the original prices. DeepSeek V4-Pro will therefore continue to cost only 25% of its launch price, further intensifying the AI price war between Chinese and American firms.

The scale of the reduction is unusually aggressive for a frontier-grade AI system. DeepSeek originally priced V4-Pro at around $1.74 per million uncached input tokens and $3.48 per million output tokens. Under the permanent structure, those prices fall to about $0.435 and $0.87, respectively. Input cache-hit pricing, which matters most for repeated prompts and long-running AI agents, was cut even more sharply, in some cases to one-tenth of its previous cost. In yuan terms, DeepSeek’s API pricing now reportedly ranges from 0.025 yuan to 6 yuan per million tokens, down from earlier rates of 0.1 yuan to 24 yuan. For enterprise developers processing billions of tokens each month, the savings could reach millions of dollars annually at the highest volumes.
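To put the per-token figures in context, the short Python sketch below compares a monthly bill at the launch and permanent rates quoted above. The workload (20 billion input tokens and 5 billion output tokens per month) is a hypothetical assumption chosen only for illustration, and the calculation ignores cache-hit pricing, whose exact rates are not given here.

```python
# Prices in USD per million tokens, as quoted in the article.
LAUNCH = {"input": 1.74, "output": 3.48}
PERMANENT = {"input": 0.435, "output": 0.87}


def monthly_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one month of usage at the given prices."""
    input_cost = (input_tokens / 1_000_000) * prices["input"]
    output_cost = (output_tokens / 1_000_000) * prices["output"]
    return input_cost + output_cost


# Hypothetical workload (an assumption, not a figure from the article).
input_tokens = 20_000_000_000
output_tokens = 5_000_000_000

before = monthly_cost(LAUNCH, input_tokens, output_tokens)
after = monthly_cost(PERMANENT, input_tokens, output_tokens)
monthly_savings = before - after

print(f"Launch pricing:    ${before:,.0f} per month")
print(f"Permanent pricing: ${after:,.0f} per month")
print(f"Savings:           ${monthly_savings:,.0f} per month, "
      f"${monthly_savings * 12:,.0f} per year")
```

Under these assumed volumes, the bill drops from about $52,000 to about $13,000 a month, a saving of roughly $470,000 a year. Savings in the millions would require proportionally larger token volumes.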

The move is especially significant because V4-Pro is not a lightweight or budget model. DeepSeek designed the system as a high-end reasoning and coding model capable of competing with leading Western systems from OpenAI, Anthropic, and Google. The model reportedly uses a Mixture-of-Experts architecture with an estimated 1.6 trillion total parameters, of which around 49 billion are activated during inference, so it retains the capacity of a very large model while reducing the compute needed per token. V4-Pro also supports a one-million-token context window and can generate up to 384,000 tokens in a single request, allowing developers to process entire codebases, long legal archives, scientific datasets, and persistent AI agent memory in one session.
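As a rough illustration of why a Mixture-of-Experts design lowers inference cost, the Python snippet below computes the share of parameters active per token from the reported figures. Both numbers are estimates from the paragraph above, and the share of active parameters is only a proxy for compute cost, not a measurement.

```python
# Reported estimates from the article (not confirmed specifications).
TOTAL_PARAMS = 1.6e12   # about 1.6 trillion total parameters
ACTIVE_PARAMS = 49e9    # about 49 billion activated per token

# Fraction of the model's parameters used for any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Active share of parameters: {active_fraction:.2%}")
```

On these estimates, only about 3% of the model's parameters take part in processing any given token.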

Estimates suggest that after the permanent price cut, DeepSeek V4-Pro may now be between 20 and 35 times cheaper than premium AI models offered by OpenAI, Anthropic, and Google for certain workloads. By comparison, premium frontier models such as GPT-5.5 are estimated to cost around $8 to $15 per million input tokens and $30 to $50 per million output tokens, while Anthropic’s Claude Opus series is believed to be even more expensive for large reasoning and long-context tasks.
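The multiples can be checked against the per-token estimates quoted above. The Python sketch below divides the estimated premium price ranges by V4-Pro's permanent rates. All inputs are the article's estimates, and the dictionary and function names are hypothetical, introduced only for this illustration.

```python
# USD per million tokens. V4-Pro permanent rates and the estimated
# premium-model ranges are both taken from the article.
V4_PRO = {"input": 0.435, "output": 0.87}
PREMIUM_RANGE = {"input": (8.0, 15.0), "output": (30.0, 50.0)}


def price_ratio_range(kind):
    """Return (low, high) multiples of the V4-Pro price for a token type."""
    low, high = PREMIUM_RANGE[kind]
    return low / V4_PRO[kind], high / V4_PRO[kind]


for kind in ("input", "output"):
    low, high = price_ratio_range(kind)
    print(f"{kind:>6} tokens: {low:.1f}x to {high:.1f}x the V4-Pro price")
```

On these figures, the input-token multiple comes out at roughly 18 to 34 times, broadly in line with the 20 to 35 times estimate, while the output-token multiple is higher, at roughly 34 to 57 times. The actual gap for a given workload depends on its mix of input, output, and cached tokens.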

A major factor behind the decision is infrastructure. DeepSeek’s V4 series is the company’s first major AI family optimized to run on Huawei’s Ascend AI accelerators instead of relying primarily on Nvidia hardware. Reports indicate that the rising availability of Huawei’s Ascend 950 and 950PR AI supernode systems likely played a major role in giving DeepSeek the confidence to sustain lower pricing permanently. Chinese tech giants, including Tencent, Alibaba, and ByteDance, are reportedly racing to secure those chips following the V4 launch, although production remains constrained because export controls continue to limit China’s access to advanced chipmaking equipment. Huawei reportedly aims to ship around 750,000 Ascend 950PR units during 2026.

The Tech Portal is published by Blue Box Media Private Limited. Our investors have no influence over our reporting.