TrendForce News operates independently from our research team, curating key semiconductor and tech updates to support timely, informed decisions.
While local media spotlight Huawei’s push to cut China’s HBM dependency for AI inference, the tech giant made waves on August 12th with the launch of UCM (Unified Computing Memory), an AI inference breakthrough that slashes latency and costs while turbocharging efficiency, according to mydrivers and Securities Times.
Notably, the reports suggest Huawei will open-source UCM in September 2025, launching it first in the MagicEngine community before contributing it to mainstream inference engines and sharing it with Share Everything storage vendors and ecosystem partners.
UCM’s Game-Changing Features
Securities Times, citing Jason Cao, CEO of Huawei Digital Finance, notes that high latency and high costs remain the primary challenges facing AI inference development today. As the report points out, leading international models currently achieve single-user output speeds of 200 tokens per second (5ms latency), while China’s models typically fall below 60 tokens per second (50-100ms latency).
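For context, per-token latency is simply the inverse of single-user throughput; the quick calculation below (ours, not from the reports) shows how the cited figures relate.

```python
def per_token_latency_ms(tokens_per_second: float) -> float:
    """Average per-token latency implied by a single-user output speed."""
    return 1000.0 / tokens_per_second

print(per_token_latency_ms(200))  # 5.0 ms per token, matching the cited figure
print(per_token_latency_ms(60))   # ~16.7 ms per token; the 50-100 ms range cited
                                  # for Chinese models may refer to a different
                                  # measurement point, such as first-token delay
```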
As per the reports, Huawei describes UCM as an AI inference acceleration toolkit centered on KV (Key-Value) Cache technology. The system is said to combine multiple cache optimization algorithms to intelligently manage the KV Cache memory data produced during AI processing. This approach expands inference context windows and achieves high-throughput, low-latency performance while lowering per-token inference costs, the reports add.
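Huawei has not published UCM’s internals, but the general idea of a KV Cache manager can be sketched roughly as follows. This is a minimal illustration using a simple least-recently-used eviction policy; all names and data structures are hypothetical, not Huawei’s.

```python
from collections import OrderedDict

class KVCacheManager:
    """Toy KV cache: keeps the most recently used entries, evicts the rest.

    Real systems store key/value tensors per attention layer; plain lists
    stand in for tensors here to keep the sketch dependency-free.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._cache: OrderedDict[str, list] = OrderedDict()

    def put(self, request_id: str, kv_blocks: list) -> None:
        self._cache[request_id] = kv_blocks
        self._cache.move_to_end(request_id)
        while len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict the least recently used entry

    def get(self, request_id: str):
        if request_id in self._cache:
            self._cache.move_to_end(request_id)  # mark as recently used
            return self._cache[request_id]
        return None
```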
Securities Times reports that UCM automatically distributes cached data across HBM, DRAM, and SSD storage based on memory heat patterns. By combining multiple sparse attention algorithms, the system reportedly optimizes computing and storage coordination, delivering 2-22x higher TPS (tokens per second) in long-sequence scenarios while cutting per-token costs.
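Again as a rough sketch rather than Huawei’s actual algorithm, heat-based tiering can be pictured as placing the most frequently accessed cache blocks in HBM, spilling colder blocks to DRAM and then SSD. Tier capacities and names below are illustrative only.

```python
from dataclasses import dataclass

# Tiers ordered fastest to slowest; block capacities are purely illustrative.
TIERS = [("HBM", 8), ("DRAM", 64), ("SSD", 4096)]

@dataclass
class CacheBlock:
    block_id: str
    hits: int = 0  # access count stands in for the "memory heat" signal

def assign_tiers(blocks: list[CacheBlock]) -> dict[str, str]:
    """Place the hottest blocks in the fastest tier, spilling colder ones down."""
    ranked = sorted(blocks, key=lambda b: b.hits, reverse=True)
    placement: dict[str, str] = {}
    start = 0
    for tier_name, capacity in TIERS:
        for block in ranked[start:start + capacity]:
            placement[block.block_id] = tier_name
        start += capacity
    return placement
```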
Meanwhile, Huawei officials cited by the report explain that in multi-turn conversations and knowledge search applications, the system directly accesses previously stored data instead of recalculating everything, reducing initial response delays by up to 90%.
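The mechanism described here resembles prefix or conversation caching: if the KV state for an earlier part of a dialogue is already stored, only the newly appended turn needs fresh computation. The simplified sketch below uses hypothetical names and a placeholder KV state, and is not UCM’s actual implementation.

```python
import hashlib

class ConversationCache:
    """Maps a conversation prefix to its stored KV state so a follow-up
    turn only needs to process the newly added tokens."""

    def __init__(self) -> None:
        self._store: dict[str, object] = {}

    @staticmethod
    def _key(prefix: str) -> str:
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def lookup(self, prefix: str):
        return self._store.get(self._key(prefix))

    def save(self, prefix: str, kv_state: object) -> None:
        self._store[self._key(prefix)] = kv_state

cache = ConversationCache()
history = "User: What is HBM?\nAssistant: High Bandwidth Memory ..."
if cache.lookup(history) is None:
    # Cache miss: the whole history must be prefilled (the expensive path).
    cache.save(history, kv_state="<kv tensors for the full history>")
# Next turn: the stored state is reused, so only the new question is processed.
reused = cache.lookup(history)
```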
Less HBM Dependency
As per EETimes China, Huawei’s new technology not only boosts AI inference efficiency but could also reduce reliance on HBM memory, enhancing domestic AI large-model inference performance and strengthening China’s AI inference ecosystem.
EETimes China notes that starting January 2, 2025, the U.S. bans exports of HBM2E and higher-grade HBM chips to China. This ban covers not only HBM chips made in the U.S. but also those produced overseas using American technology.
Huawei’s breakthroughs in AI inference are not new. According to the report, the company has achieved multiple milestones, including the DeepSeek open-source inference solution developed with Peking University and several performance improvements on its Ascend platform. Additionally, Huawei’s partnership with iFlytek has delivered notable results, enabling large-scale expert distribution for MoE (Mixture of Experts) models on domestic computing infrastructure, tripling inference speed while cutting response delays by half, the report adds.
(Photo credit: Huawei)