[News] Cooling Response to NVIDIA’s China-Exclusive Chips as Customers Show Little Interest in Downgraded Models

In order to comply with new regulations on the export of chips to the United States, NVIDIA has been consistently releasing AI chips and graphics cards tailored for the Chinese market.

However, according to sources cited by The Wall Street Journal, major cloud service providers (CSPs) in China such as Alibaba and Tencent have been testing samples of NVIDIA’s special chips since November 2023. These Chinese enterprises have conveyed to NVIDIA that the quantity of chips they plan to order in 2024 will be significantly lower than initially planned.

According to a report from The Wall Street Journal, in October 2023 the United States announced new regulations preventing NVIDIA from selling advanced AI chips to China. However, NVIDIA swiftly developed a “special edition” chip for China, allowing the company to continue selling chips in the Chinese market without violating the regulations.

Nevertheless, NVIDIA is facing another challenge: major Chinese CSPs are not actively purchasing the “downgraded” performance versions of the chips.

Chinese enterprises have been testing the H20, the highest-performance version of NVIDIA’s “special edition” AI chips. Some testers have mentioned that this chip enables efficient data transfer among multiple processors, making it a better choice than domestic alternatives for building the chip clusters required for AI computational workloads.

However, testers also indicate that more H20 chips are needed to compensate for the performance gap relative to previous NVIDIA chips, which increases their costs.
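The report does not quantify the H20’s performance gap, but the cost logic can be sketched with purely hypothetical numbers: if a downgraded chip delivers only a fraction of a prior-generation chip’s throughput, a buyer needs proportionally more units (and spend) to reach the same aggregate performance.

```python
import math

def chips_needed(target_throughput: float, per_chip_throughput: float) -> int:
    """Units required to reach a target aggregate throughput."""
    return math.ceil(target_throughput / per_chip_throughput)

# Hypothetical figures for illustration only; the report gives no exact
# performance ratio between the H20 and earlier NVIDIA chips.
prior_gen_throughput = 100.0       # normalized throughput of one prior-gen chip
downgraded_ratio = 0.25            # assume the downgraded chip delivers 25% of that

target = 8 * prior_gen_throughput  # replace an 8-chip prior-gen cluster
units = chips_needed(target, downgraded_ratio * prior_gen_throughput)
print(units)  # 32 downgraded chips to match the same aggregate throughput
```

Under these assumed numbers, a buyer would need four times as many chips, which is the cost pressure the testers describe.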

The report indicates that in the short term, the performance advantage of NVIDIA’s “downgraded” chips over domestic Chinese products is diminishing, making Chinese-made chips increasingly attractive to buyers.

Informed sources cited in the report suggest that major players like Alibaba and Tencent are redirecting some advanced semiconductor orders to domestic companies and relying more on internally developed chips. This trend is also observed with the other two major chip buyers, Baidu and ByteDance.

Looking ahead in the long term, Chinese customers are uncertain about NVIDIA’s ability to continue supplying them with chips, as U.S. regulatory authorities have committed to regularly reviewing chip export controls, potentially tightening restrictions on chip performance further.

From the perspective of China’s efforts in the independent development of AI chips, TrendForce previously highlighted in its press release that Chinese CSPs like Baidu and Alibaba are actively investing in autonomous AI chip development.

Baidu developed its first self-researched ASIC AI chip, Kunlunxin, in early 2020; the second generation entered mass production in 2021, and the third is expected to launch in 2024. From 2023 onward, Baidu has aimed to use Huawei’s Ascend 910B accelerator chips and to expand the use of Kunlunxin chips in its AI infrastructure.

After Alibaba’s acquisition of CPU IP supplier Zhongtian Micro Systems in April 2018 and the establishment of T-Head Semiconductor in September of the same year, the company began developing its own ASIC AI chips, including the Hanguang 800.

TrendForce reports that T-Head’s initial ASIC chips were co-designed with external companies like GUC. However, after 2023, Alibaba is expected to increasingly leverage its internal resources to enhance the independent design capabilities of its next-gen ASIC chips, primarily for Alibaba Cloud’s AI infrastructure.

According to the data from TrendForce, currently, around 80% of the high-end AI chips used by Chinese cloud computing companies are sourced from NVIDIA. However, in the next five years, this proportion may decrease to 50% to 60%.


(Photo credit: NVIDIA)

Please note that this article cites information from The Wall Street Journal.


[Insights] China Advances In-House AI Chip Development Despite U.S. Controls

On October 17th, the U.S. Department of Commerce announced an expansion of its export controls, further tightening restrictions. In addition to previously restricted products like the NVIDIA A100, H100, and AMD MI200 series, the updated measures now cover a broader range, encompassing the NVIDIA A800, H800, L40S, L40, L42, AMD MI300 series, Intel Gaudi 2/3, and more, hindering their import into China. This move is expected to hasten the adoption of domestically developed chips by Chinese cloud service providers (CSPs).

TrendForce’s Insights:

  1. Chinese CSPs Strategically Invest in Both In-House Chip Development and Related Companies

In terms of the in-house chip development strategy of Chinese CSPs, Baidu announced the completion of tape-out for the first-generation Kunlun chip in 2019, built on its XPU architecture. It entered mass production in early 2020, with the second generation in production by 2021, boasting a 2-3 times performance improvement. The third generation is expected to be released in 2024. Aside from independent R&D, Baidu has invested in related companies like Nebula-Matrix, Phytium, and Smartnvy. In March 2021, Baidu also established Kunlunxin by spinning off its AI chip business.

Alibaba, in April 2018, fully acquired Chinese CPU IP supplier C-Sky and established T-Head Semiconductor in September of the same year. T-Head’s first self-developed chip, the Hanguang 800, was launched in September 2020. Alibaba has also invested in Chinese memory giant CXMT and AI IC design companies such as Vastaitech and Cambricon.

Tencent initially adopted an investment strategy, backing Chinese AI chip company Enflame Tech in 2018. In 2020, it established the Tencent Cloud and Smart Industries Group (CSIG), focusing on IC design and R&D. In November 2021, Tencent introduced its AI inference chip, Zixiao, which utilizes 2.5D packaging and targets image and video processing, natural language processing, and search recommendation.

Huawei’s HiSilicon unveiled the Ascend 910 in August 2019, accompanied by the AI open-source computing framework MindSpore. However, due to Huawei’s inclusion on the U.S. entity list, the Ascend 910 faced production restrictions. In August 2023, Chinese tech company iFLYTEK jointly introduced the “StarDesk AI Workstation” with Huawei, featuring the new Ascend 910B AI chip, likely manufactured using SMIC’s N+2 process and signifying Huawei’s return to self-developed AI chips.

  2. Some Chinese Companies Turn to Purchasing Huawei’s Ascend 910B, Yet It Lags Behind A800

Huawei’s AI chips are not solely for internal use but are also sold to other Chinese companies. Baidu reportedly ordered 1,600 Ascend 910B chips from Huawei in August, valued at approximately 450 million RMB, to be used in 200 Baidu data center servers. The delivery is expected to be completed by the end of 2023, with over 60% of orders delivered as of October. This indicates Huawei’s capability to sell AI chips to other Chinese companies.

Huawei’s Ascend 910B, released in the second half of 2023, boasts hardware figures comparable to the NVIDIA A800. According to tests conducted by Chinese companies, its performance is around 80% of the A800’s. In terms of software ecosystem, however, Huawei still faces a significant gap compared to NVIDIA.

Overall, using Ascend 910B for AI training may be less efficient than A800. Yet with the tightening U.S. policies, Chinese companies are compelled to turn to Ascend 910B. As user adoption increases, Huawei’s ecosystem is expected to improve gradually, leading more Chinese companies to adopt its AI chips. Nevertheless, this will be a protracted process.



[News] Chinese Companies Brace for New U.S. Chip Ban by Stockpiling

The United States has elevated its efforts to curtail the advancement of high-end chips in China. As reported by the CLS News, various companies within China have indicated they received advance notifications and have already amassed chip stockpiles. Analysts suggest that this new wave of bans implies a further restriction by the U.S. on China’s computational capabilities, making the development of domestically-manufactured GPUs in China a matter of utmost importance.

According to the latest regulations, chips including NVIDIA’s A800 and H800 will be impacted by the export ban to China. An insider from a Chinese server company revealed that they received the ban notice at the beginning of October and have already stockpiled a sufficient quantity, though they anticipate substantial pressure in the near future. The procurement manager for a downstream customer of Inspur noted that Inspur had proactively shared this information and urged potential buyers to act promptly if they require related products.

Larger companies like Tencent and Baidu are less affected by the ban due to their ample stockpiles. On October 17th, HiRain Technologies announced that its subsidiary had purchased 75 units of the H800 and 22 units of the A800 from supplier A, and had resolved the matter two weeks earlier.

(Image: NVIDIA)


HBM/CXL Emerge in Response to Demand for Optimized Hardware Used in AI-driven HPC Applications, Says TrendForce

According to TrendForce’s latest report on the server industry, emerging applications in recent years have not only accelerated the pace of AI and HPC development, but have also driven corresponding growth in the complexity of machine-learning models and the sophistication of their inference calculations, resulting in more data to be processed. Confronted with an ever-growing volume of data and the constraints of existing hardware, users must make tradeoffs among performance, memory capacity, latency, and cost. HBM (High Bandwidth Memory) and CXL (Compute Express Link) have emerged in response to this conundrum. In terms of functionality, HBM is a new type of DRAM that addresses more diverse and complex computational needs via its high I/O speeds, whereas CXL is an interconnect standard that allows different processors, or xPUs, to more easily share the same memory resources.

HBM breaks through bandwidth limitations of traditional DRAM solutions through vertical stacking of DRAM dies

Memory suppliers developed HBM to break free from the bandwidth constraints of traditional memory solutions. Regarding memory architecture, HBM consists of a base logic die with DRAM dies vertically stacked on top of it. The 3D-stacked DRAM dies are interconnected with TSVs (through-silicon vias) and microbumps, enabling HBM’s high-bandwidth design. Mainstream HBM memory stacks involve four or eight DRAM die layers, referred to as “4-hi” or “8-hi”, respectively. Notably, the latest HBM product currently in mass production is HBM2e. This generation of HBM contains four or eight layers of 16Gb DRAM dies, resulting in a memory capacity of 8GB or 16GB per single HBM stack, respectively, with a bandwidth of 410-460GB/s. Samples of the next generation of HBM products, named HBM3, have already been submitted to relevant organizations for validation, and these products will likely enter mass production in 2022.
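The per-stack capacities quoted above follow directly from the die density and the layer count; the arithmetic can be sketched as follows (densities in gigabits, capacities in gigabytes):

```python
def hbm_stack_capacity_gb(die_density_gbit: int, num_layers: int) -> float:
    """Capacity of one HBM stack: each DRAM die contributes its density
    in gigabits; divide by 8 to convert bits to bytes."""
    return die_density_gbit * num_layers / 8

print(hbm_stack_capacity_gb(16, 4))  # 8.0  -> "4-hi" HBM2e stack, 8GB
print(hbm_stack_capacity_gb(16, 8))  # 16.0 -> "8-hi" HBM2e stack, 16GB
```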

TrendForce’s investigations indicate that HBM comprises less than 1% of total DRAM bit demand for 2021 primarily because of two reasons. First, the vast majority of consumer applications have yet to adopt HBM due to cost considerations. Second, the server industry allocates less than 1% of its hardware to AI applications; more specifically, servers that are equipped with AI accelerators account for less than 1% of all servers currently in use, not to mention the fact that most AI accelerators still use GDDR5(x) and GDDR6 memories, as opposed to HBM, to support their data processing needs.

Although HBM currently remains in the developmental phase, as applications become increasingly reliant on AI usage (more precise AI needs to be supported by more complex models), computing hardware will require the integration of HBM to operate these applications effectively. In particular, FPGAs and ASICs represent the two hardware categories most closely related to AI development, with Intel’s Stratix and Agilex-M as well as Xilinx’s Versal HBM being examples of FPGAs with onboard HBM. Regarding ASICs, on the other hand, most CSPs are gradually adopting their own self-designed ASICs, such as Google’s TPU, Tencent-backed Enflame’s DTU, and Baidu’s Kunlun – all of which are equipped with HBM – for AI deployments. In addition, Intel will also release a high-end version of its Sapphire Rapids server CPU equipped with HBM by the end of 2022. Taking these developments into account, TrendForce believes that an increasing number of HBM applications will emerge going forward due to HBM’s critical role in overcoming hardware-related bottlenecks in AI development.

A new memory standard born out of demand from high-speed computing, CXL will be more effective in integrating the resources of a whole system

Evolved from PCIe Gen5, CXL is a memory standard that provides high-speed and low-latency interconnections between the CPU and other accelerators such as the GPU and FPGA. It enables memory virtualization so that different devices can share the same memory pool, thereby raising the performance of a whole computer system while reducing its cost. Hence, CXL can effectively deal with the heavy workloads related to AI and HPC applications.
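CXL itself is a hardware interconnect protocol, but the resource-sharing idea described above can be illustrated with a toy model (not CXL’s actual interface): devices borrow from, and return capacity to, a single shared pool instead of each holding stranded private memory.

```python
class SharedMemoryPool:
    """Toy model of pooled memory: any device can borrow from one shared
    pool, and capacity released by one device is reusable by another."""

    def __init__(self, total_gb: int):
        self.total_gb = total_gb
        self.allocations: dict[str, int] = {}

    def free_gb(self) -> int:
        return self.total_gb - sum(self.allocations.values())

    def allocate(self, device: str, gb: int) -> None:
        if gb > self.free_gb():
            raise MemoryError(f"pool exhausted: only {self.free_gb()}GB free")
        self.allocations[device] = self.allocations.get(device, 0) + gb

    def release(self, device: str) -> int:
        return self.allocations.pop(device, 0)

pool = SharedMemoryPool(512)
pool.allocate("cpu", 128)
pool.allocate("gpu", 256)
pool.release("gpu")          # the GPU's 256GB returns to the pool...
pool.allocate("fpga", 300)   # ...and can immediately back an FPGA workload
print(pool.free_gb())        # 84
```

The design point the sketch captures is that freed capacity is not tied to the device that held it, which is what lets memory pooling raise whole-system utilization and lower cost.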

CXL is just one of several interconnection technologies that feature memory sharing. Other examples that are also in the market include NVLink from NVIDIA and Gen-Z from AMD and Xilinx. Their existence is an indication that the major ICT vendors are increasingly attentive to the integration of various resources within a computer system. TrendForce currently believes that CXL will come out on top in the competition mainly because it is introduced and promoted by Intel, which has an enormous advantage with respect to the market share for CPUs. With Intel’s support in the area of processors, CXL advocates and hardware providers that back the standard will be effective in organizing themselves into a supply chain for the related solutions. The major ICT companies that have in turn joined the CXL Consortium include AMD, ARM, NVIDIA, Google, Microsoft, Facebook (Meta), Alibaba, and Dell. All in all, CXL appears to be the most favored among memory protocols.

The consolidation of memory resources among the CPU and other devices can reduce communication latency and boost the computing performance needed for AI and HPC applications. For this reason, Intel will provide CXL support for its next-generation server CPU Sapphire Rapids. Likewise, memory suppliers have also incorporated CXL support into their respective product roadmaps. Samsung has announced that it will be launching CXL-supported DDR5 DRAM modules that will further expand server memory capacity so as to meet the enormous resource demand of AI computing. There is also a chance that CXL support will be extended to NAND Flash solutions in the future, thus benefiting the development of both types of memory products.

Synergy between HBM and CXL will contribute significantly to AI development; their visibility will increase across different applications starting in 2023

TrendForce believes that the market penetration rate of CXL will rise going forward as this interface standard is built into more and more CPUs. Also, the combination of HBM and CXL will be increasingly visible in the future hardware designs of AI servers. In the case of HBM, it will contribute to a further ramp-up of data processing speed by increasing the memory bandwidth of the CPU or the accelerator. As for CXL, it will enable high-speed interconnections among CPU and other devices. By working together, HBM and CXL will raise computing power and thereby expedite the development of AI applications.

The latest advances in memory pooling and sharing will help overcome the current hardware bottlenecks in the designs of different AI models and continue the trend of more sophisticated architectures. TrendForce anticipates that the adoption rate of CXL-supported Sapphire Rapids processors will reach a certain level, and memory suppliers will also have put their HBM3 products and their CXL-supported DRAM and SSD products into mass production. Hence, examples of HBM-CXL synergy in different applications will become increasingly visible from 2023 onward.

For more information on reports and market data from TrendForce’s Department of Semiconductor Research, please click here, or email Ms. Latte Chung from the Sales Department at lattechung@trendforce.com


GCP, AWS Projected to Become Main Drivers of Global Server Demand with 25-30% YoY Increase in Server Procurement, Says TrendForce

Thanks to their flexible pricing schemes and diverse service offerings, CSPs have been a direct, major driver of enterprise demand for cloud services, according to TrendForce’s latest investigations. As such, the rise of CSPs has in turn brought about a gradual shift in the prevailing business model of server supply chains, from sales of traditional branded servers (that is, server OEMs) to ODM Direct sales instead.

Incidentally, the global public cloud market operates as an oligopoly dominated by North American companies including Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), which collectively possess an above-50% share in this market. More specifically, GCP and AWS are the most aggressive in their data center build-outs. Each of these two companies is expected to increase its server procurement by 25-30% YoY this year, followed closely by Azure.

TrendForce indicates that, in order to expand the presence of their respective ecosystems in the cloud services market, the aforementioned three CSPs have begun collaborating with various countries’ domestic CSPs and telecom operators in compliance with data residency and data sovereignty regulations. For instance, thanks to the accelerating digital transformation efforts taking place in the APAC region, Google is ramping up its supply chain strategies for 2021.

As part of Google’s efforts at building out and refreshing its data centers, not only is the company stocking up on more weeks’ worth of memory products, but it has also been increasing its server orders since 4Q20, in turn leading its ODM partners to expand their SMT capacities. As for AWS, the company has benefitted from activities driven by the post-pandemic new normal, including WFH and enterprise cloud migrations, both of which are major sources of data consumption for AWS’ public cloud.

Conversely, Microsoft Azure will adopt a relatively more cautious and conservative approach to server procurement, likely because the Ice Lake-based server platforms used to power Azure services have yet to enter mass production. In other words, only after these Ice Lake servers enter mass production will Microsoft likely ramp up its server procurement in 2H21, during which TrendForce expects Microsoft’s peak server demand to take place, resulting in a 10-15% YoY growth in server procurement for the entirety of 2021.

Finally, compared to its three competitors, Facebook will experience relatively more stable growth in server procurement owing to two factors. First, the implementation of the GDPR in the EU and the resultant data sovereignty implications mean that data gathered on EU residents are now subject to their respective country’s legal regulations, and therefore more servers are now required to keep up with the domestic data processing and storage needs that arise from the GDPR. Second, most servers used by Facebook are custom spec’ed to the company’s requirements, and Facebook’s server needs are accordingly higher than its competitors’. As such, TrendForce forecasts a double-digit YoY growth in Facebook’s server procurement this year.

Chinese CSPs are limited in their pace of expansions, while Tencent stands out with a 10% YoY increase in server demand

On the other hand, Chinese CSPs are expected to be relatively weak in terms of server demand this year due to their relatively limited pace of expansion and service areas. Case in point, Alicloud is currently planning to procure the same volume of servers as it did last year, and the company will ramp up its server procurement going forward only after the Chinese government implements its new infrastructure policies. Tencent, which is the other dominant Chinese CSP, will benefit from increased commercial activities from domestic online service platforms, including JD, Meituan, and Kuaishou, and therefore experience a corresponding growth in its server colocation business.

Tencent’s demand for servers this year is expected to increase by about 10% YoY. Baidu will primarily focus on autonomous driving projects this year; there will be a slight YoY increase in Baidu’s server procurement for 2021, mostly thanks to its increased demand for roadside servers used in autonomous driving applications. Finally, with regards to ByteDance, its server procurement will undergo a 10-15% YoY decrease, since it will adopt colocation services rather than run its own servers in overseas markets due to its shrinking presence there.

Looking ahead, TrendForce believes that as enterprise clients become more familiar with various cloud services and related technologies, the competition in the cloud market will no longer be confined within the traditional segments of computing, storage, and networking infrastructure. The major CSPs will pay greater attention to the emerging fields such as edge computing as well as the software-hardware integration for the related services.

With the commercialization of 5G services that is taking place worldwide, the concept of “cloud, edge, and device” will replace the current “cloud” framework. This means that cloud services will not be limited to software in the future because cloud service providers may also want to offer their branded hardware in order to make their solutions more comprehensive or all-encompassing. Hence, TrendForce expects hardware to be the next battleground for CSPs.

