[Insights] Microsoft Unveils In-House AI Chip, Poised for Competitive Edge with a Powerful Ecosystem

Microsoft announced its in-house AI chip, Azure Maia 100, at the Ignite developer conference in Seattle on November 15, 2023. The chip is designed to power OpenAI models, Bing, GitHub Copilot, ChatGPT, and other AI services. Support for Copilot and Azure OpenAI is expected to commence in early 2024.

TrendForce’s Insights:

  1. Speculating on the Emphasis of Maia 100 on Inference, Microsoft’s Robust Ecosystem Advantage is Poised to Emerge Gradually

Microsoft has not disclosed detailed specifications for Azure Maia 100. Currently, it is known that the chip will be manufactured using TSMC’s 5nm process, featuring 105 billion transistors and supporting at least INT8 and INT4 precision formats. While Microsoft has indicated that the chip will be used for both training and inference, the computational formats it supports suggest a focus on inference applications.

This inference focus is suggested by Maia 100's support for INT4, a low-precision computational format uncommon among other CSP manufacturers' AI ASICs. Lower precision reduces power consumption and shortens inference times, enhancing efficiency; the drawback is a sacrifice in accuracy.
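The efficiency-versus-accuracy trade-off of low-precision formats can be sketched with a minimal symmetric quantization example. This is purely illustrative; Maia 100's actual quantization scheme has not been disclosed, and the function names below are hypothetical:

```python
import numpy as np

def quantize(x, bits):
    """Symmetric linear quantization of a float tensor to signed integers."""
    qmax = 2 ** (bits - 1) - 1           # 127 for INT8, 7 for INT4
    scale = np.max(np.abs(x)) / qmax     # one scale factor per tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map integers back to approximate float values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)

for bits in (8, 4):
    q, scale = quantize(weights, bits)
    err = np.abs(dequantize(q, scale) - weights).mean()
    print(f"INT{bits}: mean abs reconstruction error = {err:.4f}")
```

Running this shows the INT4 reconstruction error is markedly larger than INT8's, which is exactly the accuracy-for-efficiency trade the article describes: fewer bits mean smaller, faster arithmetic but a coarser approximation of the original weights.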

Microsoft initiated its in-house AI chip project, “Athena,” in 2019, developing it in collaboration with OpenAI. Like other CSP manufacturers' in-house chips, Azure Maia 100 aims to reduce costs and decrease dependency on NVIDIA. Although Microsoft entered the field of proprietary AI chips later than its primary competitors, its formidable ecosystem is expected to gradually translate into a competitive advantage.

  2. U.S. CSP Manufacturers Unveil In-House AI Chips, Meta Exclusively Adopts RISC-V Architecture

Google led the way with its first in-house AI chip, TPU v1, introduced as early as 2016, and has since iterated to the fifth generation with TPU v5e. Amazon followed suit in 2018 with Inferentia for inference, introduced Trainium for training in 2020, and launched the second generation, Inferentia2, in 2023, with Trainium2 expected in 2024.

Meta plans to debut its inaugural in-house AI chip, MTIA v1, in 2025. Given the releases from major competitors, Meta has expedited its timeline and is set to unveil the second-generation in-house AI chip, MTIA v2, in 2026.

Unlike other CSP manufacturers, which opt for the Arm architecture, Meta adopts the RISC-V architecture for both MTIA v1 and MTIA v2. RISC-V is a fully open-source architecture that requires no instruction-set licensing fees, and its instruction count (approximately 200) is far lower than Arm's (approximately 1,000).

This choice allows chips utilizing the RISC-V architecture to achieve lower power consumption. However, the RISC-V ecosystem is currently less mature, resulting in fewer manufacturers adopting it. Nevertheless, with the growing trend in data centers towards energy efficiency, it is anticipated that more companies will start incorporating RISC-V architecture into their in-house AI chips in the future.

  3. The Battle of AI Chips Ultimately Relies on Ecosystems, Microsoft Poised for Competitive Edge

The competition among AI chips will ultimately hinge on ecosystems. NVIDIA introduced the CUDA architecture in 2006, and it is now nearly ubiquitous in educational institutions; as a result, almost all AI engineers encounter CUDA during their academic tenure.

In 2017, NVIDIA further solidified its ecosystem by launching the RAPIDS AI acceleration integration solution and the GPU Cloud service platform. Notably, over 70% of NVIDIA’s workforce comprises software engineers, emphasizing its status as a software company. The performance of NVIDIA’s AI chips can be further enhanced through software innovations.

Microsoft, for its part, possesses a robust ecosystem of its own, anchored by Windows. The recent Intel Arc A770 GPU showed a 1.7x performance improvement in AI-driven Stable Diffusion when running on Microsoft Olive, demonstrating that, like NVIDIA, Microsoft has the capability to enhance GPU performance through software.

Consequently, Microsoft’s in-house AI chips are poised to achieve superior performance in software collaboration compared to other CSP manufacturers, providing Microsoft with a competitive advantage in the AI competition.



[News] Microsoft First In-House AI Chip “Maia” Produced by TSMC’s 5nm

On the 15th, Microsoft introduced its first in-house AI chip, “Maia.” This move signifies the entry of the world’s second-largest cloud service provider (CSP) into the domain of self-developed AI chips. Concurrently, Microsoft introduced the cloud computing processor “Cobalt,” set to be deployed alongside Maia in selected Microsoft data centers early next year. Both cutting-edge chips are produced using TSMC’s advanced 5nm process, as reported by UDN News.

Amidst the global AI fervor, the trend of CSPs developing their own AI chips has gained momentum. Key players like Amazon, Google, and Meta have already ventured into this territory. Microsoft, positioned as the second-largest CSP globally, joined the league on the 15th, unveiling its inaugural self-developed AI chip, Maia, at the annual Ignite developer conference.

These AI chips developed by CSPs are not intended for external sale; rather, they are exclusively reserved for in-house use. However, given the commanding presence of the top four CSPs in the global market, a significant business opportunity unfolds. Market analysts anticipate that, with the exception of Google—aligned with Samsung for chip production—other major CSPs will likely turn to TSMC for the production of their AI self-developed chips.

TSMC maintains its consistent policy of not commenting on specific customer products and order details.

TSMC’s recent earnings call disclosed that the 5nm process constituted 37% of Q3 shipments this year, the most substantial contribution of any node. Having begun 5nm mass production in 2020, TSMC has introduced various derivative technologies such as N4, N4P, N4X, and N5A in recent years, continually reinforcing its 5nm family capabilities.

Maia is tailored for processing large language models. According to Microsoft, it will initially power the company’s own services, such as the $30-per-month AI assistant Copilot, and offer Azure cloud customers a customizable alternative to Nvidia chips.

Borkar, Corporate VP, Azure Hardware Systems & Infrastructure at Microsoft, revealed that Microsoft has been testing the Maia chip in Bing search engine and Office AI products. Notably, Microsoft has been relying on Nvidia chips for training GPT models in collaboration with OpenAI, and Maia is currently undergoing testing.

Gulia, Executive VP of Microsoft Cloud and AI Group, emphasized that starting next year, Microsoft customers using Bing, Microsoft 365, and Azure OpenAI services will witness the performance capabilities of Maia.

While actively advancing its in-house AI chip development, Microsoft underscores its commitment to offering cloud services to Azure customers utilizing the latest flagship chips from Nvidia and AMD, sustaining existing collaborations.

As for the cloud computing processor Cobalt, it adopts the Arm architecture with 128 cores and boasts capabilities comparable to Intel and AMD processors. Developed with chip designs derived from devices like smartphones for enhanced energy efficiency, Cobalt aims to challenge major cloud competitors, including Amazon.
(Image: Microsoft)


[News] Has the AI Chip Buying Frenzy Cooled Off? Microsoft Rumored to Decrease Nvidia H100 Orders

According to a report by Taiwanese media TechNews, industry sources have indicated that Microsoft has recently reduced its orders for Nvidia’s H100 graphics cards. This move suggests that the demand for H100 graphics cards in the large-scale artificial intelligence computing market has tapered off, and the frenzy of orders from previous customers is no longer as prominent.

In this wave of artificial intelligence trends, the major purchasers of related AI servers come from large-scale cloud computing service providers. Regarding Microsoft’s reported reduction in orders for Nvidia’s H100 graphics cards, market experts point to a key factor being the usage of Microsoft’s AI collaboration tool, Microsoft 365 Copilot, which did not perform as expected.

Another critical factor affecting Microsoft’s decision to reduce orders for Nvidia’s H100 graphics cards is the usage statistics of ChatGPT. Since its launch in November 2022, this generative AI application has experienced explosive growth in usage and has been a pioneer in the current artificial intelligence trend. However, ChatGPT experienced a usage decline for the first time in June 2023.

Industry insiders have noted that the reduction in Microsoft’s H100 graphics card orders was predictable. In May, both server manufacturers and direct customers stated that they would have to wait for over six months to receive Nvidia’s H100 graphics cards. However, in August, Tesla announced the deployment of a cluster of ten thousand H100 graphics cards, meaning that even those who placed orders later were able to receive sufficient chips within a few months. This indicates that the demand for H100 graphics cards, including from customers like Microsoft, has already been met, signifying that the fervent demand observed several months ago has waned.

(Photo credit: Nvidia)


[News] US Tech Giants Unite for AI Server Domination, Boosting Taiwan Supply Chain

According to the news from Commercial Times, in a recent press conference, the four major American cloud service providers (CSPs) collectively expressed their intention to expand their investment in AI application services. Simultaneously, they are continuing to enhance their cloud infrastructure. Apple has also initiated its foray into AI development, and both Intel and AMD have emphasized the robust demand for AI servers. These developments are expected to provide a significant boost to the post-market prospects of Taiwan’s AI server supply chain.

Industry insiders have highlighted the ongoing growth of the AI spillover effect, benefiting various sectors ranging from GPU modules, substrates, cooling systems, power supplies, chassis, and rails, to PCB manufacturers.

The American CSP players, including Microsoft, Google, Meta, and Amazon, which recently released their financial reports, have demonstrated growth in their cloud computing and AI-related service segments in their latest quarterly performance reports. Microsoft, Google, and Amazon are particularly competitive in the cloud services arena, and all have expressed optimistic outlooks for future operations.

The direct beneficiaries among Taiwan’s cloud data center suppliers are those in Tier 1, who are poised to reap positive effects on their average selling prices (ASP) and gross margins, driven by the strong demand for AI servers from these CSP giants in the latter half of the year.

Among them, the ODM manufacturers with over six years of collaboration with NVIDIA in multi-GPU architecture AI high-performance computing/cloud computing, including Quanta, Wistron, Inventec, Foxconn, and Gigabyte, are expected to see operational benefits further reflected in the latter half of the year. Foxconn and Inventec are the main suppliers of GPU modules and GPU substrates, respectively, and are likely to witness noticeable shipment growth starting in the third quarter.

Furthermore, AI servers not only incorporate multiple GPU modules but also differ from standard servers in aspects such as chassis height, weight, and thermal design power (TDP). As a result, cooling solution providers like Asia Vital Components, Auras Technology, and SUNON; power supply companies such as Delta Electronics and Lite-On Technology; chassis manufacturer Chenbro; rail industry player King Slide; and PCB/CCL manufacturers such as EMC and GCE are also poised to benefit from the increasing demand for AI servers.



AI Sparks a Revolution Up In the Cloud

OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Bard, and Elon Musk’s latest TruthGPT – what will be the next buzzword for AI? In just under six months, the AI competition has heated up, stirring up ripples in the once-calm AI server market as AI-generated content (AIGC) models take center stage.

The unprecedented convenience brought by AIGC has attracted a massive number of users, with OpenAI’s mainstream model, GPT-3, receiving up to 25 million daily visits, often resulting in server overload and disconnection issues.

Because the evolution of these models has increased training parameters and data volume, making computational power even scarcer, OpenAI has reluctantly adopted measures such as paid access and traffic restrictions to stabilize server load.

High-end Cloud Computing is gaining momentum

According to TrendForce, AI servers currently have a penetration rate of merely 1% in global data centers, far from sufficient to cope with the surge in data demand from the usage side. Therefore, besides optimizing software to reduce computational load, increasing the number of high-end AI servers will be another crucial solution.

Take GPT-3, for instance: the model requires at least 4,750 AI servers with 8 GPUs each, and every similarly large language model like ChatGPT will need 3,125 to 5,000 units. Considering ChatGPT and Microsoft’s other applications as a whole, the need for AI servers is estimated to reach some 25,000 units to meet basic computing power requirements.
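The arithmetic behind these estimates can be checked in a few lines. The input figures are the article's own; the mapping from total servers to "LLM-scale deployments" is just an illustrative sanity check:

```python
# Figures restated from the text
gpt3_servers = 4750                 # minimum AI servers for GPT-3
gpus_per_server = 8                 # GPUs per AI server
llm_servers_range = (3125, 5000)    # servers per similarly large LLM
microsoft_total = 25000             # estimate for ChatGPT plus other apps

# GPT-3's server floor implies a GPU count
gpt3_gpus = gpt3_servers * gpus_per_server
print(f"GPT-3 implies at least {gpt3_gpus:,} GPUs")

# 25,000 servers corresponds to roughly this many LLM-scale deployments
low = microsoft_total / llm_servers_range[1]
high = microsoft_total / llm_servers_range[0]
print(f"Equivalent LLM-scale deployments: {low:.0f} to {high:.0f}")
```

This works out to 38,000 GPUs for GPT-3 alone, and the 25,000-server estimate for Microsoft's combined applications corresponds to roughly five to eight deployments the size of a single large language model.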

While the emerging applications of AIGC and their vast commercial potential have revealed the technical roadmap moving forward, they have also shed light on bottlenecks in the supply chain.

The down-to-earth problem: cost

Compared to general-purpose servers that use CPUs as their main computational power, AI servers rely heavily on GPUs, with NVIDIA’s DGX A100 and DGX H100, offering computational performance of up to 5 PetaFLOPS, serving as the primary AI server computing platforms. Given that GPUs account for over 70% of server costs, the growing adoption of high-end GPUs has made the architecture more expensive.

Moreover, a significant amount of data transmission occurs during the operation, which drives up the demand for DDR5 and High Bandwidth Memory (HBM). The high power consumption generated during operation also promotes the upgrade of components such as PCBs and cooling systems, which further raises the overall cost.

Not to mention the technical hurdles posed by the complex design architecture – for example, a new approach for heterogeneous computing architecture is urgently required to enhance the overall computing efficiency.

The high cost and complexity of AI servers have inevitably limited their development to large manufacturers. Two leading companies, HPE and Dell, have taken different strategies to enter the market:

  • HPE has continuously strengthened its cooperation with Google and planned to convert all of its products to a service model by 2022. It also acquired the startup Pachyderm in January 2023 to launch cloud-based supercomputing services, making it easier to train and develop large models.
  • In March 2023, Dell launched its latest PowerEdge series servers, which offer options equipped with NVIDIA H100 or A100 Tensor Core GPUs and NVIDIA AI Enterprise. They use 4th Gen Intel Xeon Scalable processors and introduce Dell’s Smart Flow software, catering to demands such as data centers, large public clouds, AI, and edge computing.

With the booming market for AIGC applications, we seem to be one step closer to a future metaverse centered around fully virtualized content. However, it remains unclear whether the hardware infrastructure can keep up with the surge in demand. This persistent challenge will continue to test the capabilities of cloud server manufacturers to balance cost and performance.

(Photo credit: Google)
