
[News] Beyond Computing Power: AI Hardware Selection as a New Enterprise Challenge


2026-03-09 Emerging Technologies editor

Over the past few years, generative AI has swept across the globe. Large language models such as ChatGPT rapidly penetrated both consumer and enterprise markets, triggering an arms race centered on computing power and GPU acquisition. However, as AI applications move from experimentation to real-world deployment, the spotlight is shifting from generative AI toward agentic AI and physical AI. The industry increasingly recognizes that the real challenge lies not in blindly pursuing raw computing power but in how effectively that power is utilized: matching AI workloads to the right tasks and deploying them in the most suitable environments.

Broadly speaking, current AI workloads can be categorized into two core types: centralized data center AI and distributed edge AI. Data center AI workloads are typically characterized by massive model parameters and extraordinary data throughput, making them highly sensitive to computational performance and memory bandwidth. By contrast, edge AI emphasizes local, real-time data processing, aligning more closely with actual industry requirements. This has opened up significant opportunities across smart manufacturing, intelligent transportation, smart healthcare, and other application domains.

Among common workloads, AI training and large-scale inference in data centers typically run on NVIDIA B200 Tensor Core GPUs (Blackwell) or GB200 NVL72 systems. Mainstream inference tasks often rely on NVIDIA L4 Tensor Core GPUs, while multi-task acceleration scenarios can be supported with NVIDIA L40S GPUs. For R&D validation or workstation-level needs, the NVIDIA RTX PRO 6000 Blackwell Workstation Edition is a common choice. At the edge, where real-time computing is required, deployment can be flexible, ranging from entry-level Jetson Orin Nano modules to high-end NVIDIA Jetson Thor modules, depending on the environment and performance requirements.
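
To make this matching concrete, here is a minimal Python sketch that encodes the pairings above as a simple lookup table. The workload category keys and the suggest_hardware helper are hypothetical names invented for illustration; only the hardware products are taken from the selections described above.

    # Illustrative sketch only, not a vendor API: the category keys and
    # helper name below are hypothetical; the hardware pairings are the
    # ones described in this article.

    HARDWARE_BY_WORKLOAD = {
        # Centralized data center AI
        "training_or_large_scale_inference": [
            "NVIDIA B200 Tensor Core GPU (Blackwell)",
            "NVIDIA GB200 NVL72",
        ],
        "mainstream_inference": ["NVIDIA L4 Tensor Core GPU"],
        "multi_task_acceleration": ["NVIDIA L40S GPU"],
        "rd_validation_workstation": [
            "NVIDIA RTX PRO 6000 Blackwell Workstation Edition",
        ],
        # Distributed edge AI: choose within a range by performance needs
        "edge_real_time": [
            "NVIDIA Jetson Orin Nano (entry-level)",
            "NVIDIA Jetson Thor (high-end)",
        ],
    }

    def suggest_hardware(workload: str) -> list[str]:
        """Return the candidate hardware list for a workload category."""
        try:
            return HARDWARE_BY_WORKLOAD[workload]
        except KeyError as exc:
            known = ", ".join(sorted(HARDWARE_BY_WORKLOAD))
            raise ValueError(
                f"unknown workload {workload!r}; expected one of: {known}"
            ) from exc

    if __name__ == "__main__":
        print(suggest_hardware("mainstream_inference"))
        print(suggest_hardware("edge_real_time"))

In practice, a real selection process would also weigh performance per watt, cost structure, and form factor, considerations this article returns to below.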

Overcoming Deployment Challenges: CES 2026 Puts the Spotlight on Edge and Physical AI

CES 2026, often regarded as a barometer of technology trends, clearly reflected the industry's focus on physical AI and edge AI. Physical AI enables vehicles, robots, and other devices to perceive and understand the real world, allowing them to operate safely and reliably in real-world environments. Edge AI, meanwhile, continues to push intelligence closer to end users, enabling more responsive, privacy-preserving, and personalized experiences on everyday devices.

However, deploying AI at the edge still faces certain challenges. Many application scenarios have strict data security requirements—for example, personal information in healthcare or production data in factories—making it unsuitable to send all data to the cloud. Deploying models locally not only reduces the risk of data leakage and enhances cybersecurity, but also significantly lowers latency, ensuring real-time computing needs are met. Moreover, processing data on-site gives companies greater control over their information. For fields such as smart manufacturing and smart healthcare, this approach offers the dual benefits of security and efficiency.

Beyond hardware costs, enterprises must also contend with questions around hardware selection, large-scale deployment and management of edge devices, and ongoing model updates and iteration. Enabling edge AI at scale therefore requires more than simply relocating data center equipment—it demands edge AI hardware with a cost structure and form factor that align with real enterprise needs.

Encouragingly, solutions unveiled at CES suggest the industry is moving closer to this goal. Phison, for example, showcased the world’s first iGPU PC platform powered by its aiDAPTIV+ solution, enabling laptops, desktops, and mini PCs to run large AI models with a lower barrier to entry. Qualcomm introduced the Dragonwing IQ10 platform, partnering with industrial computing leader Advantech to target opportunities in physical robotics. Meta also demonstrated how its latest-generation Ray-Ban smart glasses can seamlessly integrate into everyday life.

Across these solutions, the emphasis has clearly shifted toward performance per watt, cost efficiency, long-term stability, and real-world integration into industrial and consumer environments.

Tech Giants Accelerate Their AI Hardware Strategies

On the data center front, one of the most closely watched developments is NVIDIA’s Vera Rubin supercomputing platform, a major upgrade following the Blackwell platform launched in March 2024. NVIDIA describes Vera Rubin as a milestone marking AI’s transition into an “industrialized” phase—one that prioritizes not only higher performance but also sustained stability, reliability, and endurance under long-term, high-load operations.

The Vera Rubin platform comprises six newly designed components: the Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 data processing unit (DPU), and Spectrum-6 Ethernet switch. Compared with the previous generation, these components deliver significant gains in performance, efficiency, and memory capability.

NVIDIA CEO Jensen Huang has emphasized that demand for both AI training and inference is growing explosively, and that the launch of Rubin comes at a pivotal moment. With NVIDIA’s annual cadence of new AI supercomputing platforms and its tightly integrated co-design across six major chips, Rubin represents a major step toward the next frontier of AI computing.

Competitor AMD is responding with its Instinct MI400 series GPUs, positioning the MI450 as a direct challenger to NVIDIA's Vera Rubin platform. AMD's upcoming AI hardware roadmap is expected to be a key area of competition this year.

Cloud service providers (CSPs), meanwhile, are increasingly turning to custom AI chips (ASICs) to reduce costs and lessen their dependence on NVIDIA. Google’s TPU ecosystem has expanded particularly rapidly, now fully integrated with Google Cloud and the Gemini AI ecosystem. Today, approximately 75% of Gemini’s computational workloads are already running on TPUs.

Qualcomm has also entered the data center race, challenging both NVIDIA and AMD—but from the opposite direction. Rather than expanding from the cloud to the edge, Qualcomm is extending its edge AI expertise into the data center. Last year, the company announced its AI200 and AI250 accelerators, officially entering the AI server market. These chips are primarily designed for inference workloads—executing AI models rather than training them—with mass production of the AI200 expected to begin in 2026.

Taken together, the rise of ASICs and Qualcomm's strategy highlight a broader industry shift toward the AI inference era, in which efficiency, cost, flexibility, and workload suitability matter more than peak performance alone. NVIDIA has responded in part with the BlueField-4 DPU, which offloads networking, storage, and security tasks from the main processors to enable more efficient large-scale inference processing.

Addressing the growing momentum behind ASICs, Jensen Huang has expressed confidence that NVIDIA’s rapid pace of innovation and superior efficiency translate into lower overall costs. In his view, the highest performance ultimately delivers the lowest cost, reinforcing NVIDIA’s competitive advantage—and he does not believe ASIC shipment volumes will surpass those of NVIDIA GPUs.

The True Cost Is Not Computing Power, but Using the Wrong AI

As edge AI deployments continue to expand, the AI ecosystem will become increasingly layered, with clearer divisions of labor. Not every company needs to train foundation models, and not every country is suited to building hyperscale data centers. Yet nearly every industry has its own edge AI use cases. The key lies in defining tasks clearly and deploying the right tools for the right jobs.

For most companies, rather than racing to acquire more computing power, the first step is to consider the real-world problems AI is meant to solve. True competitiveness lies not in the size of the model but in understanding the scenario and matching the right hardware and software to it. In the AI era, the real cost is not computing power itself but the consequences of applying the wrong compute to the wrong scenario. Vendors that can deliver solutions and products tailored to specific needs help customers avoid wasted resources, making them the natural partners of choice.

More AI solutions: https://bit.ly/4aaCFoy

(Photo credit: FREEPIK)

