AI Era Challenges Server Makers' Profitability


The rise of artificial intelligence (AI) has ignited a new wave of demand for computing power, often described as a “computing power thirst.” This surge is driven primarily by advances in large language models, most recently OpenAI's release of the o1 model. The new model surpasses previous iterations, particularly GPT-4o, across a range of reasoning capabilities.

The key enhancement in o1 is its introduction of chains of thought into the reasoning process. This enables the model to break questions down and answer them in a more interconnected manner, yielding responses that are more reliable and comprehensive than its predecessors'.

However, improved reasoning capability comes with a sharp increase in demand for computing power. This has become a growing concern within the AI research community; as Huawei’s Vice Chairman Xu Zhijun aptly stated, “The biggest challenge we face in AI research is the lack of computing power—AI is fundamentally about brute-force computation.”

In recent years, technology companies have ramped up investment in AI infrastructure to address this issue.

This is evident not only in the soaring stock prices of companies like Nvidia, which provide the necessary technological “shovels,” but also in the impressive earnings reported in the latest quarters by AI server manufacturers that support the AI ecosystem.

As the demand for computing power continues to surge, AI server vendors are strategizing to maximize their earnings through innovative solutions. The rapid revenue growth of these companies is a testament to the deep integration between servers and AI.

During the training phase of AI development, server manufacturers have adopted a range of methods to speed up the training process. By building heterogeneous computing environments, AI servers have become efficient distribution systems for AI training tasks. Additionally, to cope with the shortage of computing hardware, AI server manufacturers have drawn on their extensive experience with large server clusters to build platforms for mixed training of models on GPUs from major players such as Nvidia, AMD, Huawei, and Intel.

As more server companies gain insights into AI, their roles have evolved from merely selling hardware to enhancing their value within the AI industry chain.

In the architecture of AI computing centers, several server manufacturers have adapted their hardware infrastructure based on the specific needs of AI, focusing on achieving a deep integration of domestic computing chips.

In tandem with the hardware advances, many server vendors have begun to explore AI's productive capabilities in the infrastructure space. With the introduction of AI models and agents, collaboration between server providers and AI application clients is becoming increasingly robust, which in turn generates more revenue from software solutions.

Undoubtedly, the transformation sparked by AI is reshaping the industry's logic around computing power supply. AI server providers are working hard to deliver denser, more efficient computing power, positioning themselves as the vital “water sellers” of this era of computing power thirst.

As for the AI industry, those who produce the supporting tools—often referred to as “the shovels”—are reaping significant rewards.

The acceleration of investments by major AI corporations has ushered in a profitable era for AI server vendors.

Insights from IT JiZi show that, as of September 1, the majority of publicly listed AI companies continue to operate at a loss. The 15 profitable AI companies posted an aggregate net profit of 2.78 billion yuan, while 19 loss-making companies reported a combined deficit of 6.24 billion yuan.

The struggle for profitability can be attributed, in part, to the massive investments being poured into AI by industry giants who are still in the expansion phase.

Statistics reveal that, in the first half of the year, three key Chinese AI players—often referred to as the “BAT” (Baidu, Alibaba, Tencent)—collectively spent a whopping 50 billion yuan on AI infrastructure, more than doubling their spending from 23 billion yuan during the same period last year.


Globally, Amazon’s capital expenditure rose 18% in the last quarter, signaling a renewed phase of capital expansion. Other major players, including Microsoft, Google, and Meta, have reached a consensus to intensify their investments in AI.

Sundar Pichai, CEO of Google’s parent company Alphabet, takes an aggressive view of AI investment, stating that the risk of under-investing in AI far exceeds that of over-investing, which he does not regard as a bubble.

In this landscape of heightened investment, AI server players are capitalizing enormously. Global server manufacturers such as HP and Dell are enjoying a renaissance: HP reported 35.1% year-on-year growth in its server business, while Dell’s server and networking revenue surged an impressive 80% over the same period.

Similarly, domestic players such as Lenovo have posted impressive results, with the company's infrastructure solutions group crossing the $3 billion mark in quarterly revenue for the first time and attributing a remarkable 65% growth to rising AI demand.

Inspur’s mid-year report highlighted a net profit attributable to shareholders of 597 million yuan, up a staggering 90.56% from the previous year, while Digital China reported net profit of 510 million yuan, up 17.5%, with its AI server revenue skyrocketing 273.3% to 560 million yuan.

Growth exceeding 50% is a direct result of the large-scale deployment of AI servers.

Beyond cloud companies, telecom operators have emerged as major consumers of AI servers. Since 2023, AI computing power investment by telecom companies has risen markedly, with demand from China Telecom and China Mobile more than doubling.

At the same time, demand for intelligent computing centers is rapidly accelerating the deployment of AI servers. According to Yin Yang, China region head for Habana, the AI chip company owned by Intel, roughly 50 government-led intelligent computing centers have been established over the past three years, with another 60 projects in planning or under construction.

The heightened demand for AI servers has fundamentally altered the growth structure of the server industry, as illustrated in a recent TrendForce report.

The surge in procurement, especially among large cloud service providers (CSPs), is projected to push the value of the AI server market to an estimated $187 billion in 2024, a growth rate of 69%. In stark contrast, projected annual growth for standard servers is a mere 1.9%.

Looking ahead, as CSPs complete the build-out of new computing centers, AI server demand is expected to keep climbing alongside a growing requirement for edge computing. The sales mix for AI servers will inevitably shift from large-scale purchases by CSPs toward smaller, enterprise-level purchases aimed at edge computing.

This transformation suggests that AI server manufacturers will see a significant increase in both bargaining power and profitability as procurement methods evolve.

Moving forward, AI server vendors are projected to continue earning more from AI. This trajectory, however, runs up against the return-on-investment cycle, which tends to be lengthy for clients deploying AI servers.

Regarding the computing rental model, industry insiders have long run the numbers: once the supporting equipment for intelligent computing centers, such as storage and networking devices, is factored in, the payback period for an Nvidia H100-based system can stretch to five years, while even the cost-effective Nvidia RTX 4090 takes over two years to pay back.
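The payback logic behind these estimates is simple division of upfront cost by net rental income. The sketch below uses purely hypothetical figures (the article does not disclose the insiders' actual cost and rental assumptions); only the formula is the point.

```python
# Illustrative payback-period calculation for the computing rental model.
# All figures below are hypothetical placeholders, not the article's data.

def payback_years(capex, monthly_rental_revenue, monthly_opex):
    """Years until cumulative net rental income covers the upfront investment."""
    monthly_net = monthly_rental_revenue - monthly_opex
    if monthly_net <= 0:
        raise ValueError("rental income never covers operating costs")
    return capex / monthly_net / 12

# Hypothetical example: one GPU server plus its share of storage/networking.
capex = 2_400_000                 # upfront cost, arbitrary currency units
monthly_rental_revenue = 60_000   # what renters pay per month
monthly_opex = 20_000             # power, cooling, maintenance

print(round(payback_years(capex, monthly_rental_revenue, monthly_opex), 1))
# With these placeholder numbers the investment takes 5.0 years to recoup.
```

Small changes to utilization or rental pricing swing the result by years, which is why the article frames client-side utilization as the industry's core battleground.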

This reality places considerable emphasis on how effectively clients can utilize AI servers, thus becoming the core area for competition across the server industry.

To navigate the complexities of large model deployment, encompassing elements like distributed parallel computing, power scheduling, storage allocation, and large-scale networking, AI server vendors are employing a range of innovative strategies.

Executives like Feng Lianglei from New H3C Group have summarized the complexities faced in the practical application of AI servers.

These challenges fall into two categories: optimizing computing power and ensuring usability at scale. Sales representatives from various companies note that typical client demands revolve around hardware specifications, AI training support capabilities, and large-scale clustering functionality.

The optimization of computing power is closely tied to the heterogeneous computing challenges AI servers face. Current industry solutions largely fall into two broad categories: optimizing power distribution and managing collaboration between heterogeneous chips.

Modern AI servers rely heavily on collaborative operations between CPUs and specialized computing hardware (such as GPUs, NPUs, and TPUs). The mainstream model for resource allocation within the industry tasks CPUs with delegating computational tasks to dedicated hardware components.

This model mirrors the fundamental principles of NVIDIA's CUDA, whereby the larger the pool of computing hardware “driven” by the CPU, the greater the overall computing power.
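The host-driven offload model described above can be sketched in plain Python: a CPU "host" only partitions work and gathers results, while a pool of accelerator "devices" (simulated here with threads; the function names are illustrative, not any vendor's API) does the actual computation. The answer is identical regardless of pool size; only throughput changes as more devices are driven.

```python
# Minimal sketch of the CPU-drives-accelerators model: the host splits a
# workload into shards and dispatches each shard to a "device" worker.

from concurrent.futures import ThreadPoolExecutor

def device_kernel(shard):
    """Stand-in for work executed on one accelerator (e.g. a GPU kernel)."""
    return sum(x * x for x in shard)

def host_offload(data, num_devices):
    """The CPU only partitions and gathers; the devices do the math."""
    shards = [data[i::num_devices] for i in range(num_devices)]
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        partials = pool.map(device_kernel, shards)
    return sum(partials)

data = list(range(1000))
# Correctness is independent of how many devices the host drives.
assert host_offload(data, 2) == host_offload(data, 8) == sum(x * x for x in data)
```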

As AI servers have evolved, the industry has adapted by developing hardware that allows computing power components to be stacked.

The physical dimensions of AI servers are expanding, shifting from the traditional 1U size of generic servers to more common configurations like 4U and 7U.

Server manufacturers have introduced various solutions to further optimize the allocation of computing power. For instance, New H3C’s AoFei computing platform supports fine-grained partitioning of compute in 1% increments and memory in MB increments, enabling allocation on demand. Similarly, Lenovo's Wanquan Heterogeneous Intelligence Computing Platform takes a knowledge-base approach, automatically recognizing AI scenarios, algorithms, and resource clusters. Clients can input specific contexts and datasets, prompting the system to automatically select the best algorithm and optimal cluster configuration.
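The fine-grained slicing idea (1% compute units, MB memory units) amounts to simple resource bookkeeping per device. The toy model below illustrates the mechanism only; it is not New H3C's actual API, and the class and figures are invented for illustration.

```python
# Toy model of fine-grained resource slicing: compute is allocated in
# 1% units and memory in 1 MB units, so multiple tenants share one device.

class SlicedDevice:
    def __init__(self, mem_mb):
        self.free_compute_pct = 100   # compute budget, in 1% steps
        self.free_mem_mb = mem_mb     # memory budget, in 1 MB steps

    def allocate(self, compute_pct, mem_mb):
        """Reserve a slice; refuse if either resource is exhausted."""
        if compute_pct > self.free_compute_pct or mem_mb > self.free_mem_mb:
            return False
        self.free_compute_pct -= compute_pct
        self.free_mem_mb -= mem_mb
        return True

gpu = SlicedDevice(mem_mb=81920)      # e.g. an 80 GB accelerator
assert gpu.allocate(25, 20480)        # tenant A: 25% compute, 20 GB
assert gpu.allocate(70, 40960)        # tenant B: 70% compute, 40 GB
assert not gpu.allocate(10, 1024)     # only 5% compute left -> refused
```

The refusal on the third request shows why per-percent accounting matters: without it, a small tenant could silently oversubscribe the device.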

On the chip collaboration front, the focus is on synchronizing operations between servers built on different hardware.

Given the ongoing supply-demand mismatch surrounding NVIDIA GPUs, numerous intelligent computing centers have opted to adopt mixed training models using GPUs from various manufacturers, including AMD, Huawei, and Intel, resulting in challenges relating to communication efficiency, interoperability, and scheduling.

Training AI across interconnected server clusters brings its own complications, as a technical expert from Guanghuan Intelligence underscores. The challenge lies in how tasks are broken down across the various computing devices; inefficiency or malfunction in a single GPU can significantly slow the training cycle of the entire system.
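The expert's point is the classic straggler effect: in synchronous training, every worker must finish its shard before gradients are exchanged, so the cluster's step time is the maximum of the per-GPU times, not the average. A two-line sketch makes this concrete (timings are invented for illustration):

```python
# Why one slow GPU gates the whole cluster: a synchronous training step
# ends only when the slowest worker finishes, so step time = max, not mean.

def sync_step_time(per_gpu_times):
    return max(per_gpu_times)

healthy = [1.0] * 16                  # 16 GPUs, 1.0 s per step each
degraded = [1.0] * 15 + [3.0]         # one straggler running at 3.0 s

assert sync_step_time(healthy) == 1.0
assert sync_step_time(degraded) == 3.0   # one bad GPU triples step time
```

A single degraded card out of sixteen triples the whole cluster's step time, which is why fault detection and isolation dominate cluster-scale engineering.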

To address these issues, the prevailing strategy focuses on employing cloud management systems that comprehensively dissect AI training and neural network tasks.

For example, New H3C's solution revolves around a heterogeneous resource management platform that integrates a unified communication library to accommodate GPUs from various manufacturers, effectively standardizing operations across hardware differences.

Baidu’s Baibei Heterogeneous Computing Platform’s approach involves consolidating various chip types into a singular large cluster that supports comprehensive task training.

Despite the similarities in approach, the shared objective is to enable seamless operation regardless of which vendor's components sit underneath.
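The "unified communication library" idea can be sketched as a single collective-operation interface that dispatches to whichever vendor backend a node actually has. The backend names below are hypothetical placeholders standing in for vendor-specific libraries; this is an architectural illustration, not New H3C's or Baidu's implementation.

```python
# Sketch of a unified communication layer: training code calls one
# all_reduce API, and a per-node backend hides the vendor differences.

class Backend:
    name = "generic"
    def all_reduce(self, values):
        return sum(values)

class NvidiaLikeBackend(Backend):
    name = "nvidia"        # would wrap an NVIDIA collective library

class AscendLikeBackend(Backend):
    name = "ascend"        # would wrap a Huawei collective library

class UnifiedComms:
    """Callers see one API regardless of the chip underneath."""
    def __init__(self, backend):
        self.backend = backend
    def all_reduce(self, values):
        return self.backend.all_reduce(values)

# Mixed cluster: identical training code runs on every node.
grads = [0.1, 0.2, 0.3, 0.4]
for backend in (NvidiaLikeBackend(), AscendLikeBackend()):
    comms = UnifiedComms(backend)
    assert abs(comms.all_reduce(grads) - 1.0) < 1e-9
```

The design choice is the same one the article describes: pushing vendor differences below a stable interface so mixed-chip clusters can run one training job.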

Once heterogeneous computing issues are adequately resolved, new flexibility emerges in hardware selection for intelligent computing centers. Cooperation among server, chip, and AI infrastructure manufacturers also yields synergies, playing a crucial role in keeping expansive computing power clusters stable.

Reflecting on Meta's experiences with its computing clusters, it is clear that training AI models can be fraught with complications.

Reports indicate that during a synchronized training run on 16,000 H100 GPUs, Meta encountered 466 operational anomalies over a span of 54 days. Rapid recovery protocols become essential for resolving such incidents, prompting a trend toward building protective measures into training.

For instance, Lenovo's approach uses AI models to predict potential faults in AI training setups, taking backups preemptively before issues occur. Other players, such as Super Fusion and Huawei Ascend, have adopted a clear, straightforward strategy: upon detecting a node failure, automatically isolate the faulty node and resume training from the most recent checkpoint.
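The checkpoint-and-resume strategy can be sketched as a training loop that periodically persists state, so that after a failure the job restarts from the latest checkpoint rather than from step zero. This is a generic illustration of the pattern, not any vendor's actual recovery code.

```python
# Sketch of checkpoint-based recovery: save state every `ckpt_every`
# steps; after a node failure, resume from the last saved checkpoint.

def train(total_steps, ckpt_every, fail_at=None, checkpoints=None):
    checkpoints = checkpoints if checkpoints is not None else []
    start = checkpoints[-1] if checkpoints else 0   # resume point
    for step in range(start + 1, total_steps + 1):
        if step == fail_at:
            raise RuntimeError(f"node failure at step {step}")
        if step % ckpt_every == 0:
            checkpoints.append(step)    # persist model/optimizer state
    return checkpoints

ckpts = []
try:
    train(1000, ckpt_every=100, fail_at=550, checkpoints=ckpts)
except RuntimeError:
    pass                                # faulty node isolated here
assert ckpts[-1] == 500                 # latest checkpoint survived
train(1000, ckpt_every=100, checkpoints=ckpts)   # resume, not restart
assert ckpts[-1] == 1000                # only steps 501-1000 were redone
```

The trade-off is checkpoint frequency: checkpointing more often shrinks the work lost per failure but adds storage and synchronization overhead during healthy steps.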

In a broader context, as AI server manufacturers deepen their understanding of AI, optimize computing power, and enhance stability, they simultaneously elevate their own value propositions within the marketplace.

Leveraging AI's transformative potential, players in the AI server sector are now able to revitalize the traditional ToB market by providing renewed value propositions that resonate in the age of intelligent computing.

The question of whether AI truly adds value for server vendors merits exploration through the lens of historical context.

Throughout the evolution of the server industry, manufacturers have often felt confined within the mid-range of the "smile curve."

After the Third Industrial Revolution, as the server market expanded, numerous companies emerged in the domain. The rise of the Wintel alliance in the PC era produced major players like Dell and HP, while growing demand for digital resources fostered the emergence of OEMs like Inspur and Foxconn during the cloud computing era.

Yet despite annual revenues in the hundreds of billions, server manufacturers' profit margins have often remained disappointingly low. Under the JDM (Joint Design Manufacturing) model popularized by Inspur, even extreme manufacturing efficiency yielded net profit margins languishing between one and two percent.

The struggle faced by these manufacturers has been attributed not merely to production issues but to their inability to control the industry's core technologies and patents.

Standardized production processes left them with limited uniqueness.

Enter the AI era, where the value server manufacturers provide is being redefined against the backdrop of innovative AI applications. The capacity for vertical integration in AI has become the focal point of competition among server vendors.

On the hardware side, several manufacturers have delved into the construction of intelligent computing centers to drive their efforts even further.

For instance, New H3C, Inspur, Super Fusion, and Lenovo are all introducing liquid-cooled server cabinets aimed at improving power usage effectiveness (PUE). New H3C, in particular, has not only launched co-packaged optics (CPO) silicon photonics switches to reduce overall energy consumption but has also applied AI optimizations across its product line.

In parallel, companies such as Digital China and Lenovo are actively promoting the adoption of domestic computing chips, helping accelerate the evolution of the Chinese chip industry.

On the software front, server vendors are consistently exploring the productivity attributes of AI, steering their operations toward broader horizons beyond mere hardware sales.

The emergence of AI-powered platforms stands as a testament to this trend.

Digital China’s integrated Q&A platform incorporates modules for model performance management, enterprise knowledge bases, and AI application engineering. By leveraging its homegrown AI platform, Digital China embeds agent capabilities into server operations, ensuring that usage is continuously refined and optimized over time.

According to Digital China’s Vice President Li Gang, the purpose of this technology is to cultivate a platform that embeds previously validated agent frameworks within enterprises while continuously evolving and refining new agent concepts, encapsulating the essence of Digital China’s AI application engineering platform.

New H3C, meanwhile, has harnessed the strengths of its networking product line, employing AI-generated content (AIGC) technology for anomaly detection, trend forecasting, failure diagnostics, and smart adjustments in the telecommunications realm.

Additionally, New H3C has introduced a generalizable large AI model capable of driving industry-specific models, penetrating different customer business segments and extending the functionalities of its original hardware-focused services.

Through relentless technological innovation and iterative product refinement, the industry is seeking to capture new opportunities within the AI surge, unlocking fresh growth dynamics for foundational AI infrastructure.

In conclusion, as Lenovo Group's Vice President and general manager of China’s Infrastructure Business Unit Chen Zhenkuan aptly notes, server manufacturers are currently reaping benefits from the rising profitability associated with their increasingly verticalized integration in the AI landscape.

By venturing beyond mere manufacturing, these server companies are poised to embrace their own AI-driven golden age.
