Huawei cloud AI infrastructure breakthrough combines 15,000 chips into one system

The challenge of building efficient AI cloud infrastructure has always been about scale: not just adding more servers, but making those servers work together. At Huawei Connect 2025, the Chinese technology giant introduced an approach that changes how cloud providers and enterprises can combine computing resources.

Instead of managing thousands of independent servers that communicate through traditional networks, Huawei's technology creates what executives describe as unified systems in which physical infrastructure behaves as a single logical machine. For AI cloud providers and enterprises building private AI clouds, this represents a significant shift in how infrastructure can be architected, managed, and scaled.

The cloud infrastructure problem SuperPod solves

Traditional AI cloud infrastructure faces a persistent challenge: as clusters grow, computing efficiency actually decreases. This happens because individual servers in a cluster remain somewhat independent and communicate through network protocols that introduce latency and complexity. The result is what experts in the field call the "scaling penalty", where adding more hardware does not proportionally increase usable computing power.
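The scaling penalty can be sketched with a toy model (illustrative numbers and overhead constant are assumptions for the sketch, not Huawei data): if each added server spends a growing share of its time on network coordination, usable compute grows sub-linearly with cluster size.

```python
# Toy model of the cluster "scaling penalty" (illustrative, not Huawei data):
# in a loosely coupled cluster, each server wastes more of its time on
# network coordination as the cluster grows, so usable compute is sub-linear.

def usable_pflops(servers: int, per_server_pflops: float = 1.0,
                  comm_overhead: float = 0.00002) -> float:
    """Usable compute after communication overhead.

    Efficiency decays as 1 / (1 + comm_overhead * (servers - 1)):
    the more peers each server must coordinate with, the less of its
    time goes to actual computation.
    """
    efficiency = 1.0 / (1.0 + comm_overhead * (servers - 1))
    return servers * per_server_pflops * efficiency

small = usable_pflops(100)    # ~99.8 PFLOPS -> ~99.8% efficiency
large = usable_pflops(8192)   # ~7040 PFLOPS -> ~86% efficiency
```

Under this toy model, an 8,192-server cluster delivers only about 86% of its nominal compute; tighter interconnects like the one described below aim to push that efficiency back toward 100%.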

Yang Chaobin, a Huawei board member and CEO of its ICT business, explained that the company has developed "a pioneering SuperPod architecture based on our UnifiedBus interconnect", an architecture that deeply interconnects physical servers.

This is not just faster networking; it is a rethinking of how AI cloud infrastructure can be constructed.

Technical foundation: the UnifiedBus protocol

At the core of Huawei's cloud AI infrastructure is UnifiedBus, an interconnect protocol designed specifically for pooling resources at massive scale. The protocol addresses two challenges that have long limited AI cloud infrastructure: maintaining reliability over long distances within data centers, and optimising the bandwidth trade-offs that affect performance.

Traditional data center interconnects rely on either copper cables (high bandwidth but short reach, typically linking only two racks) or optical cables (longer reach, but with reliability concerns at scale). For cloud providers building infrastructure to support thousands of AI processors, neither option is ideal.

Eric Xu, Huawei's deputy chairman and rotating chairman, said that solving these fundamental connectivity problems was essential to the AI cloud infrastructure strategy. Drawing on what he described as Huawei's three decades of connectivity expertise, Xu detailed the breakthrough: "We have built reliability into every layer of our UnifiedBus interconnect protocol, from the physical layer and data link layer all the way up to the transmission layer."

The result is what Huawei describes as an optical interconnect 100 times more reliable than conventional approaches, supporting connections of over 200 meters within data centers while maintaining the reliability characteristics usually associated with copper.

SuperPod configurations: from enterprise to hyperscale

Huawei's SuperPod product line spans multiple scales, each designed for different deployment scenarios. The flagship Atlas 950 SuperPod contains up to 8,192 Ascend 950DT AI processors configured across 160 cabinets occupying 1,000 square meters of data center space.

The system provides 8 EFLOPS at FP8 precision and 16 EFLOPS at FP4, with 1,152 TB of total memory capacity. The interconnect specifications reveal the architecture's ambitions: 16 PB/s of bandwidth across the system.

As Xu noted, "This means that a single Atlas 950 SuperPod will have an interconnect bandwidth more than 10 times higher than the world's total peak internet bandwidth." This level of internal connectivity is what allows the system to maintain linear performance scaling: adding processors actually increases usable computing power proportionally.

For larger cloud deployments, the Atlas 960 SuperPod scales to 15,488 Ascend 960 processors across up to 220 cabinets covering 2,200 square meters, delivering 30 EFLOPS at FP8 and 60 EFLOPS at FP4, with 4,460 TB of memory and 34 PB/s of bandwidth. The Atlas 960 will be available in the fourth quarter of 2027.
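These headline figures are internally consistent. Dividing the quoted system totals by the processor counts gives the implied per-chip numbers below (derived arithmetic on the published figures, not official per-chip ratings):

```python
# Derive implied per-processor figures from the quoted SuperPod totals
# (arithmetic on the published numbers; not official per-chip specs).

atlas_950 = {"chips": 8_192, "fp8_eflops": 8, "fp4_eflops": 16, "mem_tb": 1_152}
atlas_960 = {"chips": 15_488, "fp8_eflops": 30, "fp4_eflops": 60, "mem_tb": 4_460}

for name, pod in [("Atlas 950", atlas_950), ("Atlas 960", atlas_960)]:
    fp8_per_chip_pflops = pod["fp8_eflops"] * 1_000 / pod["chips"]
    mem_per_chip_gb = pod["mem_tb"] * 1_000 / pod["chips"]
    ratio = pod["fp4_eflops"] / pod["fp8_eflops"]  # FP4 is 2x FP8 throughput
    print(f"{name}: ~{fp8_per_chip_pflops:.2f} PFLOPS FP8 per chip, "
          f"~{mem_per_chip_gb:.0f} GB memory per chip, FP4/FP8 = {ratio:.0f}x")
```

The implied jump from roughly 1 PFLOPS FP8 per Ascend 950DT to roughly 2 PFLOPS per Ascend 960 suggests most of the Atlas 960's headline gain comes from faster chips as well as a larger pod.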

Implications for cloud service delivery

Beyond the flagship SuperPod products, Huawei introduced a cloud AI infrastructure configuration designed specifically for enterprise data centers. The Atlas 850 SuperPod, positioned as "the industry's first air-cooled SuperPod server for enterprises", contains eight Ascend NPUs and supports flexible multi-cabinet deployment of up to 128 units with 1,024 NPUs.

Importantly, this configuration can be deployed in standard air-cooled equipment rooms, avoiding the infrastructure modifications that liquid cooling systems require. For cloud providers and enterprises, this represents practical deployment flexibility: organisations can adopt SuperPod architecture without necessarily undertaking a complete data center overhaul, potentially accelerating adoption timelines.

SuperCluster architecture: hyperscale cloud deployment

Huawei’s vision extends beyond individual SuperPods to what the company calls SuperClusters: massive AI cloud infrastructure deployments comprising multiple interconnected SuperPods. The Atlas 950 SuperCluster will combine 64 Atlas 950 SuperPods to create a system with more than 520,000 AI processors in more than 10,000 cabinets, delivering 524 EFLOPS at FP8 precision.
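The cluster-level numbers follow from the per-pod figures. A quick sanity check (the per-processor figure is inferred from the quoted totals, not an official rating) shows the aggregates line up:

```python
# Sanity-check the Atlas 950 SuperCluster aggregates against per-pod specs.
pods = 64
chips_per_pod = 8_192            # Atlas 950 SuperPod processor count

total_chips = pods * chips_per_pod   # 524,288 -> "more than 520,000"

# The quoted 524 EFLOPS FP8 total implies ~1 PFLOPS FP8 per processor,
# consistent with the ~8 EFLOPS quoted for a single 8,192-chip pod.
implied_pflops_per_chip = 524_000 / total_chips
```

In other words, the 524 EFLOPS figure is essentially 64 pods times the per-pod throughput, with no scaling discount: the claim of linear scaling is baked into the published specifications.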

An important technical decision affects how cloud providers can deploy these systems: the Atlas 950 SuperCluster supports both UBoE (UnifiedBus over Ethernet) and RDMA over Converged Ethernet (RoCE). UBoE allows UnifiedBus to run over standard Ethernet infrastructure, potentially letting cloud providers integrate SuperPod technology into existing data center networks.

According to Huawei's specifications, UBoE clusters show lower static latency and higher reliability than RoCE clusters, while requiring fewer switches and optical modules. For cloud providers planning large-scale deployments, this could translate into both performance and economic benefits.

The Atlas 960 SuperCluster, scheduled for availability in the fourth quarter of 2027, will integrate more than one million NPUs to deliver 2 ZFLOPS at FP8 and 4 ZFLOPS at FP4. The specifications position the system for what Xu described as future AI models "with more than 1 trillion or 10 trillion parameters".

Beyond AI: general-purpose cloud infrastructure

The implications of SuperPod architecture extend beyond AI workloads into general-purpose cloud computing through the TaiShan 950 SuperPod. Built on Kunpeng 950 processors with 192 cores and 384 threads, the system targets mission-critical enterprise workloads traditionally run on mainframes and Oracle Exadata database servers.

The TaiShan 950 SuperPod supports up to 16 nodes with 32 processors and 48 TB of memory, and includes memory pooling, SSD pooling, and DPU (data processing unit) pooling. When integrated with Huawei's distributed GaussDB database, the system delivers what the company claims is a 2.9x performance improvement over traditional architectures, without requiring application modifications.

For cloud providers serving enterprise customers, this represents a significant infrastructure opportunity. Beyond Huawei's own databases, the TaiShan 950 SuperPod improves memory utilisation by 20% in virtualised environments and accelerates Spark workloads by 30%.

An open architecture strategy

Perhaps most significant for the wider AI infrastructure market, Huawei announced that the UnifiedBus 2.0 technical specifications will be published as open standards. The company is opening access to hardware and software components, including NPU modules, air-cooled and liquid-cooled servers, AI cards, CPU boards, cascade cards, the CANN compiler, Mind series toolkits, and openPangu models, all by 31 December 2025.

Yang framed this as ecosystem development: "We are committed to our open-hardware and open-source-software approach, which will help more partners develop their own SuperPod-based solutions."

For cloud providers and system integrators, this open approach potentially lowers the barriers to adopting SuperPod infrastructure. Rather than being locked into a single vendor's solution, partners can develop customised implementations using the UnifiedBus specifications.

Market reality and deployment

The architecture has already seen real-world deployment. In 2025, more than 300 Atlas 900 A3 SuperPod units were delivered to more than 20 customers across the internet, finance, carrier, electric power, and manufacturing industries. This scale of deployment provides some validation that the architecture works beyond laboratory demonstrations.

Xu acknowledged the context shaping Huawei's infrastructure strategy: "The Chinese mainland will be lagging behind for a relatively long period of time," he said, adding that "sustainable computing power can only be achieved with process nodes that are practically available."

The statement frames SuperPod architecture as a strategic response to constraints: achieving competitive performance through architectural innovation rather than through advanced semiconductor manufacturing alone.

What this means for cloud infrastructure development

Huawei's SuperPod architecture is a specific bet on how AI cloud infrastructure should evolve: toward tighter integration and resource pooling at massive scale, enabled by purpose-built interconnect technology. Whether this approach proves more effective than alternatives, such as loosely coupled clusters with sophisticated software orchestration, remains to be demonstrated at hyperscale.

For cloud providers, the open architecture strategy introduces possibilities for building AI infrastructure without necessarily adopting the tightly integrated hardware-software approaches dominant among Western competitors. For enterprises, configurations such as the air-cooled Atlas 850 represent deployment paths that do not require complete data center overhauls.

The wider implication concerns how AI cloud infrastructure might be architected in markets where access to the most advanced semiconductor manufacturing remains limited. Huawei's approach suggests that architectural innovation in interconnects, resource pooling, and system design can potentially compensate for limitations in individual processor capabilities: a proposition that will be tested as these systems scale to production workloads across diverse cloud deployments.

(Photos taken from the video of Xu's keynote speech at Huawei Connect 2025)
