Where AI Infrastructure Is Headed: CPO and Liquid Cooling

The rapid growth of artificial intelligence—particularly large language models and generative AI—has significantly increased the computational demands placed on data centers. As GPU clusters scale to tens of thousands of accelerators, traditional data center architectures are facing bottlenecks in power consumption, heat dissipation, and network bandwidth. Two critical technologies emerging to address these challenges are Co-Packaged Optics (CPO) and liquid cooling systems. Together, they are expected to become foundational components of next-generation AI infrastructure.

1. Co-Packaged Optics (CPO): Solving the Bandwidth and Power Bottleneck

As AI training clusters expand, network bandwidth requirements are growing exponentially. Conventional pluggable optical modules face limitations in power efficiency, latency, and signal integrity at speeds of 800G and beyond.

CPO integrates optical engines directly with switching ASICs inside the same package. This architectural shift offers several advantages:

  • Higher bandwidth density: Enables scalable interconnects for large AI clusters.

  • Lower power consumption: Eliminates the long electrical traces between switch chips and optical modules (a rough comparison follows this list).

  • Improved signal integrity: Reduces electrical losses and enables higher-speed interconnect standards such as 1.6T and beyond.

  • Lower latency: Direct optical coupling improves overall network performance.
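
To put rough numbers on the power argument, the sketch below compares total optical I/O power for one switch under pluggable and co-packaged assumptions. The energy-per-bit figures (about 15 pJ/bit for a pluggable path with retimers versus about 5 pJ/bit for CPO) are illustrative assumptions, not vendor specifications.

```python
# Back-of-the-envelope optical I/O power for one switch.
# The pJ/bit figures are illustrative assumptions, not vendor data.

PLUGGABLE_PJ_PER_BIT = 15.0  # assumed: pluggable module plus long trace and retimers
CPO_PJ_PER_BIT = 5.0         # assumed: optical engine co-packaged with the switch ASIC

def interconnect_power_kw(num_ports: int, port_gbps: float, pj_per_bit: float) -> float:
    """Total optical I/O power for one switch, in kilowatts."""
    bits_per_second = num_ports * port_gbps * 1e9
    return bits_per_second * pj_per_bit * 1e-12 / 1e3

# Example: a 64-port, 800G switch generation.
for name, pj in [("pluggable", PLUGGABLE_PJ_PER_BIT), ("CPO", CPO_PJ_PER_BIT)]:
    print(f"{name:>9}: {interconnect_power_kw(64, 800.0, pj):.2f} kW per switch")
```

At roughly half a kilowatt saved per switch in this example, the difference compounds across the thousands of switches in a large AI fabric.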

Major hyperscalers and semiconductor companies are actively investing in CPO technology as AI clusters move toward ultra-large-scale distributed computing architectures. Over the next decade, CPO is expected to gradually complement—and in some scenarios replace—traditional pluggable optics in high-performance AI networking environments.

However, several challenges remain, including thermal management, manufacturing complexity, and ecosystem maturity. As packaging technologies advance, CPO adoption is expected to accelerate, particularly in hyperscale AI data centers.

2. Liquid Cooling: Addressing the Thermal Crisis of AI Data Centers

The power density of AI servers has increased dramatically with the introduction of high-performance GPUs. Modern AI accelerators can consume 700-1000 W per chip, and next-generation systems are expected to exceed 1.5 kW per accelerator. As a result, rack power density in AI data centers is rapidly approaching, and in some deployments exceeding, 100 kW, far beyond the capabilities of traditional air cooling; the quick estimate below shows how those numbers add up.
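
As a sanity check on those figures, here is a minimal sketch of a rack power budget. The rack configuration and the 30% overhead for CPUs, NICs, memory, and fans are illustrative assumptions.

```python
# Rough rack power budget; all counts and wattages are illustrative assumptions.

GPUS_PER_RACK = 72          # assumed dense rack configuration
GPU_WATTS = 1000            # top of the 700-1000 W range cited above
OVERHEAD_FRACTION = 0.3     # assumed 30% extra for CPUs, NICs, memory, fans

rack_kw = GPUS_PER_RACK * GPU_WATTS * (1 + OVERHEAD_FRACTION) / 1000
print(f"Estimated rack power: {rack_kw:.0f} kW")  # ~94 kW
```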

Liquid cooling has emerged as the most viable solution to this thermal challenge. Compared to air cooling, liquid cooling offers:

  • Significantly higher heat transfer efficiency

  • Support for ultra-high-density racks

  • Lower overall energy consumption (improved PUE; a worked example follows this list)

  • Reduced data center footprint
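
PUE (Power Usage Effectiveness) is total facility power divided by IT equipment power, so a value of 1.0 would mean zero overhead. The sketch below shows how shifting cooling energy from fans and chillers to a liquid loop moves the ratio; all overhead figures are illustrative assumptions.

```python
# PUE = total facility power / IT equipment power.
# Overhead figures below are illustrative assumptions.

def pue(it_kw: float, cooling_kw: float, other_kw: float) -> float:
    return (it_kw + cooling_kw + other_kw) / it_kw

IT_LOAD_KW = 1000.0  # compute, storage, and networking load

air = pue(IT_LOAD_KW, cooling_kw=400.0, other_kw=100.0)     # fans and chillers
liquid = pue(IT_LOAD_KW, cooling_kw=100.0, other_kw=100.0)  # pumps and heat exchangers

print(f"air-cooled PUE:    {air:.2f}")     # 1.50
print(f"liquid-cooled PUE: {liquid:.2f}")  # 1.20
```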

There are two primary liquid cooling approaches gaining traction:

Direct-to-Chip (D2C) liquid cooling

  • Cold plates remove heat directly from CPUs, GPUs, and other high-power components (a loop-sizing sketch follows after this list).

  • Currently the most widely adopted liquid cooling solution in AI data centers.

Immersion cooling

  • Servers are submerged in dielectric fluids.

  • Provides even greater thermal efficiency but faces challenges in serviceability and ecosystem standardization.
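
For a feel of how a direct-to-chip loop is sized, the required coolant flow follows from Q = ṁ · cp · ΔT. The heat load and allowed temperature rise below are assumptions chosen for illustration.

```python
# Coolant flow needed to absorb a heat load: Q = m_dot * cp * dT.
# Heat load and temperature rise are illustrative assumptions.

HEAT_LOAD_W = 100_000   # assumed ~100 kW rack served by one cooling loop
CP_WATER = 4186.0       # J/(kg*K), specific heat of water
DELTA_T_K = 10.0        # assumed coolant temperature rise across the rack

m_dot_kg_s = HEAT_LOAD_W / (CP_WATER * DELTA_T_K)
liters_per_min = m_dot_kg_s * 60  # water is ~1 kg per liter

print(f"Mass flow:   {m_dot_kg_s:.2f} kg/s")       # ~2.39 kg/s
print(f"Volume flow: {liters_per_min:.0f} L/min")  # ~143 L/min
```

A real deployment would also account for pump power, pressure drop, and coolant chemistry, but the first-order flow requirement falls directly out of this relation.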

As hyperscale AI deployments accelerate, liquid cooling is expected to transition from a niche solution to a mainstream infrastructure standard.

3. Convergence of Networking and Thermal Infrastructure

CPO and liquid cooling represent two sides of the same infrastructure transformation: the need to support massive AI compute clusters efficiently.

Their development trajectories are closely interconnected:

  • Higher network bandwidth from CPO enables larger GPU clusters.

  • Larger clusters drive higher power density.

  • Higher power density increases the need for advanced cooling technologies.

In addition, advanced packaging technologies used in CPO systems will likely require more sophisticated thermal management, further reinforcing the role of liquid cooling.

4. Long-Term Industry Outlook

Looking ahead, the evolution of AI infrastructure is expected to follow several key trends:

  1. Ultra-high-density AI clusters scaling to hundreds of thousands of accelerators.

  2. Optical networking integration, with CPO becoming a critical component of future switching architectures.

  3. Liquid cooling becoming the default cooling method for AI-focused data centers.

  4. Integrated infrastructure design, where compute, networking, and thermal systems are optimized together.

As AI continues to scale, the data center will increasingly resemble a high-performance computing system optimized for power efficiency, bandwidth, and thermal management. Technologies such as CPO and liquid cooling will play a central role in enabling this next generation of AI infrastructure.