What this article covers

Microsoft recently detailed the "AI superfactory," a planet-scale architecture linking next-gen Fairwater datacenters into one fungible fleet. Below, you'll get: (1) a clear architecture walkthrough, (2) component-level insights (accelerators, networks, power/cooling), and (3) guidance on how you, as an architect or AI practitioner, can design workloads that actually land on this capacity via Azure AI services.

1. From Foxconn shell to AI superfactory

The story starts in Mount Pleasant, Wisconsin, on a 315-acre site originally built for a different mega-project. Microsoft is repurposing that campus into what it calls the world's most powerful AI datacenter, the anchor of its first AI superfactory (The Official Microsoft Blog).

A few key numbers from Microsoft and local reporting:

  • Roughly 1.2 million square feet of datacenter space across Fairwater buildings.
  • Massive quantities of steel and concrete, dozens of miles of pipe, and enough optical fiber to circle the Earth several times.
  • A power envelope measured in hundreds of megawatts, with expansion planned via additional facilities in Wisconsin by 2028.

Now add Atlanta. Fairwater 2, a new two-story AI datacenter outside the city, is being built and directly connected to Fairwater Wisconsin over a high-speed backbone. Public reporting describes the pair as a "massive supercomputer" running on hundreds of thousands of NVIDIA GPUs (WABE).

The Azure AI superfactory isn't one building; it's a system of datacenters treated as a single continuum of compute.

2. Fairwater physical architecture: racks, cooling, and power

2.1 Two-story, ultra-dense halls

Fairwater uses a two-story design with liquid-cooled racks stacked in three dimensions. Instead of traditional long aisles, Microsoft can place racks above and below each other, minimizing cable length and improving latency and effective bandwidth between GPUs.

"We've tried to 10x the training capacity every 18 to 24 months. So this would be effectively be a 10x increase from what GPT-5 was trained with. So to put in perspective the number of network optics in this building is almost as much as all of Azure across all our datacenters two and half years ago." — Scott Guthrie, Microsof's EVP of Cloud and AI.

Key design ideas:

  • 3D rack layout: shorter fiber and copper runs, lower latency across pods (a rough latency estimate follows this list).
  • Standardized "GPU blocks": repeated rack-scale building blocks (like NVIDIA GB200 NVL72-class systems) wired into a leaf-spine fabric.
  • Tight integration with power and cooling risers: so power distribution and cold plate plumbing stay as short and efficient as possible.
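
To make the cabling point concrete, here's a rough, illustrative estimate of propagation delay as a function of run length; the ~5 ns/m figure for light in fiber is a standard approximation, not a Fairwater spec:

```python
# Back-of-envelope: propagation delay vs. cable run length.
# Light in optical fiber travels at roughly 2/3 the speed of light in vacuum,
# i.e. about 5 ns per meter. Figures are illustrative, not Fairwater measurements.

NS_PER_METER_FIBER = 5.0  # ~1 / (0.67 * 3e8 m/s), expressed in ns/m

def one_way_delay_ns(run_length_m: float) -> float:
    """Propagation-only delay; ignores switch hops and serialization time."""
    return run_length_m * NS_PER_METER_FIBER

for run_m in (5, 30, 100):  # e.g. same row, adjacent floor, across a long hall
    print(f"{run_m:>4} m run -> ~{one_way_delay_ns(run_m):5.0f} ns one-way")
```

The fiber itself is only part of the story (switch hops and serialization usually dominate), but shaving tens of meters off every GPU-to-GPU path adds up across millions of concurrent flows.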

2.2 Liquid cooling and closed-loop efficiency

These GPU racks are liquid-cooled end-to-end. According to Microsoft and local news, roughly 90% of the cooling system is closed-loop: water circulates continuously, rejecting heat through heat exchangers rather than evaporative towers.

Why that matters:

  • Higher rack density: tens of kilowatts per rack is typical for conventional datacenters; AI racks can push into the 80–120 kW range.
  • Better PUE (Power Usage Effectiveness): less power wasted on fans and chillers, more electrons feeding GPUs (a quick sketch follows this list).
  • Lower water use: even at this scale, weekly consumption is comparable to that of a single restaurant or golf course.
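
For intuition on the PUE point, here's a minimal sketch of how efficiency translates into power that actually reaches accelerators. The facility envelope and PUE values are illustrative assumptions, not published Fairwater numbers:

```python
# PUE = total facility power / IT (compute) power.
# Lower PUE means less energy spent on cooling, fans, and power conversion.
# All values are illustrative assumptions, not published Fairwater figures.

def it_power_available_mw(facility_power_mw: float, pue: float) -> float:
    """Power left for IT equipment under a given facility envelope and PUE."""
    return facility_power_mw / pue

facility_mw = 300.0  # hypothetical "hundreds of megawatts" envelope
for pue in (1.5, 1.2, 1.1):
    it_mw = it_power_available_mw(facility_mw, pue)
    print(f"PUE {pue:.1f}: ~{it_mw:5.1f} MW of {facility_mw:.0f} MW reaches the GPUs")
```

In this toy example, moving from a PUE of 1.5 to 1.1 frees up roughly 70 MW for accelerators; that is the kind of margin closed-loop liquid cooling is chasing.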

2.3 Power delivery and resilience

The Wisconsin site pulls power from multiple high-voltage feeds and backs it with redundant transformers, UPS, and generators sized for hyperscale. Microsoft pairs this with large renewable investments and power purchase agreements to match consumption with carbon-free electricity.

Engineering trade-offs:

  • Shorter AC paths from substation to rack to reduce losses.
  • DC busbars in some segments to simplify conversion to GPU DC rails.
  • Zonal fault containment so a failure in one electrical "pod" doesn't ripple across the whole supercluster.

3. GPU fabric: from GB200 racks to a fungible fleet

3.1 Rack-scale GPU systems

Fairwater is being built around NVIDIA's latest accelerators, including GB200-class systems. A typical rack-scale design like NVL72 packs 72 tightly coupled GPUs with high-bandwidth NVLink interconnect, treated almost as a single giant GPU for large model training.

Multiply that:

  • Thousands of these rack systems per campus.
  • Hundreds of thousands of GPUs combined into a single logical fleet per Fairwater site.
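
A quick back-of-envelope calculation shows how 72-GPU racks compound into a six-figure fleet. The rack and pod counts below are illustrative assumptions, not Microsoft-disclosed numbers:

```python
# How 72-GPU racks compound into a six-figure fleet.
# Rack and pod counts are illustrative assumptions, not Microsoft-disclosed numbers.

GPUS_PER_RACK = 72      # NVL72-class rack, per NVIDIA's published design
RACKS_PER_POD = 16      # hypothetical pod size
PODS_PER_SITE = 200     # hypothetical pod count at one Fairwater campus

gpus_per_pod = GPUS_PER_RACK * RACKS_PER_POD
gpus_per_site = gpus_per_pod * PODS_PER_SITE

print(f"GPUs per pod:  {gpus_per_pod:,}")    # 1,152
print(f"GPUs per site: {gpus_per_site:,}")   # 230,400 -> "hundreds of thousands"
```

With NVL72-style racks as the building block, a campus only needs a few thousand racks to reach the "hundreds of thousands of GPUs" figure quoted above.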

Within a rack:

  • GPUs talk over NVLink/NVSwitch, with bandwidth on the order of multi-TB/s.
  • Each GPU node has local HBM and access to shared NVMe over PCIe or NVMe-oF.

Across racks:

  • A multi-tier fabric (often a fat-tree design) built on NVIDIA Spectrum-class Ethernet or InfiniBand switches connects racks into one coherent low-latency cluster.

3.2 AI WAN: inter-datacenter supercluster

Microsoft then stitches Fairwater Wisconsin, Fairwater Atlanta, and prior generations of Azure AI supercomputers into a single elastic system using a continent-spanning AI WAN.

Conceptually:

  1. Inside each campus: leaf-spine GPU fabric provides microsecond-level latency.
  2. Between buildings: datacenter interconnects (DCI) with 400G/800G coherent optics.
  3. Between regions (WI ↔ GA ↔ others): AI WAN — long-haul fiber with dedicated wavelengths for AI traffic.

The AI WAN is designed as a fungible pool: workloads don't care where GPUs sit, only that the scheduler can satisfy their locality and latency constraints.
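
As a sanity check on what "latency constraints" mean across sites, here's a rough propagation-delay estimate between Wisconsin and Atlanta. The distance and route overhead are assumptions; actual fiber routes and latencies aren't published:

```python
# Rough inter-site propagation latency from straight-line distance.
# Distance, route factor, and the results are assumptions for intuition only;
# actual fiber routes and latencies are not published.

SPEED_IN_FIBER_KM_PER_MS = 200.0   # light in fiber ~ 200,000 km/s

def one_way_ms(distance_km: float, route_factor: float = 1.4) -> float:
    """Propagation-only delay; route_factor pads for non-straight fiber paths."""
    return distance_km * route_factor / SPEED_IN_FIBER_KM_PER_MS

wi_to_atlanta_km = 1_000  # very rough straight-line estimate
owd = one_way_ms(wi_to_atlanta_km)
print(f"~{owd:.0f} ms one-way, ~{2 * owd:.0f} ms round trip (propagation only)")
```

Single-digit-millisecond one-way delays are far too long for the tight synchronous collectives inside a rack, but workable for pipeline stages, asynchronous checkpointing, loosely synchronized data-parallel groups, or entirely separate jobs; that is exactly the kind of constraint the scheduler has to respect when it places work.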

Architecture at a glance

Goal: unify generations of AI supercomputers into a single elastic system that can land the right workload on the right silicon, at the right site, at the right time — maximizing performance per watt and per dollar. Microsoft calls this a fungible fleet (The Official Microsoft Blog).

Three layers shape the system:

  1. Maximum-density compute (within a site)
  • Two-story datacenter design places racks in three dimensions to minimize cable length (read: lower latency, higher effective bandwidth, better reliability/cost). Liquid cooling enables ~140 kW per rack and ~1,360 kW per row with a closed-loop system designed for 6+ years between water replacements; the initial fill uses roughly as much water as ~20 homes do in a year. Heat is rejected by one of the largest chiller plants on the planet.

  2. Single flat AI network (across racks/pods/clusters inside a site)

  • Each rack houses up to 72 NVIDIA Blackwell GPUs linked with NVLink (scale-up). Above that, a two-tier Ethernet-based scale-out network provides 800 Gbps GPU-to-GPU connectivity and scales beyond traditional Clos limits, using SONiC on commodity switches. Features like packet trimming/spray, high-frequency telemetry, improved congestion control, and agile load balancing sustain ultra-low latency at rack-to-fleet scale.

  3. Planet-scale AI WAN (across sites/regions/generations)

  • A dedicated optical backbone interconnects Fairwater sites (Atlanta, Wisconsin, and others coming online) and prior-generation AI supercomputers, with ~120,000 new fiber miles deployed across the U.S. This lets massive jobs burst beyond a single campus and dynamically place workloads across sites — turning multiple datacenters into a single AI superfactory.

Put together, you get racks → pods → clusters → sites → multi-site "superfactory," all presented to developers as elastic AI capacity that supports pre-training, fine-tuning, RL, synthetic data generation, evaluation, and inference — not just monolithic pre-training.

Components and why they matter

Accelerators & scale-up networking

  • GPUs: Latest NVIDIA GB200/GB300 "Blackwell" accelerators, with up to 72 GPUs per rack and NVLink for ultra-low latency intra-rack comms. Blackwell adds efficient low-precision math (e.g., FP4) and pooled memory so each GPU can tap >14 TB of memory across the rack for giant training graphs.
  • At-scale clusters: Microsoft has already stood up large GB300 NVL72 clusters for production workloads, reducing time-to-train from months to weeks.

Scale-out Ethernet fabric inside the site

  • A two-tier backend with 800 Gbps links and SONiC enables massive flat clusters without vendor lock-in, improving price/performance and operational control. Advanced telemetry and packet-handling tricks help maintain throughput under extreme load (a rough all-reduce estimate follows).
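
To put the 800 Gbps figure in context, here's a minimal sketch of how long a gradient all-reduce might take at that per-GPU line rate. The model size, link efficiency, and the ring all-reduce cost model are assumptions for illustration:

```python
# Rough time for a ring all-reduce of full-model gradients at a given per-GPU
# link rate. Model size, efficiency, and the cost model are illustrative assumptions.

def allreduce_seconds(payload_gb: float, n_gpus: int,
                      link_gbps: float = 800.0, efficiency: float = 0.7) -> float:
    # Ring all-reduce: each GPU sends/receives ~2 * (N - 1) / N * payload bytes.
    bytes_per_gpu = 2 * (n_gpus - 1) / n_gpus * payload_gb * 1e9
    usable_bytes_per_sec = link_gbps * 1e9 / 8 * efficiency
    return bytes_per_gpu / usable_bytes_per_sec

# ~140 GB of fp16 gradients for a hypothetical 70B-parameter model, 1,024 GPUs
print(f"~{allreduce_seconds(140, n_gpus=1024):.1f} s per naive full-model all-reduce")
```

In practice, gradient bucketing, overlap with compute, and hierarchical reductions over NVLink shrink the exposed cost considerably; the point is that at 800 Gbps per GPU, even a naive full-model all-reduce stays in the seconds range.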

Power & cooling engineered for AI

  • Closed-loop liquid cooling maximizes heat transfer and uptime while minimizing water use; rack/row power budgets unlock extreme density.
  • Power economics: Atlanta was sited for grid resiliency, targeting 4×9 (99.99%) availability at roughly 3×9 (99.9%) cost. Microsoft avoids traditional UPS/on-site generation for the GPU fleet by combining resilient utility power, GPU-level power caps, supplementary workloads that smooth demand, and on-site energy storage to mask oscillations.
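
For a sense of what trading nines means, here's the arithmetic behind "4×9 availability at ~3×9 cost" (the downtime conversion is standard; the cost framing is Microsoft's):

```python
# Convert availability "nines" into allowable downtime per year.

HOURS_PER_YEAR = 24 * 365

for availability, label in ((0.999, "3 nines (99.9%)"), (0.9999, "4 nines (99.99%)")):
    downtime_h = HOURS_PER_YEAR * (1 - availability)
    print(f"{label}: ~{downtime_h:.2f} h/year (~{downtime_h * 60:.0f} min)")
```

The techniques above (power caps, demand-smoothing workloads, on-site storage) are what let the design aim for the ~53 minutes/year budget without paying for a traditional UPS-plus-generator stack.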

AI WAN & fungibility across generations

  • The AI WAN links Fairwater sites and earlier supercomputers so developers scale beyond a single facility and land jobs on the most suitable hardware/network path — all exposed as one elastic platform. This is the essence of the fungible fleet concept.

Get started today.

You can use Azure's AI platforms, which abstract the underlying capacity and route workloads to available supercomputer clusters.

1) Choose your control plane

2) Secure capacity and right-size compute
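
As one concrete path, here's a minimal, hedged sketch using the Azure Machine Learning Python SDK (azure-ai-ml) to stand up an autoscaling GPU cluster. The workspace identifiers and the ND-series VM SKU are placeholders; swap in whatever your subscription actually has quota for:

```python
# Minimal sketch: provision an autoscaling GPU compute cluster with azure-ai-ml.
# Workspace identifiers and the VM SKU below are placeholders/assumptions;
# check your regional quota and available SKUs before relying on them.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

gpu_cluster = AmlCompute(
    name="gpu-train-cluster",
    size="Standard_ND96isr_H100_v5",   # example ND-series SKU; substitute per quota
    min_instances=0,                   # scale to zero when idle
    max_instances=8,                   # right-size to the job's parallelism
    idle_time_before_scale_down=300,   # seconds
)

ml_client.compute.begin_create_or_update(gpu_cluster).result()
print(f"Cluster '{gpu_cluster.name}' is ready.")
```

Scaling to zero when idle keeps you from paying for capacity you're not using, while the max-instance cap keeps a runaway job from consuming your whole quota.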

3) Engineer the data plane for saturation

  • Co-locate data with compute to avoid cross-region bottlenecks. Use Azure Blob Storage/ADLS Gen2 with multi-stream readers and sharded datasets; pre-stage hot shards. (Microsoft notes heavy investment to sustain multi-GB/s per-GPU read/write patterns required by frontier-scale training.)
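
Below is a minimal sketch of the multi-stream, sharded-read pattern, assuming the data already sits in a Blob Storage container as pre-split shard files. The container name, blob prefix, and concurrency levels are illustrative, not prescriptive; tune them against measured throughput:

```python
# Minimal sketch: pull sharded training data from Azure Blob Storage with
# multiple concurrent streams. Container/prefix names and concurrency levels
# are illustrative assumptions; tune them against your actual throughput.
from concurrent.futures import ThreadPoolExecutor

from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    container_name="training-data",
    credential=DefaultAzureCredential(),
)

shard_names = [b.name for b in container.list_blobs(name_starts_with="dataset/shard-")]

def fetch_shard(name: str) -> bytes:
    # max_concurrency splits a single large blob into parallel range reads
    return container.download_blob(name, max_concurrency=4).readall()

# Keeping several shards in flight at once keeps the pipe full while GPUs consume data.
with ThreadPoolExecutor(max_workers=8) as pool:
    for shard_bytes in pool.map(fetch_shard, shard_names):
        pass  # hand off to your tokenizer / dataloader here
```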

Resources:

Final Thoughts

Azure's AI superfactory represents more than a leap in hardware capability — it reflects a deeper shift in how we think about building and scaling intelligent systems. Fairwater's design shows what happens when engineering discipline, long-term planning, and practical lessons from real-world AI workloads converge into a single architectural philosophy.

What stands out most is not just the density of GPUs or the scale of the cooling systems, but the intention behind the entire platform: make AI infrastructure feel limitless, predictable, and accessible to the people who build with it. For developers, architects, and teams shaping modern AI applications, this matters. Faster iteration means more room to explore. Lower friction means more time spent creating. Better efficiency means more opportunity to experiment without hesitation.

As we enter an era where AI interacts with every part of software development — agents, copilots, simulation systems, evaluation pipelines — the environments we build on will increasingly shape the outcomes we can achieve. Fairwater provides that foundation, but the creativity and curiosity of its users will define what's possible.

If there's one message to leave with, it's this: infrastructure is no longer the constraint — imagination is. Use these new capabilities boldly. Test ideas that once felt too big. Let your work stretch into the space that this new architecture makes available.

And most importantly, keep learning. The horizon is moving fast, but the people who stay curious tend to shape where it goes next.

-Dave R.