Our Atlas Datacenters will represent the cutting edge of AI infrastructure, powered by the groundbreaking Cerebras CS-3 systems featuring WSE-3 chips, the world's largest and fastest AI processors. Each Atlas datacenter will host a cluster of 64 CS-3 systems, with 900,000 AI-optimized cores and 44 GB of on-chip memory per chip, enabling processing capabilities that will outperform entire traditional supercomputing installations.
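To put those per-chip numbers in cluster terms, the short sketch below simply multiplies the stated WSE-3 figures across a 64-system Atlas deployment (treating each CS-3 as carrying one WSE-3 is the only assumption):

```python
# Back-of-the-envelope aggregation of per-chip WSE-3 specs across one
# Atlas datacenter (64 CS-3 systems, one WSE-3 per system assumed).
CS3_SYSTEMS = 64
CORES_PER_WSE3 = 900_000          # AI-optimized cores per chip
ONCHIP_MEMORY_GB = 44             # on-chip memory per chip (GB)

total_cores = CS3_SYSTEMS * CORES_PER_WSE3
total_onchip_memory_gb = CS3_SYSTEMS * ONCHIP_MEMORY_GB

print(f"Total AI-optimized cores: {total_cores:,}")                # 57,600,000
print(f"Total on-chip memory:     {total_onchip_memory_gb:,} GB")  # 2,816 GB
```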
These datacenters will ensure O sovereignty while being community-owned through our innovative RWA NFT program. From Atlas DC1 to DC2, we will build a decentralized network of compute power that will scale to house the most advanced AI infrastructure in the world. Unlike traditional cloud providers or tech giants' datacenters, our Atlas facilities will be owned by our community through fractional NFTs, with rewards distributed in $O tokens over a 3-year period. This unique model will ensure that our computational resources remain independent and aligned with our community's interests rather than corporate objectives.
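For illustration only, here is one way a per-datacenter $O reward pool could be streamed linearly to fractional NFT holders over the 3-year period; the pool size, number of fractions, and monthly cadence below are hypothetical placeholders rather than published tokenomics:

```python
# Hypothetical sketch of distributing an $O reward pool to fractional
# RWA NFT holders linearly over a 3-year period. Pool size, fraction count,
# and payout cadence are illustrative assumptions only.
TOTAL_REWARD_POOL_O = 1_000_000   # hypothetical $O allocated to one datacenter
TOTAL_FRACTIONS = 10_000          # hypothetical number of NFT fractions
MONTHS = 36                       # 3-year distribution period

reward_per_fraction = TOTAL_REWARD_POOL_O / TOTAL_FRACTIONS
monthly_payout_per_fraction = reward_per_fraction / MONTHS

print(f"Lifetime reward per fraction: {reward_per_fraction:.2f} $O")
print(f"Monthly payout per fraction:  {monthly_payout_per_fraction:.4f} $O")
```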
The performance capabilities will be extraordinary: we will achieve inference speeds 20 times faster than traditional cloud providers and 3 times faster than even Groq's LPU solutions. This won't just be about raw speed, though; it will be about building an unstoppable foundation for truly sovereign artificial intelligence that will serve humanity's collective interests.
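To make those multiples concrete, the sketch below converts them into implied throughput figures against a hypothetical cloud-GPU baseline; only the 20x and 3x ratios come from this section, while the baseline tokens-per-second value is an assumption:

```python
# Relative inference throughput implied by the stated speedups.
# The cloud-GPU baseline below is a hypothetical placeholder.
baseline_cloud_tps = 50.0           # hypothetical tokens/sec on a cloud GPU
atlas_vs_cloud_speedup = 20.0       # stated: 20x faster than cloud providers
atlas_vs_groq_speedup = 3.0         # stated: 3x faster than Groq's LPU

atlas_tps = baseline_cloud_tps * atlas_vs_cloud_speedup
implied_groq_tps = atlas_tps / atlas_vs_groq_speedup

print(f"Implied Atlas throughput: {atlas_tps:.0f} tokens/sec")
print(f"Implied Groq throughput:  {implied_groq_tps:.0f} tokens/sec")
```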
Cerebras will provide significant advantages over model-to-model routing systems and serverless inference layers, even though it does not yet offer model-to-model communication on the same chip (a feature expected in Q2 2025).
With 900,000 AI-optimized cores and 4 trillion transistors on a single wafer-scale chip, Cerebras will deliver unmatched computational power. This will eliminate the need for the complex orchestration across multiple GPUs or servers that routing systems and serverless layers require. Instead of managing distributed resources, Cerebras will offer a single, massively powerful chip that can handle large-scale AI workloads.
Traditional systems often require intricate routing logic and distributed programming to connect models across servers. In contrast, Cerebras will scale from 1 billion to 24 trillion parameters without code changes, making it easier to manage large AI models. While it currently lacks on-chip model-to-model communication, the simplicity of scaling within the WSE-3 architecture will offer a clear operational advantage.
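As a conceptual sketch of what "no code changes" means in practice, the example below keeps the training step identical while only the model configuration changes; this is generic PyTorch for illustration, not the Cerebras software stack:

```python
# Conceptual sketch: the same training code handles different model scales;
# only the configuration changes. Plain PyTorch, not the Cerebras SDK.
import torch
import torch.nn as nn

SMALL = dict(d_model=256, n_layers=4, vocab=32_000)    # small configuration
LARGE = dict(d_model=1024, n_layers=24, vocab=32_000)  # same code, bigger config

def build_model(cfg):
    layers = [nn.Embedding(cfg["vocab"], cfg["d_model"])]
    layers += [nn.TransformerEncoderLayer(cfg["d_model"], nhead=8, batch_first=True)
               for _ in range(cfg["n_layers"])]
    layers.append(nn.Linear(cfg["d_model"], cfg["vocab"]))
    return nn.Sequential(*layers)

def train_step(model, tokens, optimizer):
    # Identical loop regardless of model size: the scaling decision lives
    # entirely in the config passed to build_model().
    logits = model(tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

model = build_model(SMALL)                 # swap SMALL for LARGE: no code changes
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
tokens = torch.randint(0, 32_000, (2, 65)) # toy batch of token IDs
print("loss:", train_step(model, tokens, opt))
```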
Cerebras's 21 PB/s memory bandwidth far exceeds that of traditional GPUs, allowing fast, efficient processing of AI tasks. This level of bandwidth, combined with 44 GB of on-chip memory, will ensure that data flows smoothly within the chip, avoiding the latency issues that distributed systems face when moving data between nodes.
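As a worked comparison, the sketch below estimates how long one full pass over the 44 GB of on-chip memory takes at 21 PB/s versus an assumed HBM-class GPU figure (the 3.35 TB/s reference point is our assumption, not a number from this section):

```python
# Time to read the full 44 GB of on-chip memory once, at different bandwidths.
ONCHIP_MEMORY_BYTES = 44e9       # 44 GB of WSE-3 on-chip memory
WSE3_BANDWIDTH = 21e15           # 21 PB/s aggregate on-chip bandwidth
GPU_HBM_BANDWIDTH = 3.35e12      # assumed HBM-class GPU bandwidth (3.35 TB/s)

wse3_sweep_s = ONCHIP_MEMORY_BYTES / WSE3_BANDWIDTH
gpu_sweep_s = ONCHIP_MEMORY_BYTES / GPU_HBM_BANDWIDTH

print(f"WSE-3 full-memory sweep: {wse3_sweep_s * 1e6:.2f} us")   # ~2.1 us
print(f"GPU HBM sweep (44 GB):   {gpu_sweep_s * 1e3:.2f} ms")    # ~13.1 ms
print(f"Bandwidth ratio: {WSE3_BANDWIDTH / GPU_HBM_BANDWIDTH:,.0f}x")
```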
While the initial investment in Cerebras might be higher, the efficiency gained through reduced power consumption, simplified infrastructure, and lower operational complexity will lead to lower overall costs. Serverless systems, while flexible, often incur ongoing costs and resource-management challenges that Cerebras mitigates with its integrated design.
Cerebras will provide white-glove installation and continuous software upgrades, reducing the burden on in-house teams and ensuring optimal performance over time. This contrasts with the self-managed nature of model-to-model routing systems and serverless layers, which are resource-intensive to maintain and scale.
The following summarizes how the Cerebras WSE-3 compares with Groq's LPU and a standard data center GPU:

Cerebras WSE-3 (CS-3 systems):
- Architecture: Wafer-Scale Engine (WSE-3)
- Die size: 46,225 mm$^2$ (56x larger than a standard GPU)
- Cores: 900,000 AI-optimized cores
- Scalability: 1 billion to 24 trillion parameters with no code changes
- Sparsity: optimized for sparse matrix computations
- Cooling: custom cooling solutions integrated
- Installation and support: white-glove installation, expert validation testing, continuous software upgrades, and managed services
- Target workloads: large language models, multimodal models, supercomputer clusters
- Price: $1.5M per unit; $1.218M per unit with bulk purchase (64 units)

Groq LPU:
- Architecture: deterministic processor (SIMD)
- On-die memory bandwidth: up to 80 TB/s
- Interconnect: 16 integrated RealScale interconnects; scalable with chip-to-chip interconnects
- Power: max 300 W, TDP 215 W, average 185 W
- Supported precisions: INT8, INT16, INT32, FP32, FP16
- Reliability: end-to-end on-chip protection with error-correction code (ECC)
- Software: easy-to-use software suite for fast integration
- Target workloads: AI, ML, and HPC workloads with ultra-low latency

Standard data center GPU:
- Die size: 814 mm$^2$
- Cores: 16,896 FP32 + Tensor Cores
- Cooling: standard data center cooling solutions
- Setup: requires user or integrator setup
- Target workloads: AI model training, inference, HPC
- Price: $30,000 - $40,000 per GPU
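The "56x larger" figure follows directly from the die areas listed above, as the quick check below reproduces:

```python
# Die-area ratio implied by the comparison above.
WSE3_DIE_MM2 = 46_225      # Cerebras WSE-3 die area
GPU_DIE_MM2 = 814          # standard GPU die area

print(f"WSE-3 / GPU die area: {WSE3_DIE_MM2 / GPU_DIE_MM2:.1f}x")   # ~56.8x
```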
The 64-node cluster quote breaks down into the following line items:

- 64-Node Wafer-Scale Cluster: supports GPT model tasks (pre-training, fine-tuning, and inference), with one year of included support and software upgrades
- Installation: installation at facilities
- Delivery: delivery of hardware to the installation site
- Professional Services: consulting services for machine learning and/or datacenter facility readiness, contracted in 100-hour blocks (example contracting configuration shown)
- SOC2 Datacenter Hosting Ops: $5,000 x 64 nodes = $320,000
- Datacenter Fiberline Cost
- Total: $103,851,000
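The quote can be partially cross-checked from figures stated earlier in this section; the sketch below rolls up the bulk hardware price and the SOC2 hosting line, treating the remaining items (installation, delivery, professional services, fiberline) as an unspecified remainder:

```python
# Partial roll-up of the 64-node cluster quote using figures stated above.
UNITS = 64
BULK_PRICE_PER_UNIT = 1_218_000      # $1.218M per system with 64-unit bulk purchase
SOC2_HOSTING_PER_NODE = 5_000        # SOC2 datacenter hosting ops, per node
QUOTED_TOTAL = 103_851_000           # total from the quote

hardware = UNITS * BULK_PRICE_PER_UNIT          # $77,952,000
soc2_hosting = UNITS * SOC2_HOSTING_PER_NODE    # $320,000
remainder = QUOTED_TOTAL - hardware - soc2_hosting

print(f"Hardware (64 x $1.218M): ${hardware:,}")
print(f"SOC2 hosting ops:        ${soc2_hosting:,}")
print(f"Other line items:        ${remainder:,}")  # installation, delivery, services, fiberline
```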