NVIDIA’s Blackwell architecture is powerful to the point of being several times faster than its predecessor, Hopper. But it has been nearly two years since the big B took over the AI computing market, and, true to their release cadence, Team Green is ready to talk about the upcoming architecture, dubbed Rubin.
Rubin got its first public mention at COMPUTEX 2024, and the latest word is a new product announcement called Rubin CPX, designed specifically for massive-context processing, which basically means it can handle million-token workloads like software coding and generative video in ways current systems just can’t.
This chip sits alongside NVIDIA’s Vera CPUs and Rubin GPUs inside the new Vera Rubin NVL144 CPX platform, which packs up to 8 exaflops of AI compute into a single rack, along with 100TB of fast memory and a jaw-dropping 1.7 petabytes per second of memory bandwidth. To put that in perspective, that’s about 7.5x the AI performance of NVIDIA’s already heavy-hitting GB300 NVL72 systems.
Under the hood, Rubin CPX is built on NVIDIA’s Rubin architecture with a monolithic die packed with NVFP4 compute resources, promising unprecedented performance and energy efficiency: up to 30 petaflops at NVFP4 precision, 128GB of GDDR7 memory, and 3x faster attention compared to the GB300 NVL72. It’s also highly flexible, scaling out over either NVIDIA’s Quantum-X800 InfiniBand or Spectrum-XGS Ethernet networking.
Team Green expects Rubin CPX to unlock new levels of processing capability: coding assistants could gain a deeper understanding of entire codebases for smarter optimizations, while video models could run built-in decoders and encoders to handle long-context inference on a single GPU.
Leading industry players are already eyeing it, with Cursor envisioning Rubin CPX as a way to deliver lightning-fast code generation and smarter developer tools, while Runway is lining up Rubin CPX to power cinematic-quality generative video and flexible, agent-driven workflows. And Magic, an AI research outfit, is looking at Rubin CPX’s ability to handle 100-million-token context windows as a breakthrough for training autonomous software agents.
NVIDIA Rubin will also get access to the latest AI software stack, as expected: the Dynamo platform for scaling inference, Nemotron multimodal models for enterprise-ready reasoning, and NVIDIA AI Enterprise with NIM microservices for production deployments.
The catch? You’ll have to wait, since Rubin CPX won’t be available until the end of 2026. Still, based on what NVIDIA showed off, they’re clearly aiming to set the standard for the next wave of large-context AI compute, and one can hope for more Rubin news at GTC 2026 or COMPUTEX 2026.