doubleAI Claims 3.6x Speedup Over NVIDIA’s GPU Code

By Yohai Schweiger

Israeli startup doubleAI, founded by CEO Prof. Amnon Shashua and CTO Prof. Shai Shalev-Shwartz, announced that its AI system, WarpSpeed, has successfully rewritten and re-optimized CUDA kernels in NVIDIA’s cuGraph library — part of the RAPIDS software ecosystem for GPU-accelerated data science — achieving an average 3.6× speed improvement over versions refined by NVIDIA’s CUDA engineers over the past decade.

According to the company, every tested kernel showed some degree of improvement, with more than half delivering over 2× speedups. The optimized code has been published on GitHub, allowing users to deploy the accelerated version without modifying existing application code. The announcement was accompanied by a public post from Shashua on X and a detailed technical blog post.
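For readers who want to see how headline figures such as an "average speedup" and a "share of kernels above 2×" are derived from per-kernel timings, here is a minimal sketch. The kernel names and runtimes are invented for illustration and are not doubleAI's measurements. The announcement, as reported here, does not say which kind of average was used, so the sketch computes both the arithmetic and the geometric mean.

```python
from statistics import mean, geometric_mean

# Hypothetical per-kernel runtimes in milliseconds: (baseline, optimized).
# These numbers are invented for illustration only.
timings_ms = {
    "kernel_a": (12.0, 10.0),
    "kernel_b": (8.0, 2.0),
    "kernel_c": (30.0, 5.0),
    "kernel_d": (9.0, 6.0),
}


def speedups(timings):
    """Return the baseline/optimized runtime ratio for each kernel."""
    return {name: base / opt for name, (base, opt) in timings.items()}


def summarize(timings):
    """Summarize per-kernel speedups the way a headline figure might."""
    ratios = list(speedups(timings).values())
    return {
        "arithmetic_mean": mean(ratios),
        "geometric_mean": geometric_mean(ratios),
        "share_over_2x": sum(r > 2.0 for r in ratios) / len(ratios),
        "all_improved": all(r > 1.0 for r in ratios),
    }


summary = summarize(timings_ms)
print(summary)
```

The geometric mean is never larger than the arithmetic mean and is less sensitive to a few very large ratios, so the choice of average matters when comparing speedup claims.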

cuGraph is a core component of NVIDIA’s RAPIDS suite and is widely regarded as one of the leading GPU libraries for graph analytics — a critical domain for network analysis, recommendation engines, cybersecurity, bioinformatics, and financial systems. Its kernels were developed over years by engineers specializing in hardware-level performance optimization, where decisions about memory layout, thread scheduling, warp structure, and cache behavior can dramatically affect results.

Unlike conventional application development, GPU performance engineering operates in a deeply contextual decision space with no single “correct” solution — only delicate trade-offs between physical and computational constraints.

Were LLM “Gold Medals” Misleading?

Beyond the engineering achievement itself, Shashua frames the milestone as part of a broader debate over the limits of modern AI — particularly whether large language models, scaled through massive training, can truly tackle deep, complex problems where data is limited, validation is difficult, and reasoning chains are long and context-dependent.

In his post, Shashua notes that AI systems have recently “won gold medals at the IMO” and “outperformed top programmers on CodeForces,” but argues that these victories rely on unusually favorable conditions. He describes what he calls “three hidden crutches: abundant training data, trivial verification, and short reasoning chains.”

“When all three are present,” he writes, “today’s AI excels. Remove even one — and it collapses.”

GPU performance engineering, he argues, is a stress test where none of those conditions hold. “Data is scarce. Correctness is hard to verify. And performance emerges from a long chain of interdependent decisions — memory layout, warp behavior, caching, scheduling, graph structure.” In such environments, there is no synthetic benchmark with a clear answer, but rather a vast and tightly coupled search space where each design choice influences many others.

Shashua further claims that even advanced coding agents struggle in this domain. “Even sophisticated agents like Claude Code, Codex, and Gemini CLI fail dramatically here,” he writes, “often producing incorrect implementations even when provided with cuGraph’s full test suite.” According to him, “scaling alone cannot break this barrier,” and new algorithmic ideas were required to address this level of complexity.

AEI Instead of AGI

Founded in late 2023, doubleAI has raised hundreds of millions of dollars at a valuation reportedly approaching $1 billion. The company focuses on building AI systems tailored to solving particularly complex engineering and scientific problems, where — it claims — expert-level or superhuman performance can be achieved through deep algorithmic search rather than brute-force scaling of language models.

doubleAI positions the current achievement as part of a broader vision it calls Artificial Expert Intelligence (AEI): systems that consistently outperform human experts in narrow but critical domains where expertise is scarce and expensive. Rather than pursuing generalized AGI, the company concentrates on solving deep optimization problems, combining learning from limited data, probabilistic validation methodologies, and agentic search structures that navigate complex decision spaces.

The approach resembles an advanced algorithmic search system more than a conventional one-shot language model — and, if the performance gains hold up under community scrutiny, may signal a shift in how AI tackles some of computing’s most demanding low-level challenges.

NVIDIA invests in Coherent and Lumentum

By Yohai Schweiger

Chip giant NVIDIA announced this week a strategic $4 billion investment in two U.S.-based photonics companies — Coherent Corp. and Lumentum Holdings — in a move designed to support the accelerated expansion of its connectivity division and ensure the availability of critical components for next-generation AI switches and advanced optical interconnect solutions.

NVIDIA will invest $2 billion in each company through direct equity stakes, alongside multi-year supply agreements that include significant purchasing commitments and preferential access to future manufacturing capacity. The structure combines capital investment with long-term supply chain guarantees, deepening collaboration on next-generation optical connectivity components. Beyond the capital infusion, the partnerships are designed to secure steady access to critical optical and laser components for AI data centers, while expanding advanced manufacturing capabilities in the United States and increasing domestic industrial output.

Critical Links in the Optical Value Chain

Coherent, headquartered in Pennsylvania, is one of the long-standing players in the global laser and photonic materials industry. The company develops and manufactures III-V laser sources, silicon photonics components, advanced optical engines and high-precision chip packaging technologies. In data centers, lasers are far from a peripheral component: they are the heart of the optical system. Integrated into optical transceiver modules installed in network interface cards and in InfiniBand and Ethernet switches, they generate the light that enables high-speed data transmission between servers, racks and large-scale GPU clusters at speeds reaching 800 gigabits per second today and advancing toward 1.6 and 3.2 terabits per second.

Coherent operates advanced manufacturing facilities in the U.S., and their expansion under the agreement aligns with broader American efforts to strengthen domestic production of critical infrastructure technologies. The company supplies laser and optical components to networking equipment manufacturers, semiconductor firms, hyperscale cloud providers, and to industrial, defense and aerospace customers.

Lumentum, which became an independent company in 2015 following its spin-off from JDSU, one of the world’s largest optical communications component suppliers in the 1990s and 2000s, focuses on optical communication modules and laser components for data centers and high-speed networks. It develops high-speed transceivers, VCSEL lasers and modulation components integrated into network cards and data center switches, enabling optical connectivity between servers, racks and entire computing clusters.

If Coherent provides the light source and foundational photonic layer, Lumentum delivers the modules that transform that light into a functioning data network. As AI infrastructure scales and bottlenecks shift from compute to bandwidth, the optical layer becomes just as critical as the GPU itself.

The Fastest-Growing Division

In recent years, NVIDIA has introduced silicon photonics solutions in its InfiniBand and Ethernet switches and has been advancing co-packaged optics technologies. Yet even when optical integration occurs inside the switch, laser sources, raw materials and large-scale manufacturing of optical components are handled by specialized suppliers. The new investments secure preferred capacity access, influence over the technology roadmap and deeper control of the supply chain, a critical advantage as AI data centers expand rapidly.

The strategic context of the move lies in NVIDIA’s connectivity division, built on the 2019 acquisition of Mellanox and headquartered in Israel. This division now oversees InfiniBand, Spectrum Ethernet, BlueField DPUs and advanced fabric systems that connect tens of thousands of GPUs into a single computing cluster. In its latest fiscal 2026 results, the networking segment reached approximately $31 billion in annual revenue, roughly a tenfold increase in scale since the Mellanox acquisition and a dramatic expansion that has turned it into one of NVIDIA’s primary growth engines.

As data centers grow from 10,000 GPUs to 100,000 and beyond, demand for bandwidth, low latency and energy efficiency surges. Copper links are no longer sufficient, and the transition to advanced optics becomes unavoidable. The investments in Coherent and Lumentum allow NVIDIA to avoid manufacturing bottlenecks as AI infrastructure demand accelerates, while enabling faster transitions to next-generation optical connectivity. At the same time, deeper control over the optical layer strengthens NVIDIA’s competitive differentiation in networking and supports high margins in full-stack AI systems.

From an Israeli perspective, the move further reinforces NVIDIA’s connectivity arm. Future generations of InfiniBand and Ethernet switches developed in Israel are expected to rely increasingly on integrated photonics solutions. Expanded optical manufacturing capacity and closer collaboration with laser and module suppliers directly support the division’s ability to sustain its growth trajectory, meet rising demand and deliver increasingly advanced products.

Whoever controls the GPU, the memory, the switch — and the optical layer connecting them — effectively controls the architecture of next-generation data centers. NVIDIA’s investment in photonics is another step in that direction, reinforcing a growth wave that is also being driven from Israel.

NVIDIA’s Driving Model Poses a Challenge to Mobileye

By Yohai Schweiger

While NVIDIA’s Rubin platform for next-generation AI infrastructure captured most of the attention at CES 2026 in Las Vegas last week, the company quietly unveiled another move with potentially far-reaching strategic implications for the automotive industry: the launch of Alpamayo, an open foundation model for autonomous driving designed to serve as the planning and decision-making layer in future driving systems.

The announcement is expected to influence not only how autonomous driving systems are developed, but also the balance of power among technology suppliers in the automotive value chain — with particular implications for Israeli auto-tech companies.

Most Israeli players, including sensor makers Innoviz and Arbe, as well as simulation and validation specialists Cognata and Foretellix, do not provide full vehicle systems but rather core components within the broader stack. For them, NVIDIA’s move could prove supportive. By contrast, the availability of an open and flexible planning model that allows automakers to assemble software-hardware stacks around a unified computing platform poses a strategic challenge to Mobileye, which has built its market position around a vertically integrated, end-to-end solution and full system responsibility.

NVIDIA DRIVE: An AI-First Ecosystem for Automotive

Alpamayo now joins the broader set of solutions NVIDIA groups under its NVIDIA DRIVE platform — a comprehensive ecosystem for developing intelligent vehicle systems. DRIVE includes dedicated automotive processors such as Orin and Thor, an automotive operating system, sensor data processing and fusion tools, simulation platforms based on Omniverse and DRIVE Sim, and cloud infrastructure for training and managing AI models. In other words, it is a full-stack platform designed to support automakers from development and validation through real-time deployment on the vehicle itself.

This aligns with NVIDIA’s broader push toward an AI-first vehicle stack — shifting away from systems built primarily around hand-crafted rules and task-specific algorithms toward architectures where large AI models become central components, even in layers traditionally handled by “classical” algorithms, such as decision-making.

In this context, Alpamayo plays a strategic role. For the first time, NVIDIA is offering its own foundation model for planning and decision-making, effectively re-centering the DRIVE platform around an end-to-end AI-driven architecture — from cloud training to execution on the in-vehicle computer.

The Vehicle’s Tactical Brain

Alpamayo is a large multimodal Vision-Language-Action (VLA) model that ingests data from multiple video cameras, LiDAR and radar sensors, as well as vehicle state information, and converts it into an internal representation that enables reasoning and action planning. Based on this, the model generates a future driving trajectory several seconds ahead. It does not directly control actuators such as steering or braking, but it determines the vehicle’s tactical behavior.

Unlike general-purpose language models, Alpamayo operates in a physical environment and combines perception with spatial and contextual reasoning. Its inputs include video sequences, motion data, and in some cases maps and navigation goals. The model performs scene understanding, risk assessment, and path planning as part of a single decision chain. Its primary output is a continuous trajectory passed to the vehicle’s classical control layer, which handles physical actuation and safety constraints.

Training such a model relies on a combination of real-world data and massive amounts of synthetic data generated using NVIDIA’s simulation platforms, Omniverse and DRIVE Sim.

The model is released as open source, including weights and training code, allowing automakers and Tier-1 suppliers to retrain it on their own data, adapt it to their system architectures, and integrate it into existing stacks, not as a closed product but as a foundation for internal development. NVIDIA has also announced partnerships with industry players including Lucid Motors, Jaguar Land Rover (JLR) and Uber, as well as research collaborations such as Berkeley DeepDrive, to explore advanced autonomous driving technologies using Alpamayo.

Mobileye: A Challenge to the Full-Stack Model

An autonomous driving stack typically consists of several layers: sensors, perception, planning and decision-making, and control. Alpamayo sits squarely in the planning layer. It does not replace perception, nor does it replace safety-critical control systems — but it does replace, or at least challenge, the traditional algorithmic decision-making layer.

This enables a more modular system design: perception from one supplier, planning from NVIDIA’s model, and control from another Tier-1. This represents a conceptual shift away from closed, end-to-end “black box” solutions.
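The modular design described above can be sketched as three narrow interfaces with swappable implementations. This is an illustrative toy under stated assumptions, not NVIDIA's or any supplier's API: every class, field and parameter name here is hypothetical, and the "planner" is a trivial rule-based stand-in for a learned model.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Obstacle:
    x_m: float  # distance ahead of the vehicle, metres
    y_m: float  # lateral offset from lane centre, metres


@dataclass(frozen=True)
class Waypoint:
    t_s: float  # time from now, seconds
    x_m: float
    y_m: float


class Perception(Protocol):
    def detect(self, sensor_frame: dict) -> Sequence[Obstacle]: ...


class Planner(Protocol):
    def plan(self, obstacles: Sequence[Obstacle], speed_mps: float) -> list[Waypoint]: ...


class Controller(Protocol):
    def track(self, trajectory: Sequence[Waypoint]) -> dict: ...


class ToyPerception:
    def detect(self, sensor_frame):
        return [Obstacle(x, y) for x, y in sensor_frame.get("objects", [])]


class ToyPlanner:
    """Keeps the lane and stops short of the nearest in-lane obstacle."""

    def __init__(self, horizon_s=3.0, step_s=1.0, lane_half_width_m=1.5, margin_m=5.0):
        self.horizon_s, self.step_s = horizon_s, step_s
        self.lane_half_width_m, self.margin_m = lane_half_width_m, margin_m

    def plan(self, obstacles, speed_mps):
        in_lane = [o.x_m for o in obstacles
                   if o.x_m > 0 and abs(o.y_m) <= self.lane_half_width_m]
        limit = max(min(in_lane) - self.margin_m, 0.0) if in_lane else float("inf")
        steps = int(self.horizon_s / self.step_s)
        return [Waypoint(i * self.step_s, min(speed_mps * i * self.step_s, limit), 0.0)
                for i in range(1, steps + 1)]


class ToyController:
    def track(self, trajectory):
        # Brake when the planned path stops advancing between the last two waypoints.
        stalled = trajectory[-1].x_m <= trajectory[-2].x_m
        return {"brake": stalled, "target_x_m": trajectory[-1].x_m}


def drive_step(perception: Perception, planner: Planner, controller: Controller,
               sensor_frame: dict, speed_mps: float) -> dict:
    """One tick of the stack: perception, then planning, then control."""
    obstacles = perception.detect(sensor_frame)
    trajectory = planner.plan(obstacles, speed_mps)
    return controller.track(trajectory)


frame = {"objects": [(25.0, 0.2), (10.0, 4.0)]}  # second object is outside the lane
print(drive_step(ToyPerception(), ToyPlanner(), ToyController(), frame, 10.0))
```

Because `drive_step` depends only on the three interfaces, any one layer can be replaced by another vendor's component without touching the other two, which is the point of the modular approach.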

That is where the tension with Mobileye emerges. For years, Mobileye has offered a nearly complete stack: sensors, perception, mapping, planning, and proprietary EyeQ chips running the entire system with high energy efficiency. This model fits well with ADAS and L2+ systems, and even with more advanced autonomous configurations.

However, foundation models for planning shift the balance. They require more flexible and powerful compute than dedicated ADAS chips typically provide, pushing architectures toward GPU-based computing.

While in some scenarios Mobileye perception components can be integrated into broader stacks, most of the company’s advanced autonomy solutions are offered as tightly integrated system units, which in practice limits the ability to swap out individual layers. Moreover, the very presence of an open planning model weakens the value proposition of proprietary planning software. Instead of developing or licensing dedicated planning algorithms, automakers can adapt an existing foundation model to their own data and operational requirements.

This is not an immediate threat to Mobileye’s core business, but over the longer term — as the market moves toward L3 and L4 autonomy and the decision layer becomes increasingly AI-driven — it represents a genuine strategic challenge to the closed, end-to-end model.

That said, Mobileye retains a significant structural advantage: it delivers a complete system and assumes full responsibility for safety and regulatory compliance. For many automakers, especially those without deep in-house AI and software capabilities, this is critical. They prefer a single supplier accountable for system performance rather than assembling and maintaining a complex “puzzle” of components from multiple vendors, with fragmented liability and higher regulatory risk.

Innoviz and Arbe: Sensors Gain Strategic Importance

For Israeli sensor suppliers such as Innoviz and Arbe, NVIDIA’s move could be distinctly positive. Advanced planning models benefit from rich, reliable, multi-sensor input. LiDAR provides precise three-dimensional geometry and depth, while advanced radar excels at detecting objects in poor lighting and adverse weather conditions.

This sensor data is essential for planning layers and decision-making models operating in dynamic physical environments. As a result, both companies are positioning themselves as part of NVIDIA’s ecosystem rather than alternatives to it. Both have demonstrated integration of their sensing and perception pipelines with NVIDIA’s DRIVE AGX Orin computing platform.

In a stack where decision-making becomes more computationally intensive and AI-driven, the value of high-quality sensing only increases. No matter how advanced the model, limited input inevitably leads to limited decisions.

Cognata and Foretellix: Who Verifies AI Safety?

Another layer gaining importance is simulation, verification and validation — where Israeli firms Cognata and Foretellix operate.

Cognata focuses on building synthetic worlds and complex driving scenarios for training and testing, while Foretellix provides verification and validation tools that measure scenario coverage, detect behavioral gaps, and generate quantitative safety metrics for regulators and safety engineers.
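To make "scenario coverage" concrete, here is a minimal sketch of one common way to measure it: enumerate every combination of scenario parameters and report the fraction that the executed test runs actually exercised. The dimensions and values are hypothetical, and this is not Foretellix's or Cognata's actual tooling or API.

```python
from itertools import product

# Hypothetical scenario dimensions; a real operational design domain has far more.
DIMENSIONS = {
    "weather": ("clear", "rain", "fog"),
    "lighting": ("day", "night"),
    "actor": ("pedestrian", "cyclist", "vehicle"),
}


def all_bins(dimensions):
    """Every combination of parameter values, keyed in sorted dimension order."""
    names = sorted(dimensions)
    return set(product(*(dimensions[n] for n in names)))


def coverage(runs, dimensions):
    """Return (fraction of bins exercised, sorted list of bins never exercised)."""
    names = sorted(dimensions)
    target = all_bins(dimensions)
    hit = {tuple(run[n] for n in names) for run in runs} & target
    return len(hit) / len(target), sorted(target - hit)


runs = [
    {"weather": "clear", "lighting": "day", "actor": "vehicle"},
    {"weather": "clear", "lighting": "day", "actor": "vehicle"},  # repeat adds no coverage
    {"weather": "rain", "lighting": "night", "actor": "pedestrian"},
]

ratio, missing = coverage(runs, DIMENSIONS)
print(f"coverage: {ratio:.1%}, untested combinations: {len(missing)}")
```

Repeating the same scenario many times raises the run count but not the coverage figure, which is why such metrics are treated as complementary to accumulated mileage.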

As AI models become central to driving stacks, accumulating road miles is no longer enough, and the need for scenario-based safety validation grows.

Both companies are aligned with NVIDIA’s simulation-centric development approach. Cognata integrates with DRIVE simulation and Hardware-in-the-Loop environments (where real vehicle hardware is connected to virtual scenarios) for large-scale testing, while Foretellix connects its validation tools to Omniverse and DRIVE to assess AI-based driving systems under diverse physical conditions.

Open Source, Semi-Closed Platform

Although Alpamayo is released as open source, it is deeply optimized for NVIDIA’s hardware platforms. Optimization for CUDA, TensorRT, and low-precision compute enables real-time execution on DRIVE computers, which are architecturally closer to GPUs than to traditional ADAS chips.

This fits into NVIDIA’s broader open-model strategy: the company releases open models for robotics, climate science, healthcare and automotive — but after deep optimization for its own computing platforms. The approach enables broad ecosystem adoption while preserving a performance advantage for those building on NVIDIA hardware.

In practice, this allows NVIDIA to expand AI into physical industries while shaping the computing infrastructure those industries will rely on.

A Threat to One Model, an Opportunity for Others

NVIDIA’s driving model does not herald an immediate transformation on public roads, but it does signal a deeper shift in how the automotive industry approaches autonomy: fewer hand-crafted rules, more general AI models, more in-vehicle compute, and heavier reliance on simulation and validation.

For much of the Israeli auto-tech sector — sensor providers, simulation vendors and validation specialists — this trajectory aligns well with existing products and strategies, and could accelerate adoption and partnerships within the DRIVE ecosystem. For Mobileye, by contrast, it signals the emergence of an alternative path to building the “driving brain” — one that does not necessarily rely on a closed, vertically integrated stack.

If autonomous driving once appeared destined to be dominated by a small number of players controlling entire systems, NVIDIA’s move points toward a more modular future — with different layers supplied by different vendors around a central AI platform. At least in the Israeli auto-tech landscape, many players appear well positioned for that scenario.

Lightricks Goes Open Source with LTX-2, Taking on Big Tech in AI Video

Photo above: Lightricks CEO and co-founder Dr. Zeev Farbman. Credit: Riki Rahman. Photo illustration

Lightricks announced at CES the full open-source release of its generative video-and-audio model, LTX-2, including model weights and training code. The move is unusual in a market where advanced video models are largely controlled by closed cloud platforms. Announced in partnership with NVIDIA, the launch positions Lightricks as an open alternative to approaches led by companies such as OpenAI and Google, and signals a potential shift in how generative video technology is deployed and adopted.

LTX-2 can generate synchronized video and audio at up to 4K resolution, with clip lengths of up to 20 seconds and high frame rates. The model is optimized to run locally on RTX-powered workstations as well as on enterprise DGX systems, and is positioned as production-ready rather than a research demo. Unlike closed platforms such as Sora or Veo, Lightricks allows developers and organizations not only to use the model, but also to retrain, customize and integrate it directly into products and internal workflows.

Full open-source availability

While open video models already exist, most suffer from significant limitations, including lack of audio, lower visual quality or poor suitability for commercial use. LTX-2 is the first to combine full open-source availability with capabilities designed for real-world production, positioning it as a bridge between open research and the operational needs of the media and creative industries.

Lightricks is an Israeli company best known for its popular creative and editing apps, including photo and video tools used by millions of users worldwide. In recent years, the company has been expanding beyond consumer applications into the development of AI models and creative infrastructure aimed at professional creators and enterprise customers.

Behind the decision to open-source the model lies a clear business strategy. Lightricks is giving up exclusive control over the core technology in order to establish it as a standard platform others can build on. Rather than monetizing usage of the model itself, the company is positioning LTX-2 as the foundation for commercial tools, platforms and paid services developed on top of it. The approach mirrors familiar open-source business models in which economic value is created around the code rather than within it.

NVIDIA is not involved in developing the model itself, but plays a central role in positioning LTX-2 as a natural workload for RTX hardware and DGX systems. The partnership reflects a broader vision in which advanced generative video can and should run outside the cloud, on local workstations and within enterprise environments.

The release of LTX-2 reflects a broader shift in the generative video market, from closed models optimized for demonstrations and limited cloud-based access, toward open infrastructure designed for deep adoption and large-scale product development. Rather than focusing on producing the most eye-catching demo, Lightricks is aiming to provide the foundation on which the next generation of video creation tools will be built.

Rubin Pushes the GPU Off Its Pedestal

Above: The full Rubin platform. Source: Nvidia

At CES 2026 in Las Vegas, Nvidia unveiled Rubin, a platform it describes as “the next generation of AI infrastructure.” Rubin’s existence, and the fact that it is scheduled to reach the market in the second half of 2026, were already known. What this announcement revealed for the first time, however, was the idea behind it: not another generation of GPUs, but a deep conceptual shift in how large-scale AI systems are built and operated. Instead of a massive GPU at the center supported by surrounding components, Nvidia presented a complete architecture that functions as a single system, tightly integrating compute, memory, networking and security.

The recurring message is that Rubin is not a chip but a full rack-scale computing system, designed for a world in which AI is no longer a one-off chatbot but a constellation of agents operating over time, maintaining context, sharing memory and reasoning within a changing environment. In that sense, Rubin marks Nvidia’s transition from selling raw compute power to selling what is effectively a cognitive infrastructure.

Codesign as a principle, not a slogan

Nvidia has used the term “full stack” for years, but in practice it usually meant a collection of components built around the GPU. With Rubin, the concept of codesign takes on a very different meaning. This is not about tighter integration of existing parts, but about designing every element of the system—CPU, GPU, networking, interconnect, storage and security—together from the outset, as a single unit built to serve entirely new types of workloads.

The practical implication of this approach is that the GPU is no longer the architectural starting point. It remains a powerful and central component, but it is no longer the system’s unquestioned master. Rubin is designed around the assumption that the next AI bottleneck is not raw compute, but context management, persistent memory and orchestration across processes and agents. These are not problems solved by a faster GPU alone, but by redistributing responsibilities across the system.

In Rubin, architectural decisions are driven not by what the GPU needs, but by what the system as a whole must accomplish. This is a turning point for Nvidia, as it effectively moves away from the GPU-first mindset that has defined the company since the early CUDA era, replacing it with a system-level view in which compute is only one layer of a broader architecture.

The role of the CPU, and what it means for the x86 world

One of the most intriguing components in Rubin is the new Vera CPU. Unlike traditional data center CPUs, whose main role has been to host and schedule GPU workloads, Vera is designed from the ground up as an integral part of the inference and reasoning pipeline. It is not a passive host, but an active processor responsible for coordinating agents, managing multi-stage workflows and executing logic that is poorly suited to GPUs.

In doing so, Nvidia signals a profound shift in how it views the CPU in the AI era. Where the CPU was once largely a bottleneck on the path to the GPU, it now reemerges as a meaningful compute element—one that operates in symbiosis with the GPU rather than beneath it. The choice of an Arm-based architecture, and the fact that the CPU was designed alongside the GPU and networking rather than as a standalone component, point to Nvidia’s ambition to control the orchestration and control layer, not just the compute layer.

More broadly, the decision to use Arm reflects the need for flexibility and deep control over CPU design. Unlike off-the-shelf general-purpose processors built to handle a wide variety of workloads, an Arm-based design allows Nvidia to tailor a processor precisely to the needs of modern AI systems, stripping away logic that is irrelevant to inference and agent orchestration. The implication is that the classic data center model, built around general-purpose x86 CPUs as the default foundation, is no longer a given for systems designed as AI-first from the ground up.

Memory, storage and the birth of a context layer

Perhaps the most significant architectural shift in Rubin lies in how inference context memory is handled. Nvidia introduced a new approach to managing the context memory of large models, particularly the KV cache generated during multi-step inference. In classical architectures, designed for short and isolated workloads, this memory had to reside in GPU HBM to maintain performance, making it expensive, scarce and ill-suited for long-running, multi-agent systems.

Rubin breaks this assumption by moving a substantial portion of context memory out of the GPU and into a dedicated layer that behaves like memory rather than traditional storage. This is also where the role of BlueField-4—the DPU derived from Mellanox networking technology—changes fundamentally. It no longer serves merely as an infrastructure offload engine, but becomes an active participant in managing context memory and coordinating access to it as part of the inference pipeline itself.

This shift reflects the gap between architectures built for training or one-off inference, and the needs of agent-based systems that operate continuously, preserve state and share context across components. In Rubin, memory and context management become integral to the inference performance path, not an external I/O layer—an adjustment that aligns closely with how modern AI systems are expected to function.
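The idea of moving context memory out of scarce GPU memory into a separate tier can be illustrated with a toy two-tier cache. This is a conceptual sketch only, assuming a simple least-recently-used policy; it does not describe how Rubin or BlueField-4 actually manage KV caches, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class TieredContextCache:
    """Toy two-tier store: a small 'fast' tier backed by a larger 'context' tier."""

    def __init__(self, fast_capacity):
        if fast_capacity < 1:
            raise ValueError("fast_capacity must be at least 1")
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stand-in for GPU HBM, kept in LRU order
        self.context = {}          # stand-in for the external context tier

    def put(self, session_id, kv_block):
        """Store a session's KV block in the fast tier, spilling the oldest if full."""
        self.context.pop(session_id, None)
        self.fast[session_id] = kv_block
        self.fast.move_to_end(session_id)
        self._evict()

    def get(self, session_id):
        """Fetch a KV block, promoting it back to the fast tier if it was spilled."""
        if session_id in self.fast:
            self.fast.move_to_end(session_id)
            return self.fast[session_id]
        kv_block = self.context.pop(session_id)  # raises KeyError if unknown
        self.put(session_id, kv_block)
        return kv_block

    def _evict(self):
        # Spill least-recently-used sessions instead of discarding their state.
        while len(self.fast) > self.fast_capacity:
            victim, kv_block = self.fast.popitem(last=False)
            self.context[victim] = kv_block


cache = TieredContextCache(fast_capacity=2)
for agent in ("a", "b", "c"):
    cache.put(agent, f"kv-{agent}")
print(list(cache.fast), list(cache.context))
```

The key property is that an idle agent's context is preserved in the second tier rather than recomputed, so a long-running session can resume without occupying the fast tier while it waits.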

Connectivity also takes on a new role in Rubin. NVLink continues to serve as the high-speed internal interconnect between GPUs, but the Ethernet layer—embodied by Spectrum-6 and Spectrum-X—assumes a very different function than in traditional data centers. Instead of merely moving data between servers, the network becomes part of how the system manages compute and memory.

In this architecture, connectivity allows GPUs, CPUs and DPUs to access shared context memory, exchange state and operate as if they were part of a single continuous system, even when distributed across multiple servers or racks. Technologies such as RDMA enable direct memory access over the network without CPU involvement, turning the network into an active participant in the inference flow rather than a passive transport layer.

As a result, data movement, context management and inter-component coordination no longer happen “around” computation—they become part of computation itself. This is a prerequisite for distributed AI systems and long-running agents, where memory and state are as critical as raw compute.

This brings us back to the central theme of Nvidia’s announcement: the shift from training as the center of gravity to continuous, multi-agent inference. Rubin is designed primarily for a world in which most AI costs and business value reside in deployment, not training. In such a world, what matters is not only how fast you can compute, but how effectively you can remember, share and respond.

Rubin is, ultimately, Nvidia’s attempt to redefine the rules of AI infrastructure. It is no longer a race for TFLOPS alone, but a competition over who controls the entire architecture. If the strategy succeeds, Nvidia will not merely be an accelerator vendor, but a provider of full cognitive infrastructure.

NVIDIA Acquires SchedMD, Deepening Its Grip on the AI Infrastructure Scheduling Layer

[Pictured: NVIDIA founder and CEO Jensen Huang]

NVIDIA has announced the acquisition of SchedMD, the company behind Slurm, the world’s most widely used workload manager for high-performance computing (HPC) and AI. While the financial terms were not disclosed, the move marks another step in NVIDIA’s broader strategy to extend its control beyond acceleration hardware and into the critical software layers that govern how the most valuable compute resources in AI are actually used.

SchedMD is a U.S.-based company founded in 2010 by the original developers of Slurm, though the technology itself dates back even further. Slurm was first developed in the early 2000s at Lawrence Livermore National Laboratory, as an open alternative to proprietary schedulers for large-scale computing clusters. Since then, it has become the de facto standard: today, Slurm runs on roughly half of the world’s top supercomputers listed in the TOP500, and is used by universities, research institutes, defense organizations, pharmaceutical companies, financial institutions—and increasingly by enterprises operating in-house AI infrastructure.

At its core, Slurm is the engine that decides who gets compute resources, when, and how. It manages queues, allocates CPUs, memory, and GPUs, and ensures workloads are executed fairly and efficiently across clusters that may span thousands of servers. In the AI era—where model training consumes massive amounts of GPU capacity—Slurm has become a mission-critical component of the workflow. Without intelligent scheduling, a significant portion of these extremely expensive resources would simply go to waste.
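To make the idea of "who gets compute resources, when" concrete, here is a deliberately simplified sketch in Python. It is not Slurm's algorithm or API: the `Job` class, the `schedule` function, and the greedy first-fit policy are all hypothetical, chosen only to show how a queue of GPU requests might be mapped onto a fixed pool of GPUs over time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    gpus: int   # number of GPUs the job requests
    steps: int  # run time in abstract time steps


def schedule(jobs, total_gpus):
    """Toy greedy first-fit scheduler (illustrative only, not Slurm's logic).

    At every time step, finished jobs release their GPUs, then queued jobs
    are scanned in submission order and each one that fits is started.
    Returns a dict mapping job name to the step at which it started.
    """
    for job in jobs:
        if job.gpus > total_gpus:
            raise ValueError(f"{job.name} requests more GPUs than the cluster has")

    pending = deque(jobs)
    running = []          # (end_step, gpus) for each running job
    free = total_gpus
    start = {}
    t = 0
    while pending or running:
        # Release GPUs held by jobs that have finished by step t.
        still_running = []
        for end, gpus in running:
            if end <= t:
                free += gpus
            else:
                still_running.append((end, gpus))
        running = still_running

        # Start every queued job that fits, preserving queue order.
        waiting = deque()
        for job in pending:
            if job.gpus <= free:
                free -= job.gpus
                start[job.name] = t
                running.append((t + job.steps, job.gpus))
            else:
                waiting.append(job)
        pending = waiting
        t += 1
    return start


# Hypothetical 8-GPU cluster with three queued jobs.
queue = [Job("train", gpus=4, steps=3), Job("big", gpus=8, steps=2), Job("eval", gpus=2, steps=1)]
plan = schedule(queue, total_gpus=8)
# "train" and "eval" start at step 0; "big" waits until step 3 for all 8 GPUs.
```

Even this toy version shows the trade-off a real scheduler has to manage: letting the small "eval" job run ahead keeps GPUs busy, but a steady stream of small jobs could delay a large job indefinitely. Production schedulers layer priorities, fairness rules, and reservations on top to avoid that.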

Slurm’s primary users are not application developers, but infrastructure teams—the operators of data centers and compute clusters. AI developers typically encounter Slurm only indirectly, when submitting jobs, without visibility into the allocation logic running behind the scenes. In public cloud environments, similar scheduling mechanisms usually exist as internal systems, largely opaque to customers.

It is also important to distinguish Slurm from platforms such as Run:AI, which NVIDIA acquired earlier. While Slurm operates as the foundational scheduler of a cluster—a low-level infrastructure layer that is aware of physical resources—Run:AI sits above Kubernetes as an intelligent optimization layer, with awareness of teams, projects, experiments, and business priorities. Put simply: Slurm allocates the “iron,” while Run:AI allocates it in an organizational and business context. Together, they form a continuous stack—from hardware all the way up to enterprise-level AI workload management.

This is where the strategic significance of the acquisition becomes clear. Although Slurm is open source, control over the organization that leads its development gives NVIDIA substantial influence over the project’s direction, development velocity, and hardware optimization priorities. Slurm is already well tuned for NVIDIA GPUs, but the acquisition paves the way for even tighter integration with CUDA, NVLink, InfiniBand, and capabilities such as MIG (Multi-Instance GPU), which allows a single GPU to be partitioned for parallel workloads. The result is higher GPU utilization—ultimately translating into greater demand for NVIDIA hardware.

More broadly, NVIDIA continues to assemble end-to-end vertical control over the AI infrastructure stack: processors, networking, software libraries, workload scheduling, and enterprise management. While the SchedMD acquisition may appear modest compared to some of NVIDIA’s blockbuster deals, it targets one of the most critical choke points in the AI world: who controls compute time. In a domain where every minute of GPU usage carries significant economic value, that level of control is nothing short of strategic.

NVIDIA Unveils an Open and Transparent Autonomous Driving Model

At this week’s NeurIPS conference, NVIDIA launched DRIVE Alpamayo-R1, a new autonomous-driving model described as the first industry-scale VLA (Vision-Language-Action) system to be released as open source. VLA refers to a model architecture that integrates visual perception, scene understanding, causal reasoning, and action planning into a single continuous framework.

The announcement marks a significant shift for the company. While NVIDIA has spent recent years building its AV efforts around dedicated hardware platforms such as DRIVE Orin and DRIVE Thor, it had never before opened a core driving module to the broader research community. For the autonomous-driving world — where closed, proprietary decision-making systems dominate — this is a notable milestone.

A Unified Model With Causal Reasoning at Its Core

Alpamayo-R1 (AR1) is an end-to-end autonomous-driving model that simultaneously performs computer vision, scene comprehension, causal reasoning, and trajectory planning. Unlike traditional AV architectures that separate perception, prediction, and planning, AR1 uses a unified VLA structure that stitches the layers together into a single, continuous decision pipeline.

At the heart of the model lies causal reasoning — the ability to break down complex driving scenarios, evaluate multiple “thought paths,” and select a final trajectory based on interpretable internal logic.

According to NVIDIA, AR1 was trained on a blend of real-world data, simulation, and open datasets, including a newly introduced Chain-of-Causation dataset in which every action is annotated with a structured explanation for why it was taken. In the post-training phase, researchers used reinforcement learning, yielding a measurable improvement in reasoning quality compared with the pretrained model.

The model will be released for non-commercial use on GitHub and Hugging Face. NVIDIA will also publish companion tools, including the AlpaSim testing framework and an accompanying open dataset for AV research.

Open vs. Closed Models

Today’s autonomous-driving systems largely fall into two categories. Tesla uses an end-to-end Vision → Control approach, in which a single model processes camera input and outputs steering and braking commands. Tesla’s model is not open, does not provide explicit reasoning, and is not structured around a clear division between reasoning and action.

Mobileye, by contrast, maintains a more “classic” perception-prediction-planning stack built on semantic maps, deterministic algorithms, and safety rules. But Mobileye’s models are also closed systems that offer no external visibility into their decision-making logic.

This is where AR1 stands apart: it provides explicit, interpretable reasoning traces explaining why a particular trajectory was chosen — something rarely seen in AV systems, and never before at industrial scale.

The significance of making such a model open extends far beyond academia. Commercial AV stacks are black boxes, which makes regulatory evaluation, cross-model comparison, and stress-testing in rare scenarios difficult. By opening a reasoning-based driving model, NVIDIA enables transparent, reproducible experimentation — much like what Llama and Mistral have done for language models.

A Shift Toward a New Paradigm

AR1 signals a broader shift: autonomous driving is evolving toward a domain where general-purpose intelligence models play a central role, replacing rigid, hand-engineered pipelines. While there is no evidence yet that a unified VLA model can replace the entire AV stack, this is the clearest move to date toward what could be called a “physics of behavior”: an effort to understand not only what the car sees, but why it should act in a certain way.

The announcement also aligns with NVIDIA’s hardware strategy. As models become larger, more compute-intensive, and increasingly reliant on high-fidelity simulation, the case for using NVIDIA’s platforms only strengthens.

Alpamayo-R1 is not a full autonomous-driving system, but it is the first time that the cognitive heart of such a system — its decision-making logic — is being opened to researchers, OEMs, and startups. In a field long defined by closed-door development, that alone is a meaningful breakthrough.