NVIDIA Blackwell Ultra GPUs Set New AI Training Speed Record
NVIDIA has set a new AI training speed record with its Blackwell Ultra B300 GPUs, training a 1 trillion parameter model in just 12 days using a 16,384-GPU cluster.
The achievement represents a 3.5x improvement over the previous record set with Hopper H100 GPUs and demonstrates the dramatic acceleration in AI training capabilities.
The B300's key innovations include 288GB of HBM3e memory per GPU (up from 192GB on B200), NVLink 6.0 with 1.8TB/s bandwidth, and new FP4 precision support that doubles training throughput for transformer architectures without quality degradation.
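A rough back-of-envelope check puts these memory and precision figures in context. The sketch below assumes only that FP16, FP8, and FP4 store 2, 1, and 0.5 bytes per parameter respectively, and counts raw weight storage only (no optimizer state, activations, or gradients, which dominate real training memory):

```python
# Back-of-envelope: weight memory for a 1-trillion-parameter model at different precisions.
# Counts per-parameter weight storage only -- real training also needs optimizer state,
# gradients, and activations, which typically multiply this several times over.
PARAMS = 1_000_000_000_000   # 1 trillion parameters
GPU_MEMORY_GB = 288          # HBM3e per B300, per the article

for name, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    total_gb = PARAMS * bytes_per_param / 1e9
    gpus_for_weights = total_gb / GPU_MEMORY_GB
    print(f"{name}: {total_gb:,.0f} GB of weights ~= {gpus_for_weights:.1f} GPUs for weights alone")
```

At FP4 the weights of a 1T-parameter model shrink to roughly 500 GB, which is one reason halving the bit width can translate into roughly doubled throughput: less memory traffic per parameter moved.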
For AI companies, the practical implication is that frontier models that previously required months of training and tens of millions of dollars in compute can now be trained in weeks at significantly lower cost per FLOP.
NVIDIA reports that all major AI labs — OpenAI, Anthropic, Google, xAI, and Meta — have placed orders for B300-based DGX systems, with deliveries scheduled throughout 2026.
The pricing is aggressive: a DGX B300 system with 8 GPUs costs approximately $450,000, down from $500,000 for the equivalent Hopper system, while delivering 3.5x the training throughput.
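Combining the quoted prices with the quoted 3.5x throughput figure, the price-performance gain is larger than the sticker-price drop alone suggests:

```python
# Price-performance comparison using the figures quoted in the article.
# Throughput is in arbitrary units with the 8-GPU Hopper DGX system as the baseline.
hopper_price, hopper_throughput = 500_000, 1.0   # equivalent Hopper system
b300_price, b300_throughput = 450_000, 3.5       # DGX B300, 3.5x training throughput

hopper_cost_per_unit = hopper_price / hopper_throughput
b300_cost_per_unit = b300_price / b300_throughput
improvement = hopper_cost_per_unit / b300_cost_per_unit

print(f"Hopper: ${hopper_cost_per_unit:,.0f} per unit of training throughput")
print(f"B300:   ${b300_cost_per_unit:,.0f} per unit of training throughput")
print(f"Price-performance improvement: {improvement:.2f}x")
```

A 10% lower price at 3.5x the throughput works out to roughly 3.9x better cost per unit of training throughput, consistent with the article's "significantly lower cost per FLOP" claim.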
NVIDIA also announced Grace-Blackwell NVL76, a rack-scale system combining 76 B300 GPUs with 76 Grace CPUs, designed for training the largest models. The system is priced at approximately $4 million and can train models of up to 2 trillion parameters.
CEO Jensen Huang noted that the AI compute market continues to grow faster than NVIDIA's ability to supply it, with demand exceeding supply by an estimated 2-3x.
More News
NVIDIA Launches NIM Microservices for Enterprise AI Deployment
NVIDIA has launched NIM (NVIDIA Inference Microservices), a suite of containerized AI model serving packages that reduce enterprise AI deployment time from weeks to hours while delivering optimized inference performance.
AI Agents Market Reaches $15 Billion as Enterprise Adoption Surges
The global market for AI agents — autonomous AI systems that can plan, execute, and iterate on complex multi-step tasks — has reached $15 billion in annual spending, according to a new report from McKinsey. This represents a 200% increase from 2025, driven by enterprise adoption of agentic AI for customer service, software development, data analysis, and business process automation. The report identifies three tiers of AI agent adoption: basic agents that handle single-step tasks like email responses and appointment scheduling (adopted by 65% of enterprises), intermediate agents that manage multi-step workflows like report generation and data pipeline management (35% adoption), and advanced agents that autonomously execute complex processes like code deployment and financial analysis (8% adoption). The largest spending categories are customer service agents ($4.2B), coding agents ($3.8B), and data analysis agents ($2.5B). McKinsey projects the market will reach $45 billion by 2028 as agent reliability improves and enterprises become more comfortable delegating complex decisions to AI. Key enabling platforms include OpenAI's Agents SDK, Anthropic's Claude computer-use capabilities, and LangChain's agent framework. The report warns that agent governance and monitoring remain underdeveloped, with most enterprises lacking adequate oversight mechanisms for autonomous AI actions.
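The growth figures in the report imply a few numbers worth spelling out. The sketch below assumes the current $15B figure is for 2026 (so the projection to 2028 spans roughly three years of growth) and takes "200% increase" to mean triple the prior year's spend; both readings are assumptions about the report's framing:

```python
# Implied growth figures from the market numbers quoted above.
market_now = 15e9            # current annual spending, per the report
market_2025 = market_now / 3  # a 200% increase means 3x the prior year's spend
market_2028 = 45e9           # McKinsey's projection

# Compound annual growth rate needed to reach $45B, assuming the current
# figure is for 2026 and the projection spans ~3 years of growth.
years = 3
cagr = (market_2028 / market_now) ** (1 / years) - 1

print(f"Implied 2025 market size: ${market_2025 / 1e9:.0f}B")
print(f"Required CAGR to reach $45B: {cagr:.1%}")
```

Tripling in three years requires roughly 44% compound annual growth, a rate the report implicitly ties to improving agent reliability and enterprise comfort with delegation.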
Microsoft 365 Copilot Gets Custom AI Agents and Actions
Microsoft has updated 365 Copilot with custom AI agent creation, allowing organizations to build agents that automate complex workflows spanning Word, Excel, Outlook, Teams, and SharePoint without code.
GPT-5.2's Agentic Mode Transforms Enterprise Workflows
OpenAI's GPT-5.2 introduced a fundamentally new approach to agentic task completion that is already transforming enterprise workflows. The model can now maintain coherent plans across 50+ sequential tool calls with parallel execution, reducing latency in complex automation pipelines by up to 60%. Early enterprise adopters report that GPT-5.2's agentic mode handles tasks like multi-step data analysis, cross-platform content publishing, and automated code review workflows that previously required custom orchestration code. The key innovation is what OpenAI calls deliberative alignment — a training approach that lets the model dynamically allocate compute to harder sub-tasks while breezing through simpler ones. This means a single agentic session can handle both quick lookups and deep reasoning without manual configuration. Several Fortune 500 companies have reported 40-70% time savings on analyst workflows by deploying GPT-5.2 agents through the API. However, reliability remains a concern — OpenAI acknowledges a 3-5% failure rate on chains exceeding 30 steps, and enterprise deployments require human-in-the-loop checkpoints for critical decisions.
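The quoted failure rates make the compounding math behind long agentic chains concrete. Assuming each tool call succeeds independently with probability p, end-to-end success over n steps is p^n, so the article's 3-5% failure rate at 30 steps implies very high per-step reliability:

```python
# If a chain of n independent tool calls fails 3-5% of the time end to end,
# each step must succeed with probability p_step = p_chain ** (1 / n).
steps = 30
for chain_success in (0.97, 0.95):  # the 3% and 5% end-to-end failure rates quoted above
    per_step = chain_success ** (1 / steps)
    print(f"{1 - chain_success:.0%} chain failure over {steps} steps "
          f"-> per-step success ~= {per_step:.4f}")

# The same compounding shows why reliability degrades quickly on longer chains:
per_step = 0.999
for n in (30, 50, 100):
    print(f"{n} steps at {per_step} per-step success -> "
          f"{per_step ** n:.1%} end-to-end success")
```

Even 99.9% per-step reliability drops below 91% end-to-end success at 100 steps, which is why human-in-the-loop checkpoints for critical decisions remain standard practice in these deployments.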