Nvidia: Saving grace

Abstract

The case examines Nvidia’s rise during 2020 and 2021 as it moved from a graphics chip specialist to a dominant force in data center computing. The analysis focuses on Nvidia’s software moat built through CUDA, its acquisition of Mellanox, and its attempt to acquire Arm. These moves supported its push into accelerated computing and its launch of the Grace CPU. The case contrasts Nvidia’s trajectory with Intel’s manufacturing delays, loss of competitive position, and its strategic shift under the IDM 2.0 plan. The document presents the competitive dynamics among Nvidia, Intel, and AMD, and outlines the strategic dilemmas facing Nvidia as regulatory pressure and renewed competition altered the structure of the semiconductor industry.

Keywords

Arm acquisition; data center; ecosystem strategy; Intel; Nvidia; semiconductor competition

In April 2021, Jensen Huang, co-founder and chief executive officer (CEO) of Nvidia Corporation, delivered the keynote address¹ for the company’s annual GPU Technology Conference (GTC) from a virtualized reconstruction of his home kitchen. Nvidia had moved from a supplier of graphics chips for PC gamers to the highest-valued semiconductor firm in the United States. Its market capitalization had surpassed Intel Corporation’s, reaching over $325 billion by January 2021.²

Nvidia reported revenue of $16.68 billion for fiscal year 2021, a 53% year-over-year increase. Its Data Center segment grew 124% to $6.70 billion.³ Intel’s results diverged. Full-year 2020 revenue of $77.9 billion remained the largest in the industry, but the Data Center Group (DCG) recorded a 16% year-over-year revenue decline in the fourth quarter, indicating competitive erosion in its highest-margin business.⁴

At the keynote, Huang announced “Grace,” Nvidia’s first data center central processing unit (CPU). Grace was built on the Arm architecture and designed to operate alongside Nvidia’s graphics processing units (GPUs) on artificial intelligence (AI) training and high-performance computing (HPC) workloads at scale.⁵ The announcement placed Nvidia in direct competition with Intel’s highest-margin server CPU business.

The strategic dilemma

Grace arrived under regulatory pressure. Nvidia’s $40 billion proposed acquisition of Arm Holdings, which underpinned the CPU strategy, was under investigation in three jurisdictions. The Federal Trade Commission, the United Kingdom’s Competition and Markets Authority, and the European Commission opened in-depth reviews on the theory that Nvidia-owned Arm would have the incentive to foreclose rival access to a neutral architecture on which much of the industry depended.⁶

Intel had responded a month earlier, in March 2021, under new CEO Pat Gelsinger. The company’s “IDM 2.0” strategy committed $20 billion to new fabrication facilities and established Intel Foundry Services, a unit intended to manufacture chips for external clients and compete with the foundries on which fabless firms including Nvidia depended.⁷

Huang faced three interlinked questions.

On CPU strategy: Was Grace viable without ownership of Arm? If the acquisition failed, Nvidia would remain one of many Arm licensees, alongside Amazon’s Annapurna Labs and Ampere Computing. Did the CPU strategy require the neutrality and scale of the Arm ecosystem to displace Intel’s x86 installed base?

On Intel’s resurgence: How should Nvidia position itself toward Intel Foundry Services? Gelsinger held credibility and potential political backing under an emerging US semiconductor industrial policy. Would a recapitalized Intel emerge as a renewed competitive threat, or instead reduce Nvidia’s dependency on Taiwan Semiconductor Manufacturing Company (TSMC) by providing a second source?

On platform strategy: Had the platform approach reached its limits, requiring vertical integration through acquisitions such as Mellanox and Arm? Or was the more durable path to deepen CUDA (Compute Unified Device Architecture) as the indispensable software layer across all accelerated computing, agnostic to the underlying CPU architecture?

Huang’s decisions over the following months would determine whether Nvidia’s leadership held through the decade or whether regulatory and competitive pressure arrested its trajectory (Figures 1–5).

Figure 1. Nvidia Revenue Composition Shift.

Figure 2. CPU vs GPU Performance for AI Training Tasks.

Figure 3. CUDA Ecosystem.

Figure 4. Full Stack Integration Post-Mellanox Acquisition.

Figure 5. Pat Gelsinger's IDM 2.0 Strategy.

Understanding accelerated computing

Accelerated computing reallocates computational work between a CPU and a specialized processor, typically a GPU. CPUs deploy a small number of high-performance cores optimized for serial, single-threaded workloads.⁸

GPUs deploy thousands of smaller cores built for parallel execution and excel at matrix multiplication and other operations decomposable into simultaneous threads. For AI training, scientific simulation, and data analytics, GPU-accelerated architectures deliver 10 to 100 times the throughput of CPU-only systems.⁸ The transition from CPU-centric to heterogeneous data center architectures constitutes the most consequential shift in server design since virtualization.
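The claim that matrix multiplication decomposes into simultaneous threads can be illustrated with a minimal Python sketch. It is not drawn from the case or from Nvidia's software; the function names are hypothetical, and the thread pool merely stands in for GPU cores. Each output element depends only on one row and one column of the inputs, so every element can in principle be computed at the same time. Python threads will not reproduce GPU-scale speedups; the sketch shows only the independence of the work items.

```python
from concurrent.futures import ThreadPoolExecutor


def output_cell(a, b, i, j):
    """Compute one element of the product: row i of a times column j of b."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul_serial(a, b):
    """CPU-style approach: compute the output cells one after another."""
    rows, cols = len(a), len(b[0])
    return [[output_cell(a, b, i, j) for j in range(cols)] for i in range(rows)]


def matmul_parallel(a, b, workers=4):
    """GPU-style decomposition: every output cell is an independent task."""
    rows, cols = len(a), len(b[0])
    coords = [(i, j) for i in range(rows) for j in range(cols)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with coords.
        flat = list(pool.map(lambda ij: output_cell(a, b, ij[0], ij[1]), coords))
    # Reassemble the flat list of cells into rows.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_serial(a, b))
    print(matmul_parallel(a, b))
```

Both functions return the same matrix. The difference is only in scheduling: the serial version walks the cells in order, while the parallel version hands each cell to whichever worker is free, which is the property a GPU exploits with thousands of cores.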

The CUDA moat

Nvidia was founded in 1993 as a designer of GPUs for the PC gaming market. Its leadership recognized early that the parallel structure of GPU architecture suited scientific and research computation beyond graphics rendering.⁸

In 2006, Nvidia launched CUDA, a parallel computing platform and programming model giving developers general-purpose access to GPU compute.⁹ Programmers working in C++ or Fortran could target the GPU directly, with order-of-magnitude performance gains on parallelizable workloads. The launch preceded the deep learning surge by six years.⁸

Nvidia invested over the following 15 years in software infrastructure built on CUDA. GPU-accelerated libraries lowered the barrier to GPU programming for domain specialists. cuDNN (CUDA Deep Neural Network library) supplied optimized routines for neural network training. RAPIDS extended acceleration to data science and machine learning pipelines.¹⁰

Developer adoption compounded. Each new GPU-accelerated application drew additional users, which drew additional hardware purchases, which drew additional developer investment. Nvidia established the GTC in 2009 to coordinate this community.

By 2021, CUDA had been downloaded more than 33 million times, with 8 million downloads during 2021 alone.¹¹ Its developer base exceeded three million, and the accelerated-application catalogue ran into the thousands across medical imaging, quantum chemistry, financial modeling, and weather forecasting.¹² Switching costs were substantial. Teams holding millions of lines of CUDA code would need to port and re-optimize them for any competing architecture. Nvidia had built a self-reinforcing platform rather than a faster chip.

AI and the data center shift

The deep learning surge of the 2010s validated the CUDA bet. AI training workloads decomposed cleanly into the parallel primitives on which GPUs dominated. Nvidia’s Tesla V100 delivered deep learning training up to 47 times faster than the leading CPU in 2017.⁸ The Ampere architecture and A100 GPU, launched in 2020, widened the gap.¹³

Demand migrated accordingly. Hyperscale cloud providers including Amazon Web Services, Microsoft Azure, and Google Cloud built large fleets of Nvidia GPU-based servers to supply AI and machine learning services to enterprise customers.⁸ Firms outside the cloud sector procured Nvidia’s DGX systems, integrated hardware-software appliances for in-house AI development.⁸

The financial trajectory tracked the shift. Data Center revenue rose from $2.98 billion in fiscal 2020 to $6.70 billion in fiscal 2021, a 124% increase, and was on course to surpass the company’s gaming segment.³ Nvidia did not attempt to substitute the GPU for the CPU across all workloads. It substituted the GPU for the CPU in the fastest-growing and highest-value data center workload, AI training, and from that position reshaped data center design toward heterogeneous acceleration.
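The growth arithmetic behind these figures can be re-derived from the rounded revenue numbers quoted in the case. The following Python sketch is an illustration only; the function and constant names are hypothetical. Because the inputs are rounded to two decimals, the computed growth rate comes out slightly above the reported 124%, which was presumably calculated on unrounded figures.

```python
def yoy_growth_pct(prior, current):
    """Year-over-year growth as a percentage of the prior-year value."""
    return (current / prior - 1.0) * 100.0


def share_pct(part, total):
    """Share of a segment in total revenue, as a percentage."""
    return part / total * 100.0


# Rounded figures quoted in the case, in billions of US dollars.
DATA_CENTER_FY2020 = 2.98
DATA_CENTER_FY2021 = 6.70
TOTAL_REVENUE_FY2021 = 16.68

growth = yoy_growth_pct(DATA_CENTER_FY2020, DATA_CENTER_FY2021)
share = share_pct(DATA_CENTER_FY2021, TOTAL_REVENUE_FY2021)

print(f"Data Center growth, FY2020 to FY2021: {growth:.1f}%")
print(f"Data Center share of FY2021 revenue: {share:.1f}%")
```

On these inputs, Data Center revenue more than doubled and accounted for roughly two-fifths of total fiscal 2021 revenue, which is consistent with the statement that the segment was on course to overtake gaming.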

Intel’s manufacturing crisis

Intel’s position had three pillars. The x86 instruction set architecture held incumbent status in client and server computing. The “Intel Inside” brand held consumer recognition. The integrated device manufacturing (IDM) model combined chip design with in-house fabrication, and Intel’s sustained execution on Moore’s Law produced a reliable process-node advantage over rivals.⁸

The foundation fractured at the fabrication layer in the mid-2010s. Intel’s 14-nm to 10-nm transition ran multiple years behind schedule. The subsequent 7-nm node slipped further.¹⁴ The Moore’s Law cadence on which customers had long relied broke down.

The node delay opened the competitive field. Advanced Micro Devices (AMD), operating a fabless model with TSMC as its foundry, moved to 7-nm while Intel remained at 14-nm. Under CEO Lisa Su, AMD’s Zen-architecture EPYC server processors overtook Intel’s Xeon on core count and performance-per-watt.¹⁵

Enterprise and cloud customers shifted share. AMD’s server segment share, historically in low single digits, reached 7.1% by end-2020 and crossed 10% by end-2021.¹⁶ The displacement appeared in Intel’s financial disclosures. DCG revenue fell 16% year-over-year in the fourth quarter of 2020, with management attributing the decline to both volume losses and average selling price compression from competition.⁴ Intel’s core competency had become a core rigidity. Vertical integration of design and fabrication, once the source of its process-node advantage, denied the firm access to the frontier nodes available on TSMC. The internal bottleneck admitted two competitive moves: AMD’s direct contest in x86 server CPUs, and Nvidia’s flanking move through accelerators.

Nvidia’s full-stack strategy

As Intel managed its internal constraints, Nvidia moved from selling components to architecting the data center stack end-to-end.

The $6.9 billion acquisition of Mellanox Technologies, closed in April 2020, brought the high-speed networking required for distributed AI training across multiple GPUs into Nvidia’s portfolio.¹⁷ Mellanox’s InfiniBand and Ethernet fabrics carried the low-latency, high-throughput inter-node communication on which multi-GPU training at scale depended. Integrating networking with compute allowed Nvidia to sell data-center-scale platforms rather than component chips, reducing exposure to third-party networking holdups. Mellanox accounted for 10% of Nvidia’s total revenue in fiscal 2021.¹⁸

With networking integrated, Huang moved on to the CPU layer. In September 2020, Nvidia announced its intention to acquire Arm from SoftBank for $40 billion. Huang framed the combination as one designed to “create the leading computing company for the age of AI,” joining Nvidia’s AI platform to Arm’s energy-efficient CPU ecosystem. Nvidia’s addressable developer base would expand from its own two million to the more than 15 million developers building on Arm.¹⁹

The announcement precipitated regulatory action. Arm’s licensing business rested on architectural neutrality. The firm licensed CPU designs to hundreds of customers, including Nvidia’s direct competitors Qualcomm, Samsung, and Apple. Regulators concluded Nvidia-owned Arm would have both the capability and the incentive to foreclose rival access. The US Federal Trade Commission (FTC) sued to block the acquisition, citing competition harms in data center and automotive chip markets.²⁰ The UK’s Competition and Markets Authority (CMA) opened an in-depth inquiry on competition and national security grounds.²¹ The European Commission opened its own investigation, stating the deal “could lead to restricted or degraded access to Arm’s IP, with distortive effects in many markets.”²²

Intel strikes back: IDM 2.0

Intel’s counter-strategy followed. In February 2021, the board appointed Pat Gelsinger as CEO. Gelsinger returned to Intel after nearly a decade at VMware and held technical credibility from his earlier career at the firm. In March 2021 he introduced “IDM 2.0,” organized around three components:

Internal manufacturing: Intel committed $20 billion to build two new fabs in Arizona and reaffirmed that its internal factory network would continue to produce the majority of its products.⁷

External foundries: Intel would expand third-party foundry use, including TSMC, for certain products and modular “tiles,” beginning in 2023. Tiles are chiplets combined in advanced packaging to form a single processor, permitting the mixing of different process nodes within one device.⁷

Intel Foundry Services (IFS): Intel would offer foundry capacity to external fabless customers, positioning IFS as a Western alternative to the Asian foundries on which the fabless industry had concentrated. A research partnership with IBM supported the process-technology build-out.²³

IDM 2.0 addressed three problems simultaneously: the internal process-node deficit, the modular-chiplet architecture trend reshaping design economics, and the fabless model Intel had historically treated as a secondary market.²³

Resolving the dilemma

Three choices structured the decision. First, Grace committed Nvidia to CPU competition, but its viability without Arm ownership remained open. A regulatory block would leave Nvidia a licensee competing with other Arm customers. Differentiation would rest on vertical integration of Grace with Nvidia’s GPU and networking stack, and on CUDA’s ecosystem lock-in rather than on architectural control.

Second, Intel’s repositioning under Gelsinger presented both risk and optionality. IFS targeted the fabless model, but Nvidia was in a position to deploy it as second-source capacity to reduce TSMC dependency. The more immediate threat was Intel’s potential entry into AI accelerators, together with a regained process-node position, which would compress the performance margin that had anchored Nvidia’s growth.

Third, Nvidia’s platform strategy had performed. The CUDA ecosystem produced switching costs that held even as competitors built capable hardware. Extending the software stack across multiple CPU architectures, x86, Arm, and eventually RISC-V, would entrench Nvidia as the accelerated computing layer under any CPU regime. This approach was more defensible than vertical integration if the Arm deal failed.

The decision reduced to vertical integration versus platform expansion. Vertical integration, via Arm, offered control of a critical architectural layer at the cost of regulatory exposure and industry backlash. Platform expansion, via a CUDA stack portable across CPU architectures, offered resilience at the cost of sustained software investment and ecosystem coordination. Huang’s execution of the gaming-to-AI transition suggested the firm held the capability to pursue either path. The choice over the following months would determine whether Nvidia’s leadership compounded or whether combined regulatory and competitive pressure arrested it.

Footnotes

ORCID iD

Rushi Anandan Karichalil

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Notes

Author biography

Rushi Anandan Karichalil is an Associate Professor in the General Management Department at KJ Somaiya Institute of Management.