
One acquisition made Nvidia worth $4 trillion

2025-09-19


It is generally recognized that Nvidia has two moats: CUDA and NVLink. Judging from the most recent quarter's results, however, without a roughly $7 billion acquisition made years ago, there might be no chip giant with a $4 trillion market value today.

Following the release of its second-quarter earnings report, much attention has focused on whether the chipmaker's revenue can continue to justify its meteoric market capitalization. However, behind the headlines, one business segment has stood out: its networking business. Analysts believe it will be the under-the-hood engine driving the company's transformation into a $4 trillion behemoth.

Data shows that this networking business now accounts for roughly 16% of Nvidia's overall revenue, and its importance arguably goes well beyond that share. Networking revenue surged 46% quarter-over-quarter and nearly doubled year-over-year, reaching $7.25 billion in the second quarter alone. In other words, in just the last quarter, the business built on the Mellanox acquisition generated more revenue for Nvidia than the acquisition cost. That puts the division's annual run rate at $25 billion to $30 billion, a remarkable figure for a unit once considered a supporting player to Nvidia's flagship graphics processors.
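To see how the reported figures hang together, here is a minimal back-of-the-envelope sketch in Python; the implied prior-quarter and year-ago values are derived from the growth rates quoted above rather than reported directly.

```python
# Back-of-the-envelope check of the networking figures quoted above.
q2_networking_rev = 7.25e9           # Q2 networking revenue, USD
qoq_growth = 0.46                    # 46% quarter-over-quarter
yoy_growth = 0.98                    # "nearly doubled" year-over-year
mellanox_price = 6.9e9               # 2019 acquisition price

prior_quarter = q2_networking_rev / (1 + qoq_growth)      # implied, ~ $5.0B
year_ago_quarter = q2_networking_rev / (1 + yoy_growth)   # implied, ~ $3.7B
annual_run_rate = 4 * q2_networking_rev                   # ~ $29B, inside the $25-30B range

print(f"implied prior quarter:    ${prior_quarter/1e9:.2f}B")
print(f"implied year-ago quarter: ${year_ago_quarter/1e9:.2f}B")
print(f"annualized run rate:      ${annual_run_rate/1e9:.1f}B")
print(f"one quarter now exceeds the Mellanox price: {q2_networking_rev > mellanox_price}")
```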

This achievement is largely due to the $6.9 billion acquisition of Mellanox.

The unsung hero of Nvidia's success

In the past few years, discussions of challenging NVIDIA have frequently focused on software and networking, in addition to computing. For example, UALink, a consortium formed in recent years, is an effort to break through NVIDIA's barriers. The underlying reason is that single chips or single racks are unable to meet the rapidly increasing demand for AI computing power, prompting the urgent need for scale-up and scale-out.

Nvidia states that, due to physical limitations such as energy supply and chip density, today's data centers are approaching the limits of what a single facility can provide. The new Spectrum-XGS platform addresses obstacles, such as long latency, that have previously prevented independent facilities from operating as a unified system.

Nvidia CEO Jensen Huang also emphasized on a previous earnings call: "We have Spectrum-XGS, giga-scale, which can connect multiple data centers and multiple AI factories into a super factory, a massive system. This is precisely why Nvidia is investing so much energy in networking. As we mentioned earlier, Spectrum-X is already a considerable business, and it has only existed for about a year and a half. Therefore, Spectrum-X is a home run."

Earlier, Nvidia's Israeli unit, built on the Mellanox acquisition, announced a technological breakthrough that enables geographically distant data centers to operate as if they were in a single location, effectively creating "AI factories" at scale and significantly increasing the maximum computing power available to the industry.

The company stated in a press release: "With advanced, auto-adjusting distance congestion control, precise latency management, and end-to-end telemetry, Spectrum-XGS Ethernet nearly doubles the performance of the NVIDIA Collective Communications Library (NCCL), accelerating multi-GPU and multi-node communications to deliver predictable performance in geographically distributed AI clusters. This allows multiple data centers to operate like a single AI superfactory, fully optimized for long-distance connectivity."
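For context, the collectives NCCL accelerates are the all-reduce and all-gather operations that keep GPUs synchronized during training. The sketch below, a minimal example assuming a machine with CUDA GPUs and PyTorch installed and launched via torchrun, shows the kind of multi-GPU traffic pattern in question; it illustrates the primitive itself, not Nvidia's benchmarks.

```python
# Minimal all-reduce over the NCCL backend using PyTorch's distributed API.
# Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_demo.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")   # NCCL carries the GPU-to-GPU collective
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Each rank contributes a tensor; all-reduce sums them across every GPU.
    x = torch.full((1024, 1024), float(rank), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    # After the collective, every rank holds the identical summed result.
    print(f"rank {rank}: reduced value = {x[0, 0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```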

As Jensen Huang said, "This is exactly why NVIDIA acquired Mellanox 5.5 years ago."

Founded in 1999 by Eyal Waldman, Mellanox pioneered InfiniBand interconnect technology. At the time of NVIDIA's acquisition, its InfiniBand and high-speed Ethernet products were used in more than half of the world's fastest supercomputers and in many leading hyperscale data centers.

Mellanox went public in 2007 and first surpassed $1 billion in annual sales in 2018, the same year it posted a record GAAP net income of $134.3 million. Prior to the acquisition, the company had been profitable in 10 of the preceding 13 years and free-cash-flow positive since 2005.

Mellanox and NVIDIA also have a long history of collaboration and joint innovation. NVIDIA opened a design center in Israel in 2016 and an AI research center in 2018. The company has previously pledged to "continue investing in the exceptional local talent in Israel, one of the world's foremost technology hubs."

Eyal Waldman previously stated in a podcast, "I believe the synergy between the processor (the brain) and network connectivity is what took Nvidia from a $93 billion company to the $4 trillion company it is today." He further noted that ChatGPT wouldn't exist without Mellanox's InfiniBand:

"OpenAI has been buying state-of-the-art products from us. Without this connectivity, they wouldn't be able to achieve the data processing speeds required for AI," Eyal Waldman said. "This is the most important merger and acquisition in the industry's history," Eyal Waldman emphasized.

Network connectivity is more important than ever

Gilad Shainer, Nvidia's senior vice president of networking, recalled in an interview with HPCwire that Mellanox at the time wasn't building individual networking components. The company primarily built complete end-to-end infrastructure centered on InfiniBand: network cards and switches, the connectivity between them, and all the software on top of them, a complete platform.

"It's a complete infrastructure. InfiniBand is designed specifically for distributed computing applications. Therefore, it's widely used in HPC and scientific computing. All large-scale cluster simulations use InfiniBand because it's designed for disaggregated computing and offers extremely low latency. InfiniBand ensures that all nodes have effective bandwidth. Jitter is something everyone wants to minimize," Shainer continued.

As he explained, InfiniBand was a great technology for HPC, and when AI began to emerge it turned out to be another example of distributed computing. There are some differences between AI and scientific computing workloads; you could argue, for instance, that scientific computing is somewhat more latency-sensitive than AI training was in its early days.

"Nanosecond latency is less critical for training, but it still requires significant effective bandwidth," Gilad Shainer noted. He noted that we now consider inference a primary element of AI. Inference relies on latency because you need low latency. Therefore, AI and high-performance computing (HPC) essentially have the same requirements. This is where infrastructure becomes even more important.

Gilad Shainer said that an interesting observation when comparing HPC with AI is that, in HPC, computing power increases with each generation. However, the scale of data centers remains constant. Typically, a data center has a few thousand nodes, and you can get telemetry data from each node, but the scale remains the same.

When it comes to AI, the requirements are much higher. It's not just that computing power per server, and per new GPU, keeps increasing; the scale of the infrastructure itself has grown significantly.

A few years ago, people were talking about 16,000 or even 30,000 GPUs, which already looked massive next to typical HPC clusters. Today, plans of that size are largely shelved. Large-scale infrastructure now means hundreds of thousands of GPUs, with deployments reaching 200,000, and cloud providers are discussing moving to millions of GPUs within a few years. This isn't just a matter of computing power; it's also about infrastructure scale, and reaching that scale requires a properly scalable network and scalable infrastructure. The data center has become the unit of compute: not just a box, but an entire data center.

"The data center is the network. The network will define how the GPUs function as a single computing element; otherwise, it would just be a cluster of GPU servers. This is why NVIDIA acquired Mellanox. And this is where infrastructure becomes increasingly important," said Gilad Shainer.

With this in mind, NVIDIA is on a constant cadence of new data center launches: new GPUs, new compute engines, new switches, and new infrastructure. Every year, new data centers are commissioned, providing ever-more powerful capabilities for AI applications, both for training and for high-volume inference. These new systems are spawning a vast array of AI frameworks and applications around the world.

CPO is the general trend

As everyone has mentioned, today's infrastructure consists of the multiple domains a data center requires. In addition to scale-out (connecting servers), GPUs need to be scaled up, combining GPUs to form a larger virtual GPU. To achieve this larger virtual GPU, massive bandwidth is required between the individual GPUs, and this is where NVLink comes in if you want them to appear as a single entity. This capability represents the scale-up side of system networking.

NVLink needs to support massive bandwidth, 9 or even 10 times that of scale-out, along with very low latency. Therefore, the Mellanox team brought the Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) to NVLink, enabling scale-up: the rack becomes a single unit, and more and more GPUs can fit into that rack.
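To make in-network reduction concrete, here is a toy Python model (not Nvidia's implementation) contrasting a naive host-side all-reduce, where every rank gathers all of its peers' vectors and sums them locally, with a SHARP-style approach in which a switch-resident reducer aggregates once and sends each rank only the result.

```python
import numpy as np

N_RANKS, VEC = 8, 1_000_000                    # toy cluster: 8 ranks, 1M-element gradients
rng = np.random.default_rng(0)
grads = [rng.standard_normal(VEC) for _ in range(N_RANKS)]

def naive_host_allreduce(grads):
    """Every rank fetches every peer's vector and reduces locally."""
    bytes_into_each_rank = (len(grads) - 1) * grads[0].nbytes
    return np.sum(grads, axis=0), bytes_into_each_rank

def sharp_style_allreduce(grads):
    """A reducer in the switch sums the vectors; each rank sends one vector, receives one."""
    bytes_per_rank = 2 * grads[0].nbytes       # one vector up, one result down
    return np.sum(grads, axis=0), bytes_per_rank

r1, traffic_naive = naive_host_allreduce(grads)
r2, traffic_sharp = sharp_style_allreduce(grads)
assert np.allclose(r1, r2)                     # same numerical result either way
print(f"naive host-side reduce: {traffic_naive / 1e6:.0f} MB per rank")
print(f"SHARP-style reduce:     {traffic_sharp / 1e6:.0f} MB per rank")
```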

In the future, NVIDIA plans to deploy 576 GPUs in a single rack. That represents a massive amount of compute and requires the infrastructure within the rack to scale up accordingly. The company is also working to keep this connectivity inside the rack in order to maximize the use of copper cabling. From Nvidia's perspective, once you have the massive bandwidth needed to move data between components, you want to deliver it in the most cost-effective way possible, and within the rack copper cabling is the most efficient way to carry those connections.

But you can't stop there, because now you need to connect these racks together. You're talking about hundreds of thousands of GPUs, 200,000 or more, working as a single unit, and some customers might want 500,000 or even 1 million GPUs.

Now, due to the longer distances, we need to build a fiber-based scale-out infrastructure, but it must have the same characteristics as the OFED layer, including effective bandwidth and determinism.

From Nvidia's perspective, InfiniBand is still considered the gold standard for scale-out infrastructure. Anything you build that isn't InfiniBand can be compared to InfiniBand, because InfiniBand is the gold standard for performance.

From Gilad Shainer's perspective, scaling systems is one aspect of AI. Data centers are growing dramatically in size every year, which means more bandwidth between racks and more compute per rack, and therefore more bandwidth carried over every link. Shainer also pointed out that far more fiber connections now need to be deployed, and this suddenly impacts the power budget. "In AI data centers, the limiting factor isn't space or budget, but how much power can be brought in," he said.

As Shainer noted, fiber connections between racks consume a lot of power, which cuts into the number of GPUs that can fit in a rack. As a result, the fiber network starts to consume close to 10% of the computing power budget, a significant number. So one question worth asking is whether there is a way to reduce the power consumption of the fiber network. It isn't only that a larger data center means more of everything to install: GPUs, network cards, cables, transceivers, switches, and all the configuration that goes with them. The fastest-growing component is the number of optical transceivers: each GPU needs roughly six of them, so 100,000 GPUs means 600,000 transceivers.
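A rough sketch of that arithmetic follows; only the six-transceivers-per-GPU ratio and the 100,000-GPU scale come from the article, while the per-transceiver and per-GPU wattages are illustrative assumptions.

```python
# Back-of-the-envelope optics power estimate for a large GPU cluster.
gpus = 100_000
transceivers_per_gpu = 6            # ratio cited above
transceiver_watts = 20              # assumed pluggable-optic power draw (illustrative)
gpu_watts = 1_200                   # assumed per-GPU power draw (illustrative)

transceivers = gpus * transceivers_per_gpu          # 600,000
optics_mw = transceivers * transceiver_watts / 1e6  # megawatts spent on optics
compute_mw = gpus * gpu_watts / 1e6                 # megawatts spent on GPUs

print(f"transceivers needed:       {transceivers:,}")
print(f"optics power:              {optics_mw:.1f} MW")
print(f"optics share of GPU power: {optics_mw / compute_mw:.0%}")   # ~10% with these assumptions
```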

As you know, these transceivers are sensitive to dust and may need to be replaced when they fail. With so many more of them deployed, replacements in the data center become correspondingly more frequent.

Therefore, Nvidia believes the next big step in data center infrastructure is taking optical connectivity to the next level. That means integrating the optics, today a separate pluggable transceiver sitting outside the switch, directly into the switch package.

If the optics and the switch silicon sit in one package, there is no need to drive electrical signals out to external transceivers, so less power is spent pushing optical signals through the switch. In this case, power consumption can be reduced by nearly four times, and roughly three times more GPUs can fit in the same network.
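Continuing the rough numbers from the earlier sketch, here is a hedged estimate of what a roughly 4x reduction in per-link optics power saves at cluster scale; the 600,000-link count and 20 W figure are the same illustrative assumptions as before.

```python
# Illustrative savings from a ~4x reduction in per-link optics power.
links = 600_000                  # from the earlier 100,000-GPU example
pluggable_watts = 20             # assumed per-transceiver draw (illustrative)
cpo_watts = pluggable_watts / 4  # ~4x reduction attributed to co-packaged optics

saved_mw = links * (pluggable_watts - cpo_watts) / 1e6
print(f"optics power saved: {saved_mw:.1f} MW")   # headroom that can go to more GPUs
```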

So, Nvidia is pushing to integrate silicon photonic engines, or optical engines, into the switch, eliminating the need for those external transceivers.

As Gilad Shainer noted, co-packaged optics (CPO) is not a new concept. There have been attempts in the market, and some switch systems have tried to use CPO, but none reached full-scale production with yields good enough to be cost-effective at scale. There are many reasons for this. One is that the technology was unproven, resulting in low yields. Another is that earlier optical engines were built with technology that made them physically large: on a large switch, you simply couldn't fit all of those optical engines due to size constraints. Getting there required new packaging and even new laser technologies.

These achievements are also closely tied to Nvidia's acquisition of Mellanox.

Final Thoughts

In a podcast interview, Eyal Waldman described the Mellanox sale negotiations as a "battleground" between Intel, Nvidia, and other companies, with the connection to Jensen Huang (Nvidia's CEO) ultimately making the outcome a natural one. "From the beginning, we knew this was the direction. In 2019, Intel's market capitalization far surpassed Nvidia's, and just a year later, Nvidia overtook it. Since then, its stock price has skyrocketed thanks to its successful bet on AI," Waldman emphasized.

With the acquisition of Mellanox, Nvidia established in Israel its largest research and development presence outside the United States. According to reported figures, the chip giant has over 5,000 employees across seven R&D centers in the country, where it also develops central processing units (CPUs) for data centers, system-on-chips (SoCs) for robots and automobiles, and algorithms for self-driving cars.

All told, this was an extraordinarily important deal for Nvidia.

Source: Semiconductor Industry Observer

