
It has come to pass, as these things invariably do, that the feverish rush to create intelligence – a most ambitious undertaking, fraught with the potential for both immense profit and utter ruin – has begun to yield to a more… practical concern. The training of these digital golems, you see, requires prodigious sums. But the using of them? That, my friends, is where the truly substantial coin will be made. Reports whisper of an inference market swelling from a respectable, if unremarkable, $106 billion to a sum approaching $255 billion by the year 2030. A figure, I confess, that requires a strong cup of tea to fully contemplate. It is not the creation, but the constant asking that will drain the coffers.
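For the arithmetically inclined, the implied pace of that swelling can be checked in a few lines. A minimal sketch, assuming the $106 billion figure describes the present, taken here as 2025 (an assumption; the article does not name the starting year):

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by moving from
    start_value to end_value over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1

# Article's figures: $106 billion today, approaching $255 billion by 2030.
# Assumption: a 2025 starting point, i.e. five years of growth.
growth = cagr(106e9, 255e9, 2030 - 2025)
print(f"Implied annual growth: {growth:.1%}")  # roughly 19% per year
```

A brisk clip, though hardly unheard of in this peculiar trade.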
Nvidia: The Magician and His Assistants
Nvidia, a name now uttered with the same reverence (and, I suspect, the same anxieties) as alchemists of old, is, of course, at the heart of this burgeoning spectacle. They are known, naturally, for supplying the engines upon which these digital minds – the large language models, the LLMs, as they are called – are forged. But to believe they will rest on this accomplishment alone would be a grave error. No, they have turned their gaze to the art of inference – the relentless answering of questions, the tireless processing of data. Their Nvidia Inference Microservices, prebuilt and optimized, are like a legion of diligent clerks, each attending to a specific query.
And then there is the matter of the Blackwell GB300 Ultra, and the upcoming Vera Rubin platform. These are not mere refinements, mind you, but alterations to the very fabric of computation. One might even say they are attempting to coax a more… agreeable response from the silicon. But the true stroke of genius? The acquisition of Groq’s employees and the licensing of their peculiar technology. Groq, you see, had dared to envision a different path – a chip designed specifically for inference, a Language Processing Unit, or LPU, as they call it. A most curious device, and one Nvidia intends to integrate into its CUDA software platform. A digital digestion, if you will. I would not, therefore, dismiss Nvidia from your considerations. They are, after all, masters of illusion, and a most persistent lot.
Advanced Micro Devices: The Quiet Competitor
Now, Nvidia’s dominance is not absolute. The moat, while formidable, is not quite as wide in the realm of inference as it is in the training of these digital beasts. This, naturally, creates an opening. Advanced Micro Devices, a company often overshadowed, has been quietly carving out a niche for itself. They are, if you will, the meticulous craftsman, building solid foundations while others chase fleeting glories. The growth of the inference market will undoubtedly benefit them, particularly given their comparatively modest revenue base. A small ship can turn more quickly than a grand galleon, you see.
The investment from OpenAI, and their commitment to utilize 6 gigawatts of AMD’s GPUs, is a matter of some significance. Six gigawatts! A figure that requires a moment of quiet contemplation. Benchmarked against the going rate for Nvidia’s GPUs, that commitment translates to approximately $35 billion. A most substantial sum, even for a company accustomed to dealing with such magnitudes. OpenAI intends to employ these GPUs for inference, which may, in turn, open doors to further contracts. And let us not forget the importance of Central Processing Units, or CPUs, in the age of “agentic AI.” These are, after all, the brains of the operation, and becoming increasingly vital. Between the rising demand for AI inference and data center CPUs, AMD appears well-positioned for the future, though perhaps lacking the… theatrical flair of its competitor.
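Those two figures, taken together, imply a certain dollar value per gigawatt of committed capacity. A back-of-the-envelope sketch using only the numbers quoted here (the per-gigawatt result is derived, not sourced):

```python
# Article's figures: a 6-gigawatt GPU commitment valued at roughly $35 billion.
total_value_usd = 35e9
committed_gigawatts = 6

# Derived, not sourced: implied dollars per gigawatt of committed capacity.
usd_per_gigawatt = total_value_usd / committed_gigawatts
print(f"Implied value per gigawatt: ${usd_per_gigawatt / 1e9:.1f}B")
```

A tidy sum per gigawatt, and a useful yardstick when the next multi-gigawatt announcement inevitably arrives.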
Broadcom: The Architect of Silicon
As companies scramble to reduce the costs associated with this burgeoning infrastructure, they are turning, increasingly, to Application-Specific Integrated Circuits, or ASICs. These are custom chips, hardwired for specific tasks, and, as such, perform those tasks with remarkable efficiency – and, crucially, with reduced energy consumption. This is particularly important in inference, where the costs are ongoing, and every query demands a portion of the power supply.
Broadcom, a name whispered with reverence among the engineers and architects of silicon, is a leader in this field. They provide the building blocks, the essential components that allow companies to translate their designs into physical reality. They also maintain crucial relationships with memory manufacturers and foundries, securing the necessary components and manufacturing capacity. A most intricate web of dependencies, I assure you.
Broadcom assisted Alphabet in the design of their Tensor Processing Units, or TPUs, a feat of engineering in its own right. And now, Alphabet is allowing customers to deploy these TPUs through Google Cloud. Anthropic has already placed a $21 billion TPU order with Broadcom, while a significant portion of Alphabet’s $180 billion in capital expenditures will likely be devoted to TPUs as well. And, of course, Broadcom is attracting new ASIC customers, including OpenAI, who have committed to 10 gigawatts of chips. A truly staggering number. With the inference market poised for a surge, Broadcom appears destined to be one of the biggest winners in this peculiar, and increasingly profitable, game.
2026-02-26 01:32