HBM: The Unavoidable Bandwidth Wall in the AI Computing Chain

GPUs determine how fast the compute can run; HBM determines how fast data can be fed. When models get big enough, the bottleneck shifts from 'how fast can we compute' to 'can we feed the data fast enough.'

2026.04.04 · 9 min read · Original

Industry Research · MINTOVIEW · 2026.04.04

Industry Research: Storage Series, Part 2. Part 1 explained why this storage cycle isn't like the old ones, and kept circling back to one word: HBM. This piece zooms in on HBM alone, because once you understand it, you understand where this storage cycle's real power comes from.

1. First, let's be clear: What problem does HBM actually solve?

To get why HBM suddenly matters so much, you need to understand a counterintuitive fact about AI compute—in many cases, GPUs aren't bottlenecked by 'compute power,' they're bottlenecked by 'bandwidth.'

A flagship NVIDIA GPU has staggering theoretical compute. But to actually use that compute, data must be fed in fast. During training or inference, large models require huge volumes of parameters and data to be moved repeatedly from memory into the compute units and back. If that 'movement' can't keep up, even the strongest compute idles: the units sit waiting for data.

This is what the chip industry has called the 'memory wall' for decades: compute grows rapidly every year, but the speed at which data moves from memory to compute grows much slower. The gap keeps widening, and in the era of large AI models, this wall has become the decisive bottleneck.
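One common way to put numbers on the memory wall is the roofline model: achievable throughput is the lower of peak compute and (memory bandwidth × FLOPs performed per byte moved). The sketch below is only an illustration. The 1000 TFLOP/s and 4 TB/s figures are hypothetical, not the specs of any real GPU.

```python
def attainable_tflops(peak_tflops: float, bandwidth_tb_per_s: float,
                      flops_per_byte: float) -> float:
    """Roofline model: throughput is capped by compute or by memory traffic,
    whichever limit is lower."""
    # TB/s * FLOP/byte = TFLOP/s that the memory system can sustain
    memory_bound_tflops = bandwidth_tb_per_s * flops_per_byte
    return min(peak_tflops, memory_bound_tflops)


# Hypothetical accelerator (assumed numbers, for illustration only)
PEAK_TFLOPS = 1000.0
BANDWIDTH_TB_PER_S = 4.0

for intensity in (2, 50, 250, 1000):  # FLOPs performed per byte moved
    achieved = attainable_tflops(PEAK_TFLOPS, BANDWIDTH_TB_PER_S, intensity)
    share = achieved / PEAK_TFLOPS
    print(f"{intensity:>5} FLOP/byte -> {achieved:7.1f} TFLOP/s ({share:.1%} of peak)")
```

At 2 FLOPs per byte, this hypothetical chip can sustain only 8 of its 1000 TFLOP/s. It reaches full speed only once each byte fetched is reused for 250 or more operations. Low-reuse workloads sit on the bandwidth-limited side of that line, which is where faster memory pays off directly.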

HBM (High Bandwidth Memory) was born to smash through this wall. Its core idea is brutally simple—since a single channel doesn't provide enough bandwidth, stack many layers of DRAM chips vertically, use an extremely wide channel, place them right next to the GPU, and let data 'flow in parallel, at close proximity' to the compute unit.

Think of it this way: regular DRAM is like a two-lane highway to a factory. HBM is like laying out dozens of lanes side by side, and moving the warehouse next door to the factory. The same 'goods' (data) are moved, but throughput differs by orders of magnitude.
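The "number of lanes" in that analogy maps onto a simple formula: peak bandwidth is interface width times per-pin data rate. Here is a minimal sketch. The widths and pin speeds are illustrative assumptions (a 64-bit DIMM-style interface versus a 1024-bit HBM3E-class stack), not vendor specifications.

```python
def peak_bandwidth_gb_per_s(interface_width_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: data pins times per-pin rate, divided by 8 bits per byte."""
    return interface_width_bits * gbps_per_pin / 8


# Illustrative figures (assumptions, not taken from the article)
narrow = peak_bandwidth_gb_per_s(64, 6.4)    # 64-bit interface at 6.4 Gbps per pin
wide = peak_bandwidth_gb_per_s(1024, 9.6)    # 1024-bit stack at 9.6 Gbps per pin

print(f"narrow interface: {narrow:8.1f} GB/s")
print(f"wide interface:   {wide:8.1f} GB/s")
print(f"ratio:            {wide / narrow:8.1f}x")
```

Per pin, the two are in the same ballpark. Almost all of the gap comes from width, and that is exactly what stacking and placing the memory next to the GPU make possible. A GPU package also carries several stacks, which multiplies the gap again.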

So the essence of HBM is not 'more memory' but 'a faster data pipeline.' In an era when AI has 'abundant compute but scarce bandwidth,' whoever builds the fastest pipeline controls the throat of the entire compute chain. That's why storage—a supporting role for thirty years—has finally taken center stage.

2. Why is it so expensive and so hard to make?

If HBM were just 'stacking DRAM layers,' it wouldn't have the moat it does today. The real difficulty involves three things, all of which together determine its high price and high barrier to entry.

First, vertical stacking + TSV (Through-Silicon Via) is a manufacturing nightmare.

HBM requires stacking 8, 12, even 16 layers of DRAM chips vertically, and connecting them using Through-Silicon Vias (TSVs)—microscopic vertical holes drilled through the silicon to conduct signals. This demands extremely high yield rates per layer—because if any single layer fails, the entire stack is junk. The combined yield for a 12-layer stack is the product of each layer's yield, a brutally unforgiving math problem. That's why HBM is the most 'manufacturing-intensive' storage product—it's not something you can just throw money at and build.
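To see how unforgiving that multiplication is, here is a minimal sketch. It assumes every layer has the same yield and fails independently, and it ignores real-world mitigations such as pre-testing dies or built-in repair. The per-layer yields are hypothetical.

```python
def stack_yield(per_layer_yield: float, layers: int) -> float:
    """Probability that every layer in the stack is good, assuming
    identical and independent per-layer yields (a simplification)."""
    return per_layer_yield ** layers


# Hypothetical per-layer yields, for illustration only
for layers in (8, 12, 16):
    for per_layer in (0.99, 0.95, 0.90):
        combined = stack_yield(per_layer, layers)
        print(f"{layers:>2} layers at {per_layer:.0%} per layer -> {combined:.1%} good stacks")
```

Under these assumptions, a 95% per-layer yield, which sounds respectable, leaves only about 54% of 12-layer stacks usable, and the number keeps falling as the stack grows taller.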

Second, it consumes three times the wafer and squeezes everything else.

I mentioned this number in the first piece, but here's the full implication: to manufacture the same capacity of HBM, you use roughly 3x the wafer area of regular DRAM (because of stacking, TSVs, and larger die area). The knock-on effect: every unit of HBM capacity a fab adds removes roughly three units of regular DRAM capacity from the market. This makes HBM expensive on its own, but it also creates a structural shortage in regular DRAM, driving those prices up too. HBM isn't just a high-priced product; it's a black hole that sucks the entire DRAM capacity pool dry.
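The squeeze is easy to see with toy numbers. The sketch below takes the rough 3x wafer factor at face value and assumes a fixed pool of 100 wafers, each yielding one unit of regular DRAM capacity. Both the pool size and the shares are hypothetical.

```python
WAFER_FACTOR = 3.0  # rough figure from the text: HBM needs ~3x the wafer area per unit of capacity


def capacity_output(total_wafers: float, hbm_wafer_share: float,
                    dram_units_per_wafer: float = 1.0) -> tuple[float, float]:
    """Split a fixed wafer pool between regular DRAM and HBM.

    Returns (regular_dram_units, hbm_units) of memory capacity produced.
    """
    hbm_wafers = total_wafers * hbm_wafer_share
    dram_wafers = total_wafers - hbm_wafers
    dram_units = dram_wafers * dram_units_per_wafer
    hbm_units = hbm_wafers * dram_units_per_wafer / WAFER_FACTOR
    return dram_units, hbm_units


# Hypothetical wafer shares, for illustration only
for share in (0.0, 0.1, 0.3, 0.5):
    dram, hbm = capacity_output(100, share)
    print(f"HBM wafer share {share:.0%}: regular DRAM {dram:5.1f}, "
          f"HBM {hbm:5.1f}, total {dram + hbm:5.1f}")
```

In this toy setup, moving 30% of wafers to HBM cuts regular DRAM output from 100 units to 70 while adding only 10 units of HBM, so total capacity shipped falls to 80. That shrinking total is the structural shortage.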

Third, advanced packaging is another gate, and that gate has pulled TSMC into the storage game.

Even once HBM is manufactured, it must be packaged together with the GPU—using TSMC's CoWoS or similar advanced packaging technology to 'solder' HBM and GPU dies side by side onto the same interposer. This means HBM supply is also constrained by advanced packaging capacity. For the past two years, one hidden bottleneck for AI chips has been insufficient CoWoS packaging capacity. So HBM supply is stuck at three chokepoints: DRAM wafers, TSV stacking yield, and advanced packaging capacity—if any one is insufficient, GPUs can't ship.
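The "any one is insufficient" logic is a chain of limits where the tightest one wins. A minimal sketch of that idea follows. All names and quantities are hypothetical, and a real supply chain has many more stages than this.

```python
def shippable_gpus(stacks_started: int, stack_yield: float,
                   stacks_per_gpu: int, packaging_slots: int) -> tuple[int, str]:
    """Return (GPUs that can ship, name of the binding constraint).

    stacks_started  : HBM stacks the available DRAM wafers allow (hypothetical)
    stack_yield     : fraction of stacks that survive TSV stacking
    stacks_per_gpu  : HBM stacks required on each GPU package
    packaging_slots : GPU packages that advanced packaging can assemble
    """
    good_stacks = int(stacks_started * stack_yield)
    gpus_by_hbm = good_stacks // stacks_per_gpu
    if gpus_by_hbm <= packaging_slots:
        return gpus_by_hbm, "HBM supply (wafers x stacking yield)"
    return packaging_slots, "advanced packaging capacity"


# Hypothetical scenario: plenty of good stacks, too little packaging
count, limit = shippable_gpus(stacks_started=80_000, stack_yield=0.75,
                              stacks_per_gpu=8, packaging_slots=6_000)
print(f"{count} GPUs can ship; binding constraint: {limit}")
```

Raising yield or wafer starts does nothing in this scenario until packaging capacity grows. The binding constraint can also switch from one link to another as each is expanded.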

Combine these three points and you understand: HBM's high price isn't hype—it's genuinely hard to make, genuinely scarce, and genuinely stuck at multiple bottlenecks. Its pricing reflects this: one 36GB stack of HBM3E costs around $300, and HBM4 is estimated at around $500 per stack—dozens of times the unit value of regular DRAM chips.

3. Market share landscape: A table being reshuffled by HBM4

HBM's competitive landscape differs from regular DRAM. In regular DRAM, Samsung leads (~38% share), but in the high-end HBM battlefield, the pecking order is completely different, and HBM4 is reshuffling it.

Looking at the latest HBM share landscape:

Vendor	HBM Share	Position & Momentum
SK Hynix	~62%	Absolute leader; first to complete HBM4 development (+40% power efficiency, 10Gbps); deeply tied to NVIDIA
Micron (MU)	~21%	Has overtaken Samsung; HBM4 has won volume production for NVIDIA's Vera Rubin platform; HBM annualized revenue running toward ~$8B
Samsung	~17%	Former storage king; fell behind in HBM (HBM4 yield delayed); now scrambling to catch up

This table hides the most interesting story of this cycle—Samsung, the overall storage leader, is actually the laggard in the most critical high-end battlefield, HBM; while SK Hynix, the perennial 'number two' in Samsung's shadow, has taken the top seat for the first time through its head start in HBM.

This is the power of a technology lead. SK Hynix's lead isn't marketing—it bet earlier and harder on HBM, honing its yields and customer collaboration to a level others can't match. A giant like Samsung, by contrast, suffered from slower pivots and HBM4 yield hiccups. This is a perfect case study: during a window of fast technology iteration, 'biggest scale' does not equal 'wins'; 'fastest pivot and most precise bet' does.

And HBM4 is the next decisive moment. Whoever brings HBM4 yield up first and lands the volume order for NVIDIA's next platform (Vera Rubin) will lock in market share for the next two to three years. Currently, SK Hynix leads, Micron is close behind, Samsung is chasing—but yield, once cracked, can ramp quickly, so Samsung isn't out, and this fight isn't over.

4. A key change with HBM4: TSMC officially enters the game

HBM4 brings a technical change that's easy to overlook but has profound implications—the 'base die' starts to be manufactured using logic process technology, and that pulls TSMC directly into the HBM game.

At the bottom of the HBM stack sits a base die that manages communication between the entire stack and the GPU. In past generations, this base die was made using the memory maker's own process. But for HBM4, to pack in more logic functions and boost performance, the base die is transitioning to advanced logic process technology (similar to what's used for CPUs/GPUs)—and memory makers don't have that process. They must go to foundries like TSMC.

The implications are significant:

First, HBM becomes a 'memory + logic hybrid,' not a pure memory product. Its technology stack is more complex, its moat deeper, and its price even higher.

Second, TSMC becomes a new player in the HBM chain. Whoever can secure TSMC's advanced logic capacity for base dies will have a more competitive HBM4 product. SK Hynix's collaboration with TSMC is one reason it leads in HBM4.

Third, this further solidifies the oligopoly. When HBM requires 'top-tier memory manufacturing + top-tier logic foundry + top-tier advanced packaging' working together, the number of players who can play this game only shrinks, not expands. That's good news for the three incumbents—their moats just got deeper.

In other words, HBM4 isn't just 'faster HBM'; it's an upgrade of the entire technology stack, turning this business from 'a memory maker's internal game' into 'a joint operation of memory + logic foundry + advanced packaging.' Understanding this is critical to seeing why this chain's barriers are getting thicker, not thinner.

5. For US equity investors: How to position along this chain

All this industry logic leads to a practical question—for US equities, how do you position along the HBM chain?

Let me lay out the investable US equities along this chain from 'direct' to 'indirect' (this is industry analysis, not a recommendation—your own judgment applies):

Most direct: Micron (MU). The only pure memory player among the three HBM giants that is directly investable in US equities (SK Hynix and Samsung trade in Korea). With its HBM share surpassing Samsung's, a volume production win for NVIDIA's Vera Rubin, and gross margin hitting 75% with next-quarter guidance of 81%, it is the most direct vehicle for US investors to bet on HBM. But remember, it is still the 'chaser' among the three, and its cyclical nature remains (see my doubts in Part 1).

Selling shovels: Advanced packaging and equipment. HBM is stuck at TSV and advanced packaging, so equipment and materials companies serving these links benefit regardless of who wins the memory battle. TSMC (CoWoS packaging) and a range of semiconductor equipment companies (for TSV, bonding, inspection) all gain from overall HBM capacity expansion. The logic for these names: no matter whether SK Hynix or Samsung wins, they all need to buy this equipment to make HBM.

Downstream: NVIDIA. Conversely, HBM is the 'entry ticket' and bottleneck for NVIDIA's GPUs—whether HBM is sufficient and its yield good enough directly determines how many GPUs NVIDIA can ship. So NVIDIA is both the largest customer for HBM and the ultimate driver of this chain's health. Tracking HBM is, to some extent, tracking whether NVIDIA's supply side can keep up.

My own approach to this chain: Think of it as 'selling water to gold miners,' but know which link of the water chain you're in. The pure memory player at the very source (Micron) offers the most upside but also the most cyclicality; the midstream packaging/equipment players are the most stable but offer less upside; downstream NVIDIA is the master switch for demand. The single biggest systemic risk for this chain is the same one I mentioned in Part 1—whether the AI capex well will one day run dry.

6. A final thought

Three years ago, HBM was an obscure niche within memory. Today, it is the one wall no one can bypass in the entire AI compute chain.

Its story matters not because it got more expensive, but because it reveals an industry law: on any value chain, profit flows to where the bottleneck is. When compute is abundant and bandwidth scarce, value shifts from 'computing' to 'feeding data.' When HBM becomes the scarcest, hardest-to-make component and a chokepoint for everyone, it moves from supporting actor to lead, with profit margins surpassing even those of TSMC, the top of the food chain.

But I also remind myself: bottlenecks move. Today HBM is the wall because compute is relatively abundant; if the supply logic of compute changes, or if a new architecture bypasses the extreme dependence on bandwidth, the value of this wall will change too. There are no permanent bottlenecks—only current ones. HBM rides the wave today because it happens to sit exactly at the most painful point of this AI cycle.

So if I can leave just one sentence—

HBM is not better memory; it is the throat of the AI-era compute chain. Whoever grips that throat takes the fattest profits of this cycle. But the position of the throat is never permanent.

Next up: I'll put the three giants—SK Hynix, Samsung, Micron—on one table and compare their technology roadmaps, capacity strategies, and positioning differences to see who has the best shot at winning.

—

Disclaimer: This is industry research. The companies mentioned are for analytical purposes only and do not constitute investment advice. Market risk exists; invest wisely.

明投 Minto

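Investment Analysis · Long-termist

Focused on investment analysis, market insight, and asset allocation. Not chasing short-term swings, only understanding what truly drives long-term returns.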