The Future of Chips or a Massive Mistake?

Anastasi In Tech | Published Apr 15, 2026 | 34:33 | 3K views

Description

The Tessent product portfolio from Siemens EDA offers market leading DFT and SLM solutions that deliver faster time to market by reducing design complexity using high-quality DFT, including advanced debug, safety and security features and in-life data analytics to meet the evolving challenges of today’s silicon lifecycle.

Learn more at www.siemens.com/tessent

Timestamps:

00:00 - The Genius and Madness of Terafab

17:20 - Deep Dive Intro TeraFab

21:55 - Dirty Fab

My Podcast on Apple: https://podcasts.apple.com/at/podcast/deep-in-tech/id1829970978

My Podcast on Spotify: https://open.spotify.com/show/3drr7A8j2t4rz4dFcvOxxd

Let's connect on LinkedIn: https://www.linkedin.com/in/anastasiintech/

Newsletter: https://anastasiintech.substack.com

Instagram: https://www.instagram.com/anastasi.in.tech/

Patreon: https://www.patreon.com/AnastasiInTech

Transcript

Something colossal is unfolding in the Texas plains: a 2-nanometer chip factory built from scratch with no prior experience. It's designed to produce 1 terawatt of AI chips per year, built on the most advanced transistor technology we have, gate-all-around. And this move is either genius or a very expensive trap, because even the best chipmakers on Earth, with decades of experience and hundreds of billions invested, still get it wrong. So why would they even try? And more importantly, what do they know that everyone else doesn't? I'm a chip design engineer and, oh wow, Terafab. What an interesting fab on so many levels. But the strange thing about it: the deeper you go into it, the less crazy it starts to look. And this can actually be the move that rebalances the global supply chain. Subscribe to Anastasi In Tech and let me explain. Back in 2007, around 12 companies could manufacture chips at the most advanced nodes. Today, only two to three remain: TSMC, Intel, and Samsung. And even they struggle to keep up, because at the leading edge this is not just about building the factory. Essentially you need five pillars: tools, raw materials, a clean room, hundreds of machines operating at the absolute edge of physics, and one invisible layer that holds it all together, the process that turns sand into a thinking machine. Now inside this system, some tools define everything. The most critical are EUV lithography machines. These are the machines that print chip patterns onto silicon at atomic scale. Each costs around $150 million, and a single advanced fab needs roughly 15 to 20 of them. Now scale this to Terafab. Elon talks about one terawatt of AI compute per year. Let's take the high end, the best Nvidia GPU, and run a back-of-the-envelope calculation. Let's assume 30,000 wafers per fab per month at 85% yield, meaning about 85% of the GPU dies per wafer actually work. And remember, wafers are circular, so some dies are always lost at the edges.
Under these assumptions, a single TSMC factory produces about 40 gigawatts of compute per year. Now push that to Terafab targets. To reach one terawatt of compute per year, you would need to build the equivalent of 25 semiconductor fabs in one building. That means over 300 EUV machines. And here is a constraint, because only ASML builds them and they produce about 50 machines per year. So this one project would require multiple years of global production. That's tens of billions for a single tool class. And instead of spreading that capacity across Arizona, Texas, or Utah, the idea is to compress all of it into a single place. But this is where it gets uncomfortable, because it's an extreme concentration of value. You're putting the equivalent of an entire semiconductor ecosystem into one physical location, and that creates a new kind of risk, because now if something goes wrong here it may wipe out the equivalent of multiple fabs, hundreds of millions of dollars at once. This is a new kind of scale where efficiency goes up but fragility goes up with it. And then it gets even harder, because the plan is to build all of that in one place. Right now, even chips built in the US get shipped to Taiwan for packaging. And this is a real bottleneck. That's why Terafab is such a bold move here. They're trying to build the entire semiconductor stack: logic, memory, packaging, and testing, all of it pulled into one place. And that's something the industry has avoided for decades for a reason, because each of these steps is its own world. Manufacturing logic is one kind of art. Packaging adds another level of complexity, like thermals, alignment, and stacking, and these systems don't naturally coexist, so mixing them risks yield loss. Now add memory, and this is where it breaks. There is an extreme shortage of high-bandwidth memory, which is critical for AI chips, and it totally makes sense to get your own supply. But memory is a completely different type of factory.
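The back-of-the-envelope figures quoted above (40 gigawatts per fab, a 1-terawatt target, 15 to 20 EUV machines per fab at about $150 million each, and roughly 50 machines per year from ASML) can be checked with a short sketch. The variable names are illustrative, and "over 300" machines is treated here as a floor of exactly 300; the transcript does not give dies per wafer or watts per GPU, so the script only back-solves the compute per wafer that the 40-gigawatt figure implies.

```python
import math

# Figures quoted in the transcript.
WAFERS_PER_FAB_PER_MONTH = 30_000
FAB_OUTPUT_GW_PER_YEAR = 40        # "about 40 gigawatts" per fab, after 85% yield
TARGET_GW_PER_YEAR = 1_000         # 1 terawatt
EUV_PER_FAB = (15, 20)             # "roughly 15 to 20" machines per advanced fab
EUV_STATED_TOTAL = 300             # "over 300"; assumption: treated as a floor
EUV_UNIT_COST_USD = 150_000_000
ASML_EUV_PER_YEAR = 50

# How many fab-equivalents the 1 TW target implies.
fabs_needed = math.ceil(TARGET_GW_PER_YEAR / FAB_OUTPUT_GW_PER_YEAR)

# EUV machine count if every fab-equivalent needs 15 to 20 machines.
euv_range = tuple(n * fabs_needed for n in EUV_PER_FAB)

# Years of ASML's entire output and total spend, using the stated floor.
years_of_asml_output = EUV_STATED_TOTAL / ASML_EUV_PER_YEAR
euv_spend_usd = EUV_STATED_TOTAL * EUV_UNIT_COST_USD

# Compute per wafer implied by the 40 GW figure (good dies only).
wafers_per_year = WAFERS_PER_FAB_PER_MONTH * 12
implied_kw_per_wafer = FAB_OUTPUT_GW_PER_YEAR * 1e6 / wafers_per_year

print(f"Fab-equivalents needed: {fabs_needed}")                  # 25
print(f"EUV machines at 15-20 per fab: {euv_range}")             # (375, 500)
print(f"Years of ASML output: {years_of_asml_output:.0f}")       # 6
print(f"EUV spend: ${euv_spend_usd / 1e9:.0f}B")                 # $45B
print(f"Implied compute per wafer: {implied_kw_per_wafer:.0f} kW")  # 111 kW
```

Note that 25 fab-equivalents at 15 to 20 machines each gives 375 to 500 machines, which is consistent with "over 300" and would stretch the ASML timeline further than the six years computed from the floor.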
It uses different process flows, different tools, and produces far more dies per wafer than logic. And the equipment needed for this is expensive. Each machine costs tens of millions, and a single facility needs hundreds of them. The problem is these machines are already sold out. At the same time, memory companies like SK Hynix are expanding capacity as fast as they can. This leads to one outcome: backlogs everywhere. But the core problem is not backlogs and not capital. It's orchestration. You have to align lithography, etch, deposition, metrology, inspection, packaging, and test. And even if you get all of that right, the first wafers won't be good, because a new factory doesn't start by printing money. It starts by printing defects. And the complexity of this process is insane, because lithography affects etch. Etch affects deposition. Deposition affects electrical behavior. Every step interacts. Every step adds variation. Every step can kill your yield. And learning to control that takes years. So the real game here is not just building the factory, it's learning how to make atoms behave. So when you look at Terafab, this is one of the hardest challenges in modern engineering, which makes one wonder: why would anyone even attempt something like this, despite the cost, despite the risks? And this is where things start to get really interesting. Right now, advanced node capacity is effectively sold out. TSMC's 3-nanometer wafers and upcoming nodes are booked out 3 years in advance. AI demand alone is estimated to exceed supply by at least three times. This means every wafer is a fight. You're competing with Apple, Nvidia, AMD, Broadcom, and multi-billion-dollar prepayments to secure that capacity. So they wait, and waiting means delayed products and lost momentum. So this is a real bottleneck, because right now Tesla and SpaceX design their chips but they don't manufacture them. And the moment you depend on someone else to manufacture your chips, three things happen.
You pay their margins, you depend on their capacity, and your innovation cycle becomes very long. And that's fine when chips are just components. But that world is gone. Now chips are the product. Autonomy is chips. AI is chips. Satellite communication is chips. Basically, your entire business collapses into compute. And that's where the economics start to hurt, because a chip that, let's say, costs $1 to make might sell for $2. They buy thousands for every car, every rocket, every Starlink terminal. Take a Tesla Model 3. Inside that car, you're looking at up to 3,000 chips, roughly $2,000 worth of silicon. And most of them are not AI chips. These are microcontrollers, power chips, sensors. Each one is cheap, but each one carries a 30 to 50% margin. And if you add it all up, this is where the money disappears. And then you add the brain, the AI inference chips, AI4 and AI5. These are expensive by design, built on an advanced process node, and it's about $200 per car just for this AI silicon. And that's before the rest of the system. And if you manage to move chip manufacturing in-house, you can save up to $1,000 per car, which means up to 12% higher margin per car with the same product. Now scale that. If robotaxi reaches 10 million cars per year, that same dynamic turns into $5 billion saved every year. And that's just cars. Now move to robots. The volume there is in a different league. It can grow 10 to 100 times beyond automotive, and today there are barely any purpose-built chips for this, mostly Tesla's internal stack and Nvidia's Jetson platform. So the market is wide open, and if Tesla captures even 10 to 20%, you are looking at a multi-billion-dollar impact. And when you consider that volume and the economies of scale, Terafab starts to make more and more sense. Now, here is the part no one expected. It turns out most of the output of Terafab won't be AI chips. It will go into space chips. Potentially up to 80% of the wafers go there.
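The per-car estimate earlier in this passage can be reproduced with a few lines of arithmetic. This is a sketch under one labeled assumption: the 30 to 50% supplier margin is read as a share of the selling price (matching the "$1 to make, sells for $2" example), applied to the full $2,000 of silicon per car. All names are illustrative.

```python
# Figures quoted in the transcript.
SILICON_PER_CAR_USD = 2_000
CHIPS_PER_CAR = 3_000
SUPPLIER_MARGIN_PCT = (30, 50)          # assumption: margin as a share of selling price
CARS_PER_YEAR = 10_000_000              # the robotaxi scenario
STATED_FLEET_SAVINGS_USD = 5_000_000_000


def margin_paid_per_car(silicon_usd: int, margin_pct: int) -> int:
    """Supplier margin embedded in one car's silicon bill, in whole dollars."""
    return silicon_usd * margin_pct // 100


low, high = (margin_paid_per_car(SILICON_PER_CAR_USD, p) for p in SUPPLIER_MARGIN_PCT)

# Fleet-level range if the whole embedded margin were recovered in-house.
fleet_low = low * CARS_PER_YEAR
fleet_high = high * CARS_PER_YEAR

# Per-car savings implied by the quoted $5 billion per year.
implied_savings_per_car = STATED_FLEET_SAVINGS_USD // CARS_PER_YEAR

avg_price_per_chip = SILICON_PER_CAR_USD / CHIPS_PER_CAR

print(f"Margin paid per car: ${low} to ${high}")                        # $600 to $1000
print(f"Fleet range: ${fleet_low / 1e9:.0f}B to ${fleet_high / 1e9:.0f}B")  # $6B to $10B
print(f"Implied savings per car at $5B: ${implied_savings_per_car}")    # $500
print(f"Average price per chip: ${avg_price_per_chip:.2f}")             # $0.67
```

Under this reading, the "up to $1,000 per car" figure corresponds to the 50% margin case, and the quoted $5 billion per year corresponds to recovering about $500 per car, roughly half of that upper bound.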
So where will all of that silicon end up? On Earth, we know how to build insanely powerful GPUs. But the moment you leave the planet, silicon enters a completely different reality. Space is unforgiving. It doesn't fail your hardware at once. It degrades it. In space, high-energy particles flip bits, corrupt calculations, and wear down transistors over time. So standard chips won't work. You need radiation-hardened silicon, something like the SpaceX D3 chip. It's designed to survive radiation, extreme temperature swings, and constant particle impact. And to make that possible, you don't go for advanced logic nodes like two or three nanometers. You usually go backwards at least one or two nodes, because in space reliability matters more than performance. But making these chips is way harder. First of all, fewer of them work on each wafer, and then every single one goes through testing far beyond normal checks. You literally shoot them with particles in accelerators to mimic space radiation. And then comes regulation. Rad-hard chips are treated like defense tech, so even inside the United States, approval can slow everything down. All of this adds time, cost, and complexity. And that shows up in price. A single rad-hard chip can cost around $5,000. Now push it beyond low Earth orbit into deep space, where radiation is far more intense and you need fully radiation-hardened designs, and suddenly you are at tens of thousands per chip. A single chip can cost as much as a car. A big part of the cost sits in the packaging. You need heavy shielding and careful isolation to protect the core logic of the chip from radiation, because failure is not an option. You don't want your system dying halfway to Mars. But here is the problem. At $10,000 per chip, space economics obviously collapse, and innovation in this space moves painfully slowly. Even new designs take years, and everything has to be proven, tested, certified. But the space economy can't scale like that.
So the only way out is control: controlling the manufacturing and redesigning the entire stack. That's how you break the cycle and shift the cost curve, maybe down to $300 to $500 per chip. The biggest challenge I see here is scale, because space computing is still an emerging field, and I shared my opinion in my episode about space data centers. So this idea of a megafab that mixes high-bandwidth memory, logic, packaging, and rad-hard chips is dangerous. One rad-hard process step, like high-temperature annealing, can contaminate shared tools, and suddenly you are not just risking one product but the entire fab. These space chips are exciting, but currently most of the volume is actually on the ground, in Starlink terminals. These are very silicon-hungry. If you open the terminal dish, you're looking at roughly 500 chips per unit, and that includes amplifiers, beamforming chips, and controllers. Now multiply that by millions of terminals. That's serious volume. These chips are designed by Starlink but manufactured by STMicroelectronics in Europe. And these are not cheap. You are looking at around $150 of silicon per terminal, which already makes this a multi-billion-dollar chip business every year. And that's before satellites and before rockets. And suddenly Terafab makes a lot more sense, especially when space-grade chips are a matter of national security. And if you manage to bring even part of that stack in-house, you take control of the most critical layer in the system. You secure supply, and over time you control the economics and save billions from one product line. Now if you look across all these layers of silicon, there is a chip demand explosion for cars, robots, and space. So with this explosion of chip demand the solution looks obvious: bring manufacturing in-house. But if it's so obvious, why is no one doing it? Because building advanced semiconductor capacity is extremely hard. But then again, so is launching rockets.
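The Starlink terminal figures above (roughly 500 chips and about $150 of silicon per terminal) can be turned into a quick volume estimate. The transcript only says "millions of terminals", so the 10 million terminals per year used below is a hypothetical input for illustration, not a figure from the video, and the function names are made up for this sketch.

```python
# Figures quoted in the transcript.
CHIPS_PER_TERMINAL = 500
SILICON_PER_TERMINAL_USD = 150


def annual_silicon_spend(terminals_per_year: int) -> int:
    """Total silicon bill in dollars for a given yearly terminal volume."""
    return terminals_per_year * SILICON_PER_TERMINAL_USD


def annual_chip_count(terminals_per_year: int) -> int:
    """Total number of chips consumed by that terminal volume."""
    return terminals_per_year * CHIPS_PER_TERMINAL


def terminals_for_spend(target_usd: int) -> int:
    """Smallest terminal volume whose silicon bill reaches target_usd."""
    return -(-target_usd // SILICON_PER_TERMINAL_USD)  # integer ceiling division


# Hypothetical volume, not from the transcript.
example_volume = 10_000_000

print(f"Silicon spend: ${annual_silicon_spend(example_volume) / 1e9:.1f}B")    # $1.5B
print(f"Chips consumed: {annual_chip_count(example_volume) / 1e9:.0f} billion")  # 5 billion
print(f"Terminals for $2B of silicon: {terminals_for_spend(2_000_000_000):,}")   # 13,333,334
print(f"Average price per chip: ${SILICON_PER_TERMINAL_USD / CHIPS_PER_TERMINAL:.2f}")  # $0.30
```

At these per-terminal figures, a "multi-billion-dollar" yearly silicon bill needs on the order of 13 million or more terminals per year.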
One of the first real constraints is not engineering, it's availability. EUV machines are not just expensive, they are scarce. These systems take years to build, with long backlogs. Meanwhile, players like Intel, Samsung, and TSMC have been placing orders years and years in advance. So the real question is: did they secure these machines already? Because if they didn't, the timeline breaks not by months but by years, into 2028 and beyond. And lithography machines are just one part of the story. A modern fab needs an entire catalog: deposition, etch, ion implantation, metrology, inspection, cleaning, polishing, and packaging. That's hundreds of tools, and most come with 12 to 24 months of lead time, and then months to install, months to calibrate, and years before they are actually mastered. So Terafab is not one fab, it's multiple factories merged into one facility. And this is where the difficulty starts to compound. We'll break it down in detail in a moment. But here is the part most people miss. Even if you manage all that complexity and get manufacturing working, this is only a tiny part of the story. The moment a chip leaves the factory, the real testing begins. The chip in your car, in a data center, orbiting Earth: they don't stop being tested after production. They have to be monitored continuously, every day, in the field, while they're running. And today this is not optional. Failures simply cannot happen. Not in a Tesla moving at 130 km per hour. Not in a data center training a frontier model for six months straight. And this is what Siemens EDA's Tessent does with its groundbreaking SSN and IST technology as part of its silicon lifecycle management platform, covering manufacturing test, in-field test, and silicon lifecycle management end to end for any semiconductor product. And for chips in satellites or in data centers running 24/7, this means one thing: you can stop replacing hardware on a schedule and start replacing it only when the data says so.
That saves the industry billions. Tessent is doing exactly that. Check them out through the link in the description box below. Now, bring this back to Terafab, because everything we just talked about gets exponentially harder at the bleeding edge. Even if you solve the equipment problem, you're still not safe, because Terafab is aiming straight at one of the most complex transistor architectures we've ever built: gate-all-around. This is the next big leap after FinFET. In older chips, the channel sits flat, controlled from one side. Then, when I was at university, I remember the moment when FinFET arrived, and it has powered the last decade. All the advanced devices are built on FinFET. You lift the channel up, wrap the gate around three sides, and suddenly you have much better control. It worked brilliantly until it didn't. As scaling pushed further, things started to break. Leakage exploded, control became harder, and the transistor had to be reinvented again. That's where gate-all-around comes in. But now the difficulty shifts. We've basically turned a relatively flat structure into a true 3D structure at atomic scale. And now suddenly you're building vertical stacks of nanosheets, each only a few nanometers thick. And here every layer thickness, every spacing, every etch profile has to be controlled with atomic precision across billions of devices. And this is where things start to break. At this scale, even tiny variations turn into performance loss or yield loss. Gate-all-around in general adds more steps and multiplies the number of ways things can go wrong. And this is where most fabs start to struggle. And within this entire, very complicated flow, there is one especially dangerous moment: when you release the nanosheets and form the gate around them. If the inner spacers are slightly off or the gate doesn't fully wrap the structure, defects explode. And these are not easy to detect. They are buried deep inside this 3D structure.
You often only see it later, during stress tests or, even worse, in the field, which makes it extremely difficult. And this is the bigger picture. The challenge is not getting the machines. It's making this entire chain and the recipe behave consistently. Even TSMC approaches this carefully. They already have gate-all-around wafers running in their Taiwan mother fab. But in the new fab in Arizona, they start with older, more mature process nodes. They ramp up step by step, starting with FinFET. So from breaking ground to building advanced gate-all-around transistors at TSMC Arizona, it takes about 5 to 6 years, and they are not reinventing anything here. They're just copying the process from Taiwan to Arizona desert conditions, and that alone tells you how hard this is. Now imagine doing this at Terafab scale: basically starting from scratch at the most advanced node, trying to stabilize atomic-level precision across thousands of steps in a brand new fab with no prior knowledge. That is a harsh starting point. One way to derisk it is to partner. For example, TSMC is building fabs in Germany for partners like Infineon, NXP, and Bosch. Basically, instead of suffering through the entire learning curve yourself and risking failure along the way, you bring in the company that already knows how to run the process. Guys, it's me, and this episode was recorded before the Intel and Terafab partnership was announced. And I think this is super interesting, because these are two completely different philosophies and mentalities, but directionally it's a very smart and pragmatic move, because here Intel brings what Terafab actually lacks: experience in advanced semiconductor manufacturing. They also already have 18A wafers running and they have advanced packaging capability, and now the chances of success for Terafab are actually going up. But even that doesn't remove the core difficulty: starting at one of the most advanced nodes at this scale. This is where a dangerous idea appears.
Does it have to be so complicated? Maybe we are overengineering the factory. And this is where the idea of a dirty fab comes in. >> I think they're getting clean rooms wrong. By the way, in these modern fabs, I'm going to make a bet here. Tesla will have a two-nanometer fab and I can eat a cheeseburger and smoke a cigar in the fab. Oh, come on. >> At first, it sounds like a great idea. Let's simplify it. Let's cut the costs, because currently in a semiconductor fab cleanliness is at extreme levels. We are talking about ultra-pure air, multi-stage filtration systems, and strict particle control. All of this is expensive to build and even more expensive to maintain. And the bigger the fab gets, the harder this becomes. Now imagine doing this at Terafab scale, at 100 million square feet. It's roughly equivalent to 30 TSMC Arizona fabs combined. So keeping that entire volume cleaner than a hospital operating room is insanely hard. So the idea is indeed tempting: let's reduce cleanliness requirements and automate everything with robotics so we can move faster and spend less. At first, this sounds logical. Inside a fab, wafers don't just move around freely. They travel through the factory in sealed containers called FOUPs. These are controlled mini-environments filled with nitrogen to remove oxygen, moisture, and reactive gases, because while wafers are unfinished and moving between steps, even small changes in humidity can trigger unwanted chemical reactions. So if wafers are protected in this box, why does the surrounding environment matter so much? Because wafers spend 70% of the time being processed outside those boxes. They are constantly taken out. Every major step, lithography, etch, deposition, requires the wafer to be exposed. Even though this happens in controlled mini-environments, that exposure is enough. At 2 nanometers, a particle you cannot even see landing on a circuit is like an asteroid hitting your chip. The impact is huge.
One dust particle can destroy thousands of transistors. And even if you remove people as the main source of contamination and replace them with robots, you don't fully eliminate the problem, because a big part of contamination comes from the process itself. These machines are constantly operating. They're being opened and closed, being repaired, and they also release particles. They outgas chemicals. They create micro-contamination every second. Even at parts-per-billion levels, it can change device behavior. That's why the air is filtered so aggressively. At this scale, a couple of particles can kill your yield, and yield is everything. The entire business depends on how many working chips you get per wafer. That's why the system is built in layers: FOUPs, clean room, bunny suits, and extreme air filtering. The idea is that each of these layers reduces risk, and if you remove one, the system becomes unstable, which means more scrapped wafers and very expensive chips, and that breaks the entire economics of building in-house. And if you want to rethink this system, you can't just relax one layer. You have to go back to first principles and rethink the process and the environment together with the toolmakers themselves, companies like ASML, because this is not one variable you tweak. It's a tightly coupled system, and you can automate everything but still can't escape the physics. And that's exactly why building a microchip factory is so hard. And that complexity depends heavily on one question: which kind of factory, which kind of technology are you trying to build inside? Because this defines everything that follows. Terafab is not one factory. Even if we ignore memory, packaging, and testing, at the logic level alone it splits into two. One is the glamorous part, a cutting-edge 2-nanometer logic factory, the one everyone talks about. The other is very different: a specialized factory for space-grade chips.
And if you take that second part seriously, the entire picture shifts, because the moment you optimize for space, you stop optimizing for performance. Not all space is the same: low Earth orbit is already harsh, but deep space is an entirely different story. But even if most of your volume sits in low Earth orbit, you still need reliability at scale, which makes one thing crystal clear: pushing everything to 2 nanometers makes your life way harder, because the more you scale, the more fragile the transistor becomes. And in those tiny dimensions, a single radiation event can take down an entire logic block. So here, instead of pushing forward, you partially step back. This is where silicon on insulator, or SOI, becomes relevant. The idea is simple. You place the transistor on an insulating layer instead of bulk silicon. That makes the device far more stable under radiation, exactly what you want for space systems. And here is the interesting part. SOI is not an entirely different world. It's actually relatively close to the modern semiconductor process, to CMOS. Yes, it adds some cost, about 20% at the equipment level, but at the wafer level the increase is just single digits, and you are not rebuilding the factory, you're just extending it. Another option could be to stay with CMOS, the standard process, and push all this protection, all this radiation hardening, to the package. But this quickly becomes expensive, and it doesn't solve everything at the device level. So this becomes a strategic choice. Where do you solve the problem: in the transistor itself or later in the stack? And that choice tells you a lot about what this factory is really trying to do. And at that point, the real story becomes clear. It's all about control. Control over the supply chain, so you're not exposed to geopolitical risks or external bottlenecks.
And that means reducing dependency on Asian and European suppliers and pulling more of the critical technology in-house. But once you take that seriously, you run into a much bigger question. Where do you actually build something like this? When you're choosing a location for an extremely expensive semiconductor fab, you look first of all at seismic stability, access to ultra-pure water, a reliable and massive power supply, and proximity to skilled talent. And suddenly Texas doesn't look crazy at all. If this were happening in some random location with no semiconductor ecosystem, the skepticism would be justified. But Texas already has this valuable foundation. Samsung already has its Taylor factory right next to Tesla's Gigafactory, and Texas Instruments has major factories across that region. That means an existing supply chain, talent, and operational know-how are already there, and Tesla's Gigafactory sits right in the middle of it, which means tighter integration across AI chips, robotics, and manufacturing. On top of that, Texas offers low taxes, aggressive incentives, and the CHIPS Act. At this scale, location becomes your leverage, because it's not only about getting the machines. You have to build the entire supply chain. You need raw silicon wafers, specialty chemicals, ultra-pure gases like neon and helium, and complex logistics. And building all of that from scratch is extremely hard. But in Texas, much of this ecosystem already exists, so you're basically not starting from scratch. And this goes far beyond Tesla. If the US wants to rebuild semiconductor capability, this is how it starts. And this fab becomes strategically important. Right now over 90% of advanced chips are made in Asia, so bringing even part of that back changes the game. It reduces risk, secures supply, and makes broader institutional support far more likely. And this strategic weight is what makes this story so compelling, because economically it works. Technically it's a monster.
Yes, the timeline is overly optimistic, and the $25 billion figure is just a starting ticket. Once you add multiple fabs, packaging, and continuous upgrades, you're pushing towards trillion-dollar territory. And this is where it gets interesting, because in this pre-IPO context, this move is powerful. Investors look differently at a company that depends on suppliers and a company that has top-notch semiconductor manufacturing in-house. Just imagine: not just launching rockets, not just designing satellites, but manufacturing critical silicon too. That will affect valuation, and that might be enough to justify a meaningful part of the Terafab investment. And this is a very Musk-like move: take something insanely expensive and turn it in a way that it helps to finance itself. Interestingly, there is another part of the story, because this chip factory turns Tesla and SpaceX into something very specific: a company that is extremely vertically integrated. This is how the old giants were built, like Philips and Intel. After years of disaggregation, this is a shift back towards vertical integration. And here we see again that the industry moves in cycles, from integration to specialization and back to extreme integration. And that sounds powerful, but it's a very risky move, because suddenly you get into one of the most capital-intensive, unforgiving industries on Earth. And the biggest risk is utilization. Basically, your new problem is to keep a factory the size of a city full and running 24/7, every single day. A chip factory only makes money when it's full. If your demand drops even slightly, your costs don't go down. Your depreciation is still there, and suddenly your margins collapse. This is exactly what happened to Intel at some point. They were leading, and then they slipped on just one process node, demand shifted, and they were stuck carrying the full weight of their manufacturing, while companies like Nvidia and Apple stayed fabless, flexible, asset-light, and fast.
So yes, becoming extremely vertically integrated gives you a lot of control, but it also comes at a cost. It removes your safety net. So this move is either genius or a very expensive trap, and this remains an open question. I'm genuinely curious how it goes, and once it runs, I would love to see it with my own eyes. And if you can help me make it happen, hit me up. From my side, I promise I won't contaminate the fab. If you enjoyed this episode, remember to subscribe to the channel and share this episode with someone who still thinks Terafab is a crazy idea. And now watch this episode on the most advanced microchip factory on Earth, built in the desert, or this one on the killer of space data centers, and I will see you there. Love you guys.