DeepSeek Has Big Tech Sweating, But Investors Should Stay Cool

    Ever since Chinese AI startup DeepSeek unveiled its new R1 model – which rivals the best U.S. models like ChatGPT – AI stocks have been on a wild roller coaster ride.

    The news was explosive, sparking fears that companies will pull back on their extreme AI spending. That led chipmaker Nvidia (NVDA) to shed roughly $600 billion in market value on Monday – the largest single-day loss of market value for any company in stock market history.

    Unsurprisingly, there’s been a lot of talk about DeepSeek, Nvidia, and the rest of the AI universe over the past few days. 

    But the main question on most investors’ minds is: How?

    How did a virtually unknown Chinese startup disrupt the entire global AI industry, reportedly training a model that rivals the world’s most advanced for around $5 million?

    Yet, we think the more important question may be: What does it mean? 

    What does DeepSeek’s breakthrough mean for the industry as a whole? Will it mark the end of the big bull run in AI stocks? Or is the recent selloff just an overreaction, making this a very compelling buying opportunity?

    In this issue, we will attempt to answer these questions. And in so doing, we hope to point you in the direction of some future winning stock picks.

    Doing More With Less

    So… how did DeepSeek do it? 

    Based on our research, the startup’s success came via innovative engineering born out of geopolitical necessity. 

    That is, over the past two years, the U.S. approach to building next-gen AI models has been to “throw money at it.” Build more data centers. Buy more GPUs. Hire more engineers to build, train, and advance more models on top of all those GPUs.

    But due to ongoing geopolitical tensions, Chinese AI companies have had to employ a different strategy. Ever since the AI Boom began, the U.S. has enforced increasingly strict export controls on advanced AI chips bound for China, limiting the number of chips Chinese firms can buy. They haven’t been able to employ the “more, more, more” approach that Microsoft (MSFT), Alphabet (GOOGL), Amazon (AMZN), and others have in the past few years.

    Instead, Chinese developers were forced to embrace a “do more with less” mentality. 

    That led DeepSeek to focus on an innovative blend of engineering techniques to create a super-efficient AI model. 

    Now, I won’t pretend to understand these techniques at a granular level. While I am familiar with different AI model architectures, I am not a world-class developer myself.

    However, I have studied this topic enough to have a general understanding of what made DeepSeek’s model tick. And spoiler alert: it’s pretty neat. 

    An Innovative Architecture Makes DeepSeek Tick

    At the heart of DeepSeek’s breakthrough is something called a Mixture-of-Experts (MoE) architecture. 

    In short, most AI models today are built to be omniscient. They try to be doctors, lawyers, and engineers all rolled into one – experts on a near-infinite number of subjects. When you ask a general model like ChatGPT a question, its entire network “wakes up” to answer, because all of its expert knowledge lives in one monolithic model.

    But DeepSeek employs an MoE architecture. In a sense, it has created a room full of experts, each separate and distinct – a model composed of multiple topic-specific sub-models. So when you ask DeepSeek a question, the only part of the model that “wakes up” is the expert sub-model relevant to your question.

    Thanks to this modular approach, DeepSeek saves an immense amount of computing power, because only part of the model is roused per query. According to DeepSeek’s own numbers, its V3 model contains 671 billion parameters in total. But only about 37 billion of those – roughly 5% – are activated for any given token.
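
    For the technically curious, below is a minimal sketch of the top-k routing idea at the heart of MoE, written in plain Python with NumPy. Everything here – the tiny dimensions, the random “experts,” the router, and the moe_forward function – is an illustrative toy of our own construction, not DeepSeek’s actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes -- real MoE models are vastly larger.
D_MODEL = 8     # token embedding size
N_EXPERTS = 4   # number of expert sub-networks
TOP_K = 2       # experts activated per token

# Each "expert" is just a small weight matrix in this sketch.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]

# The router (gate) scores how relevant each expert is to a token.
router = rng.standard_normal((D_MODEL, N_EXPERTS))

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through only its top-k experts."""
    scores = token @ router               # one relevance score per expert
    top = np.argsort(scores)[-TOP_K:]     # indices of the best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()              # softmax over the chosen experts only
    # Only the selected experts compute anything; the rest stay "asleep,"
    # which is where the compute savings come from.
    return sum(w * (token @ experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(D_MODEL)
print(moe_forward(token))  # output produced by just 2 of the 4 experts
```

    The thing to notice is that, per token, only the top-scoring experts do any work. Scale that same idea up to hundreds of billions of parameters, and the compute savings become enormous.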

    This drastic reduction in activated parameters is partially what has allowed DeepSeek to build a model as good as the leading ones at a reported ~95% lower cost.

    To be sure, DeepSeek isn’t the only company in the world employing an MoE architecture. But through a variety of novel engineering techniques, it appears to be the firm that has refined and scaled it most effectively.

    And that is how a virtually unknown Chinese AI startup disrupted the entire global AI industry.  

    But what does that mean for others in the space? Is this the start of the AI Boom’s own dot-com crash? 

    On the contrary, we actually view DeepSeek’s efficiency breakthrough as great news for the industry – and great news for AI stocks, too. 
