391: Nvidia Q4, David Senra, Tiktok Spin?, Google on LLM Costs, Intel, Geothermal, and Apple's Blood Glucose Sensor
“Your favorite thing is out there and you haven’t found it yet”
To know how to criticize is good, to know how to create is better. —Henri Poincaré
🚨🗣️🗣️🗣️🎙️🎧 I had a great time talking to my friends Jim O’Shaughnessy (💚 🥃 ♾️) and David Senra (📚🎙️), and the conversation happened to be recorded, so you can join us through the magic of AirPods and MP3s:
The way David tells the Sam Zell story is just *chef’s kiss*
I’ve also got good feedback on a line I said, and the more I think about it, the more I like how it compresses the reasons why most people should explore and experiment:
“Your favorite thing is out there and you haven’t found it yet”
I also got feedback on this one:
“We need dream generators. You put the dream in people’s heads, and *then* they will act.”
In other words, you should probably watch Deadwood if you haven’t seen it yet 🤠📺
🥋👦🏻🔗 My oldest kid has been doing Jiu-Jitsu for a month now.
He was telling me about getting his first stripe degree soon, and I thought it was a good opportunity to discuss the difference between the thing and the marker of the thing.
I asked him what it would mean if they gave him a black belt. Would that make him better all of a sudden?
Would your sensei be just as good if he was wearing a white belt?
So it’s not about the belt, it’s about the work and skills, and the belt is just a marker, it just represents something else, right?
We talked about how this applies to many things.
Is it better to be a champion with no medals, or to have a closet full of trophies and medals that you found in a cardboard box in the woods?
I abstracted it further, thinking of the Feynman anecdote about walking in the woods with his dad, being taught that there’s a difference between knowing the name of something and knowing the thing itself.
Say you know every detail about a tree: the shape of the leaves, the texture of the bark, how tall it gets, how deep the roots go, where in the world it can be found, whether it produces fruit or nuts or seeds, what birds and animals build nests in it, what insects it is vulnerable to, etc.
If you know all this about a tree but don’t know the name of the tree, do you still know the tree?
And if you know the name of something, but almost nothing else about it, do you know more than the person who doesn’t know the name but knows the rest?
It was a fun vein to explore — I recommend this type of convo if you have kids, I think it provides an important foundational lens through which to understand what learning is (meta-learning!).
✂️🎼 📺 🎞️ 📖 🎮 🎭 🤹🏻‍♀️ If a piece of content is *really* good, it can’t be too long.
If it’s *bad*, it can’t be short enough.
That’s why you should worry more about making something really good than about sticking to some arbitrary length/format.
🏦 💰 Liberty Capital 💳 💴
🕵️‍♂️ My Thoughts on Nvidia Q4 🔍🤖
With the frenzy over AI lately, I want to take a closer look at the picks & shovels vendor in the space.
I previously covered how they had a bunch of cycles inflecting down at the same time (gaming, PC sales, the China chip ban, crypto-mining, being early in a new-generation ramp, etc.), but if there’s one that is a tailwind, it’s definitely the hunger for compute from AI.
Here are some highlights from the most recent investor call:
Q4 revenue was $6.05 billion, up 2% sequentially, and down 21% year-on-year. Full year revenue was $27 billion, flat from the prior year.
If you look inside those aggregate numbers, you find that the data-center segment is holding down the fort while gaming is still really struggling, not so much because there’s no demand, but because the pandemic distorted demand so much that Nvidia became too optimistic and ended up with *way too much* inventory and fabbing capacity at TSMC.
Hyperscale customer revenue posted strong sequential growth, though short of our expectations as some cloud service providers paused at the end of the year to recalibrate their build plans.
Though we generally see tightening that reflects overall macroeconomic uncertainty, we believe this is a timing issue [and] the end market demand for GPUs and AI infrastructure is strong.
On the H100 ramp-up, their new generation of chips, based on the Hopper architecture:
Adoption of our new flagship H100 Data Center GPU is strong. In just the second quarter of its ramp, H100 revenue was already much higher than that of A100, which declined sequentially.
This is a testament of the exceptional performance on the H100, which is as much as 9x faster than the A100 for training and up to 30x faster in inferencing of transformer-based large language models. The transformer engine of H100 arrived just in time to serve the development and scale out of inference of large language models. [...]
Generative AI foundation model sizes continue to grow at exponential rates
Their timing was good on this one (if not on some other things).
To have chips optimized for transformer models, which are all the rage now, Nvidia had to start designing these chips a few years ago, before the current wave of generative AI and LLMs, and it certainly looks like they bet on the right horse 🐎
This last bit about the exponential rate of growth in the size of models is not just a throwaway, it’s important.
We have a really bad intuitive sense of the exponential. These things seem huge now, but they’re about to get truly massive.
Combine that with the fact that Moore’s Law is slowing down and that it is becoming ever more expensive and difficult to shrink silicon: if models keep growing faster than chips improve, the gap has to be filled with more chips, so demand for the absolute number of chips could grow quite a bit.
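To make that gap concrete, here’s a back-of-the-envelope sketch in Python. The two growth rates are made-up assumptions purely for illustration (they are not figures from Nvidia or anyone else), and it assumes that compute needs scale in line with model size, which is a simplification:

```python
def relative_chip_demand(model_growth_per_year, chip_gain_per_year, years):
    """Chips needed relative to today, assuming compute needs scale with model size."""
    compute_needed = model_growth_per_year ** years  # how much more compute is needed
    per_chip_perf = chip_gain_per_year ** years      # how much faster each chip has become
    return compute_needed / per_chip_perf


# Hypothetical rates, chosen only to illustrate the compounding gap.
MODEL_GROWTH = 4.0  # assumed: models need 4x more compute each year
CHIP_GAIN = 1.5     # assumed: each chip generation-year is 1.5x faster

for year in range(1, 6):
    ratio = relative_chip_demand(MODEL_GROWTH, CHIP_GAIN, year)
    print(f"Year {year}: ~{ratio:,.1f}x as many chips as today")
```

Whatever the real rates turn out to be, the point is that the ratio between the two compounds too, so even a modest gap between them gets large within a few years.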
AI adoption is at an inflection point.
OpenAI's ChatGPT has captured interest worldwide, allowing people to experience AI firsthand and showing what's possible with Generative AI. These new types of neural network models can improve productivity in a wide range of tasks [...] Generative AI applications will help almost every industry do more faster.
Generative large language models with over 100 billion parameters are the most advanced neural networks in today's world
Is it even possible to have a conference call in Q4 2023 without mentioning ChatGPT?
But at least in this case, they’re not just opportunistically using buzzwords: Nvidia has been talking about AI for 10 years and it’s the rest of the world that caught up to them…
We remain focused on expanding our software and services.
We released version 3.0 of NVIDIA AI enterprise with support for more than 50 NVIDIA AI frameworks and pretrained models and new workflows for contact center, intelligent virtual assistance, audio transcription and cybersecurity. Upcoming offerings include our NeMo and BioNeMo large language model services, which are currently in early access with customers.
At this point, it’s a cliché to say that Nvidia has more software engineers than hardware engineers.
And yet, even after years of hearing this, all that most people pay attention to are things like the new 4090 or H100, so I guess it bears repeating:
One of the main reasons why customers pick Nvidia and remain loyal to them is that there’s a big software ecosystem AND there are a lot of people with experience developing for it, so if you have trouble doing something, you can usually find answers, and you can hire people who have already worked in this ecosystem.
Some of the less popular accelerated compute solutions out there may have a lower sticker price point, but you may also be getting less for your money if you take the value of the whole ecosystem into account.
Today, we are announcing the NVIDIA DGX Cloud, the fastest and easiest way to have your own DGX AI supercomputer, just open your browser. NVIDIA DGX Cloud is already available through Oracle Cloud Infrastructure and Microsoft Azure, Google GCP and others on the way. [...]
we believe that by hosting everything in the cloud, from the infrastructure through the operating system software, all the way through pretrained models, we can accelerate the adoption of Generative AI in enterprises
AI supercomputer as a service?
On gaming 🎮:
Gaming revenue of $1.83 billion was up 16% sequentially and down 46% from a year ago. Fiscal year revenue of $9.07 billion is down 27%. Sequential growth was driven by the strong reception of our 40 Series GeForce RTX GPUs based on the Ada Lovelace architecture.
The year-on-year decline reflects the impact of channel inventory correction, which is largely behind us.
When you’re down 46% but up 16% sequentially, you know that the recent past was wild.
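You can back out the comparison quarters implied by those two percentages. A quick sketch (the function name is mine, and since the reported percentages are rounded, the results are approximations):

```python
def implied_base(current, pct_change):
    """Revenue in the comparison period implied by a reported percentage change."""
    return current / (1 + pct_change / 100)


gaming_q4 = 1.83  # $ billions, from the call

prior_quarter = implied_base(gaming_q4, 16)   # up 16% sequentially
year_ago = implied_base(gaming_q4, -46)       # down 46% year-on-year

print(f"Implied prior quarter:    ${prior_quarter:.2f}B")  # $1.58B
print(f"Implied year-ago quarter: ${year_ago:.2f}B")       # $3.39B
```

In other words, gaming went from roughly $3.4 billion a year ago to about $1.6 billion last quarter, which means it more than halved before this bounce.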
It’ll be interesting to see if the inventory correction truly is behind them and if gaming starts to look better going forward.
It used to be the biggest segment, and I’m curious to see if it can catch back up to data-center or if it stays #2 for the foreseeable future 🤔
Our GeForce NOW cloud gaming service continued to expand in multiple dimensions, users, titles and performance. It now has more than 25 million members in over 100 countries. Last month, it enabled RTX 4080 graphics horsepower in the new high-performance ultimate membership tier. Ultimate members can stream at up to 240 frames per second from a cloud with full ray tracing and DLSS 3.
25 million members is starting to be legit. This is not Google Stadia.
Nvidia made a 10-year deal with Microsoft giving them assurances that if Microsoft succeeds in acquiring Activision, GeForce NOW will have access to Microsoft games like Minecraft, Halo, and Flight Simulator, as well as Activision’s big hits like Call of Duty and Overwatch.
At CES, we announced a strategic partnership with Foxconn to develop automated and autonomous vehicle platforms.
This partnership will provide scale for volume manufacturing to meet growing demand for the NVIDIA DRIVE platform. Foxconn will use NVIDIA DRIVE, Hyperion compute and sensor architecture for its electric vehicles.
📱🔌🚘 Foxconn: We don’t just make iPhones!
Here’s Jensen on why AI needed accelerated computing vs general purpose computing (aka CPUs):
Large language models are called large because they are quite large. However, remember that we've accelerated and advanced AI processing by a million x over the last decade. Moore's Law, in its best days, would have delivered 100x in a decade.
By coming up with new processors, new systems, new interconnects, new frameworks and algorithms and working with data scientists, AI researchers on new models, across that entire span, we've made large language model processing a million times faster, a million times faster.
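To get a feel for what those decade-long multiples mean year to year, here’s a small sketch that converts them into annual growth factors and doubling times. The 100x and 1,000,000x figures are Jensen’s; the function names are mine, and it assumes smooth compounding over the ten years:

```python
import math


def annual_factor(total_gain, years):
    """Constant yearly multiplier that compounds to total_gain over the period."""
    return total_gain ** (1 / years)


def doubling_time_months(total_gain, years):
    """Months needed to double at that constant compounding rate."""
    return 12 * years / math.log2(total_gain)


for label, gain in [("Moore's Law at its best", 100), ("Claimed AI speedup", 1_000_000)]:
    factor = annual_factor(gain, 10)
    months = doubling_time_months(gain, 10)
    print(f"{label}: {factor:.2f}x per year, doubling every {months:.1f} months")
```

That works out to about 1.58x per year (doubling every 18 months or so) for Moore’s Law versus about 3.98x per year (doubling roughly every 6 months) for the million-x claim.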
How many large foundational models will be required? 🔬🤔
the number of large language models or foundation models that have to be developed is quite large. Different countries with different cultures and its body of knowledge are different. Different fields, different domains, whether it's imaging or it's biology or it's physics, each one of them need their own domain of foundation models. [...]
the number of companies in the world have their own proprietary data. The most valuable data in the world are proprietary. And they belong to the company. It's inside their company. It will never leave the company.
And that body of data will also be harnessed to train new AI models for the very first time.
On ChatGPT and human language as a programming language:
the surprising capability of a single AI model that can perform tasks and skills that it was never trained to do. And for this language model to not just speak English, or can translate, of course, but not just speak human language, it can be prompted in human language, but output Python, output COBOL, a language that very few people even remember, output Python for Blender, a 3D program. So it's a program that writes a program for another program.
The world now realizes that maybe human language is a perfectly good computer programming language, and that we've democratized computer programming for everyone, almost anyone who could explain in human language a particular task to be performed.
When I say new era of computing, this new computing platform, this new computer could take whatever your prompt is, whatever your human-explained request is, and translate it to a sequence of instructions that you process it directly or it waits for you to decide whether you want to process it or not.
And so this type of computer is utterly revolutionary in its application because it's democratized programming to so many people really has excited enterprises all over the world. [...]
everybody who develops software is either alerted or shocked into alert or actively working on something that is like ChatGPT to be integrated into their application or integrated into their service. And so this is, as you can imagine, utterly worldwide.
The punchline? Here we go: