Happy New Year!
I forgot to post the November updates, so it’s too late for those, but here are the things that caught my eye in AI this December:
Google announced the Gemini 2.0 Flash Thinking model, a reasoning model akin to OpenAI’s o series, and it has been showing very promising results. Google also released the Veo 2 model, which generates videos that look even better than Sora’s.
Meanwhile, Pika launched 2.0 in parallel. Both of these video models look amazing, and I think video generation could be the most overlooked space in terms of potential disruption.
Perplexity acquired Carbon to let it search work files, which means it will probably compete directly with Glean very soon.
Nvidia released a mini dev kit called the Jetson Orin Nano Super Dev Kit, aimed at students and hobbyists who want to run inference in the palm of their hand. Think of it as a supercharged Raspberry Pi. With 8GB of memory, you can prototype small AI applications, and for $250 you could probably run a (quantized) 7B model!
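To give a rough sense of what “running a 7B model in 8GB” looks like in practice, here’s a minimal sketch using the llama-cpp-python bindings with a 4-bit quantized GGUF model. The model filename, settings, and prompt are my own placeholders, not anything that ships with the kit:

```python
# Minimal sketch: running a 4-bit quantized 7B model with llama-cpp-python.
# Assumes llama-cpp-python is installed (ideally built with CUDA support for
# the Orin's GPU) and a GGUF model file has been downloaded -- the path below
# is a placeholder, not an official Nvidia or kit-provided artifact.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=2048,        # modest context window to stay within 8GB
    n_gpu_layers=-1,   # offload all layers to the GPU if they fit
)

out = llm("Explain what a Jetson Orin Nano is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

A 4-bit 7B model weighs in around 4GB on disk, which is why quantization is what makes the 8GB budget workable.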
Despite Amazon’s investment in Anthropic, Amazon released its own family of models called Nova. It’s interesting that they are hedging their bets on Anthropic, and that these models will likely run on their new Trainium chips. Oh, and I believe the name “Olympus” was scrapped in favor of Nova.
Has China seriously entered the game? DeepSeek released DeepSeek V3, a 671B mixture-of-experts (MoE) model with only 37B parameters activated per token (which makes each query cheaper to run and the model easier to serve across distributed GPUs). Some of the stats are pretty wild: it beats GPT-4o and even Claude 3.5 on many benchmarks. In coding it seems to slightly edge both of those premier models, and in math it crushes them. The fact that it is open source means anyone can try to run it, and in theory it should be cheaper than the premier models; the catch is that deployment looks pretty tricky, and there are some data-contamination concerns as well. Chinese researchers also released an o1-like reasoning model called LLaVA-o1, much as DeepSeek released its R1 reasoning model the previous month. Between DeepSeek, Qwen, and LLaVA, the AI industry should start taking China more seriously. Are they really only using H800s…?
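For readers wondering why “671B total but 37B activated” matters: in a mixture-of-experts layer, a small router picks a few experts per token, so only a fraction of the weights participate in each forward pass. Here’s a toy PyTorch sketch of top-k routing; the sizes, names, and top-2 choice are purely illustrative, not DeepSeek V3’s actual architecture:

```python
# Toy mixture-of-experts layer with top-2 routing (illustrative only;
# DeepSeek V3 uses many more, finer-grained experts than this).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                       # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)  # keep only top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # only the chosen experts ever run
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```

The total parameter count grows with the number of experts, but each token only pays for the few experts the router selects, which is the trick behind 671B total versus 37B active.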
Meta released Llama 3.3, a 70B model that performs about as well as their flagship 405B model. That makes it easier to run (even on a single GPU) and easier to scale. Mark Zuckerberg also announced a 2GW+ data center for Llama 4.
OpenAI released benchmarks for its o3 model. If you recall, in past newsletters we talked about how the o series increases inference-time reasoning (meaning after your prompt, the model spends time “thinking”, trying different things out before answering). That is great for getting better results, but it also requires a lot more compute. OpenAI will spend a lot of money running o3, but the initial results suggest it could be the best model yet for programming and other reasoning tasks. Earlier in the month, OpenAI announced a new $200-a-month subscription, specifically because of how intensive o1 is to operate. OpenAI also ran a “12 days of AI” series of announcements, which included a voice service for ChatGPT (watch out, call centers), access to apps, and Sora updates.
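If you want to see the “paying for thinking” trade-off yourself, one quick way is to send the same prompt to a reasoning model and a regular model and compare latency and token usage. A minimal sketch with the OpenAI Python SDK, assuming you have API access to a reasoning model such as o1 (o3 itself wasn’t publicly available at the time):

```python
# Minimal sketch: compare a reasoning model ("o1") with a standard model ("gpt-4o")
# on the same prompt. Assumes OPENAI_API_KEY is set and you have access to both
# models; inspect resp.usage to see how many extra tokens the "thinking" costs.
import time
from openai import OpenAI

client = OpenAI()
prompt = ("A bat and a ball cost $1.10 total; the bat costs $1.00 more than "
          "the ball. How much does the ball cost?")

for model in ("gpt-4o", "o1"):
    start = time.time()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"{model}: {time.time() - start:.1f}s")
    print("  answer:", resp.choices[0].message.content.strip()[:80])
    print("  usage:", resp.usage)  # reasoning models bill additional hidden tokens
```

The reasoning model will typically take noticeably longer and consume many more completion tokens for the same question, which is exactly the compute cost the $200 tier is meant to cover.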
Meanwhile, OpenAI also announced it is converting to a for-profit entity and laid out its complex plan for doing so. This prompted both Elon Musk and Meta to take legal action to slow them down.
And that’s a wrap for December 2024. Let’s make 2025 an even more exciting year!