Books
Welcome! I have written various books together with Jay Alammar detailing the inner workings of AI that might interest you.
Writing has been part of my career “identity” for a while now. Writing in public, especially, is a great tool for me to learn and understand these technologies in as much detail as necessary!
An Illustrated Guide to AI Agents
I am thrilled to introduce An Illustrated Guide to AI Agents, the book I am currently writing!
We are including an amazing number of images: there are already more than 300 custom-made images across the 8 completed chapters, with 4 more chapters to go!
This book is for those interested in understanding how AI Agents work and how to go from LLMs all the way to autonomous systems. We cover memory, tools, MCP, context engineering, and much more!
Hands-On Large Language Models
Hands-On Large Language Models is a book I wrote showing the inner workings of Large Language Models. We were fortunate enough to see it become one of O’Reilly’s best-selling books!
With the incredible pace of LLM development, learning about these techniques can be overwhelming.
Throughout this book, we take an *intuition-first* approach through visual storytelling, with almost **300 custom-made images** in the final release.
This book is for those interested in this exciting field. Whether you are a beginner or more advanced, we believe there is something to be found for everyone!
All of the code is freely available on GitHub, making it easy for you to get started with the inner workings of LLMs.
图解DeepSeek技术
图解DeepSeek技术 (Illustrated DeepSeek) is a book I wrote that goes through how various DeepSeek models work, in particular DeepSeek-R1. This was a special collaboration with Turing Education, which allowed us to create the book specifically for a Chinese audience in celebration of the wonderful release of DeepSeek-R1!
As always, we have hundreds of visuals scattered throughout the book exploring Mixture of Experts, reasoning models, and various attention mechanisms in the DeepSeek family of models.
This book is for those who want to dive deeper into the various technologies that made open-weight Large Language Models competitive with proprietary models.




