20 Comments
Haydar

Thank you for this insightful and visually engaging guide on quantization, Maarten—it's a fantastic resource!

Jongin Choi

I noticed a small typo and wanted to let you know! Thank you for the great article—I really appreciate it.

Original:

In practice, we do not need to map the entire FP32 range [-3.4e38, 3.4e38] into INT8. We merely need to find a way to map the range of our data (the model’s parameters) into IN8.

Correction:

In practice, we do not need to map the entire FP32 range [-3.4e38, 3.4e38] into INT8. We merely need to find a way to map the range of our data (the model’s parameters) into INT8.

Thanks again for sharing this insightful piece! 😊
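
For readers following along, the quoted passage can be sketched in a few lines. This is a minimal illustration of one common way to map only the data's own range into INT8, absmax (symmetric) quantization. It is not necessarily the exact scheme the article uses, and the `weights` values and function names are made up for the example.

```python
def absmax_quantize(values, bits=8):
    """Map floats into signed integers using the largest absolute value as the range."""
    qmax = 2 ** (bits - 1) - 1           # 127 for INT8
    alpha = max(abs(v) for v in values)  # range of the data, not of FP32
    scale = qmax / alpha if alpha else 1.0
    quantized = [round(v * scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the stored integers back to approximate float values."""
    return [q / scale for q in quantized]


# Hypothetical model parameters, chosen only for illustration.
weights = [-2.0, -0.5, 0.0, 1.2]
q, scale = absmax_quantize(weights)
restored = dequantize(q, scale)

print(q)         # integers within [-127, 127]
print(restored)  # floats close to, but not always equal to, the originals
```

The scale depends only on the largest magnitude in `weights`, which is why the full FP32 range never has to be covered.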

Romee Panchal

Wow! What a beautiful and insightful article.

Eddy Giusepe

Thank you very much, Maarten Grootendorst!

Your explanation is very good 🤗!

Tinias

Cool! How do you create these visuals and how much time did it take you? They look very nice!

Jeevesh Juneja

In the third figure of the GPT-Q section, how does 0.5 get quantized to 0.33? If we are quantizing to INT4, shouldn't the output be an integer?

Thanks for the amazing article btw! <3

Yan Yong

Very informative about quantization. I have bookmarked it. Thank you for your time and effort!

Oleksandr Rechynskyi

Wow, this is a really extensive article. Thanks!

Jiwoo Park

Hello. Thank you for the excellent material.

In the `Common Data Types` section of Part 2, the mantissa part in the `BF16` illustration is shown as `1001000`. I'm curious as to why it isn't 1001001.

Valeriu Ohan

You are correct. The BF16 mantissa should be 1001001. Otherwise the encoded value is 3.125.
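
The arithmetic behind the two mantissas can be checked with a short sketch. It assumes a sign bit of `0` and exponent bits of `10000000`; neither is stated in this thread, but both are consistent with the 3.125 figure above. The `decode_bf16` helper is a made-up name for illustration and handles normal numbers only (no subnormals, infinities, or NaN).

```python
def decode_bf16(sign, exponent, mantissa):
    """Decode a normal BF16 value from bit strings (1 sign, 8 exponent, 7 mantissa bits)."""
    s = int(sign, 2)
    e = int(exponent, 2)
    m = int(mantissa, 2)
    fraction = 1 + m / 2 ** len(mantissa)  # implicit leading 1 plus mantissa bits
    return (-1) ** s * 2.0 ** (e - 127) * fraction


# Sign and exponent bits are assumptions, see the note above.
as_illustrated = decode_bf16("0", "10000000", "1001000")
as_corrected = decode_bf16("0", "10000000", "1001001")

print(as_illustrated)  # 3.125
print(as_corrected)    # 3.140625
```

The last mantissa bit contributes 2^-7 to the fraction, which after scaling by 2 accounts for the 0.015625 difference between the two values.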

Sandeep

I wish papers published on arXiv were as accessible as this, even for a subscription. The two-column PDF is too rigid to read on any screen.

Chuanliang Jiang

Excellent blog. One quick question: what is the benefit of upgrading to a paid membership?

Maarten Grootendorst

Thank you for reminding me that I should make a dedicated page for this!

To give part of the answer: at the moment there is no benefit to a paid membership other than supporting me in creating these guides and keeping them freely available to everyone. Researching the topics and creating the visuals takes a long time, so any support would be highly appreciated.

I want to keep this content available to those who do not have the funds. I might change this in the future and keep part of the content freely available with a part paid, but I prefer to keep it all free.

Seeing as this is merely a passion project thus far, I feel very hesitant to actively ask people to support me. In all honesty, just reading "Excellent blog" already makes me happy ;)

The Humans In The Loop

These are absolutely excellent visualizations. Thank you for sharing!

kevin

best!

Hongwei

Thank you so much for the visual explanations. I benefited a lot from this.

Rico Pagliuca

Incredible, thanks for this! I have read about all of these and have them mostly well understood but for me the visual aids are /incredibly/ assistive. Really appreciate it! Sharing widely.

<entitled> Addendum for HQQ when? </entitled> *ducks* :p

[ https://github.com/mobiusml/hqq ]

Chase Hasbrouck

Maarten,

This is amazing. Would love to know the tooling/process you use to build visualizations, as it's something I've struggled to do well/productively in my own writing.

Maarten Grootendorst

Thank you! There isn't a "secret sauce" to these kinds of visualizations.

Tool-wise, I started out with PowerPoint and only switched to Figma because I wanted .svg output, as .png files are quite large to host on websites. That said, the tool itself is completely unimportant to the process, as most visuals are actually nothing more than basic shapes (circles, squares, arrows, etc.).

The process is a bit more difficult since I draw a lot from my background as a psychologist. Most importantly, think about images you've seen before that explain a given concept well. Then, think about why that image works so well. If you can identify what makes a given image good or bad (ideally also for a wider audience, though that is a bit trickier), then you are already halfway there, especially if you can do the same for your own visuals.

Chase Hasbrouck

Sounds good. I think my main stumbling block has been jumping ahead to create an image without taking the time to think about what image I actually want.
