Going beyond autoregression.
When I saw that DiffusionGemma was released, the first thing I googled was "Visual Guide to DiffusionGemma", and you didn't disappoint 🤭
Could you please give a hint as to why a small FFNN is added for self-conditioning?
You can find an answer in this paper: https://arxiv.org/pdf/2510.19304