PI #009: Let's Open the Stable Diffusion Black Box
Part 1: How Does a Stable Diffusion Model Learn to Generate Images?
Hello there, I am Paul Iusztin, and within this newsletter, I will deliver your weekly piece of MLE & MLOps wisdom straight to your inbox 🔥
This week we will cover stable diffusion models. More concretely:
- Get an Intuition of How Diffusion Models Work
- How Diffusion Models Generate New Images
Next week, we will release Part 2, where we cover how stable diffusion models are trained and controlled using text prompts.
Before diving into the topic, I want to celebrate that my Full Stack 7-Steps MLOps Framework course reached 410+ stars on GitHub.
I want to thank everyone who starred and read my course. This means a lot to me.
If you haven't read it yet and you are interested in learning how to design, build, deploy, and monitor an end-to-end ML batch system, check it out on GitHub and Medium.
Now, let's get back to our diffusion models.
#1. Get an Intuition of How Diffusion Models Work
We will use a dataset with cats as an example.
Let's say that we want to train a stable diffusion model to generate new cat images.
To do that, we need two steps:
#1. Generate the dataset - add Gaussian noise
We take every image from the dataset and gradually add Gaussian noise to it.
Now, we have multiple images containing various noise levels for every initial cat image.
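Here is a minimal sketch of this noising step. It assumes a DDPM-style linear noise schedule, and the schedule values, the 1000 steps, and the names `make_alpha_bars` and `add_noise` are illustrative assumptions, not something fixed by the explanation above. It uses the closed form `noisy = sqrt(alpha_bar_t) * image + sqrt(1 - alpha_bar_t) * noise`, which jumps straight to any noise level instead of adding noise one step at a time.

```python
import numpy as np


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(image, t, alpha_bars, rng):
    """Return a noisy version of `image` at timestep `t`, plus the noise that was used."""
    noise = rng.standard_normal(image.shape)
    noisy = np.sqrt(alpha_bars[t]) * image + np.sqrt(1.0 - alpha_bars[t]) * noise
    return noisy, noise


rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()

# Stand-in for one cat image scaled to [-1, 1] (random values, not real data).
cat_image = rng.uniform(-1.0, 1.0, size=(3, 64, 64))

# Several noise levels for the same image: light, medium, and almost pure noise.
noisy_versions = {t: add_noise(cat_image, t, alpha_bars, rng)[0] for t in (10, 500, 999)}
```

At a small `t` the cat is still clearly there, while at the last timestep almost nothing of the original image survives.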
#2. Train the model - remove the noise
The actual job of the model is to take a noisy image and remove the noise from it.
Thus, when training the diffusion model:
- it will take a noisy image as input
- it will try to remove the noise
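- the loss is computed between the "cleaned" image & the original clean image (in practice, DDPM-style models predict the noise itself, and the loss compares the predicted noise with the noise that was added)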
There are some missing pieces to the algorithm I explained, but this is the central intuition that will get you started 🔥
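The training step above can be sketched as follows. This is only an illustration under stated assumptions: it uses the noise-prediction form of the loss that DDPM-style models use, a linear noise schedule with assumed values, and a hypothetical `untrained_model` placeholder instead of a real neural network. The "cleaned" image is recovered from the predicted noise so you can see how the two views relate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear noise schedule; alpha_bars[t] is the share of signal left at step t.
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bars = np.cumprod(1.0 - betas)


def training_loss(model, clean_image, rng):
    """Loss for one training step: noise the image, predict the noise, compare."""
    t = int(rng.integers(0, len(alpha_bars)))        # random noise level
    noise = rng.standard_normal(clean_image.shape)   # the noise we add (the target)
    signal_scale = np.sqrt(alpha_bars[t])
    noise_scale = np.sqrt(1.0 - alpha_bars[t])
    noisy_image = signal_scale * clean_image + noise_scale * noise

    predicted_noise = model(noisy_image, t)

    # The "cleaned" image implied by the prediction (for inspection only).
    cleaned_image = (noisy_image - noise_scale * predicted_noise) / signal_scale

    loss = np.mean((predicted_noise - noise) ** 2)   # mean squared error on the noise
    return loss, cleaned_image


def untrained_model(noisy_image, t):
    """Hypothetical placeholder that always predicts 'no noise'."""
    return np.zeros_like(noisy_image)


clean_image = rng.uniform(-1.0, 1.0, size=(3, 32, 32))  # random stand-in, not real data
loss, cleaned_image = training_loss(untrained_model, clean_image, rng)
```

A real setup would replace `untrained_model` with a neural network and minimize this loss with gradient descent over many images and timesteps.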
Ok... that's cool, but how does it generate new images?
Well, the model learned to add cat features to a noisy image.
Thus, you can feed pure Gaussian noise into the model,
... and after it denoises that input over many steps, it will generate a completely new cat for you.
Note that here the process isn't controlled with a prompt as most of you are used to, but I will present that soon.
#2. How Diffusion models generate new images
The process of generating new images with stable diffusion can be explained in 5 simple steps.
The generation of new images is an iterative process (aka the sampling process).
The most naive sampling algorithm is called DDPM.
Here it is:
#1. Sample Gaussian noise with the size of the desired output.
For a 516x516 image, sample noise with a shape of 516x516.
#2. Pass the sample through the diffusion model.
The model returns its estimate of the noise contained in the sample.
#3. Subtract the noise returned by the model from the sample.
#4. Add back some Gaussian noise to stabilize the process.
This is important, as the model was trained on inputs that contain Gaussian noise.
Thus, if you only remove noise from the image, the input won't have the same distribution as the one the model saw during training.
#5. Repeat steps 2-4 for T iterations.
One important note is that all the operations are conditioned on the current timestep t.
Thus, the model returns different results for different timesteps, and you subtract a differently scaled noise from the sample at each step.
Also, one step won't remove all the noise. That is why we need T steps to get high-quality samples.
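The 5 steps above can be sketched as a short loop. Treat this as a rough sketch under stated assumptions, not a reference implementation: the update rule follows the standard DDPM formulation, the linear schedule with 50 steps and the small 32x32 shape are picked only to keep the example fast, and `placeholder_model` is a hypothetical stand-in for a trained network.

```python
import numpy as np


def ddpm_sample(model, shape, betas, rng):
    """Naive DDPM sampling: start from pure noise and denoise for len(betas) steps."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    sample = rng.standard_normal(shape)            # step 1: pure Gaussian noise
    for t in reversed(range(len(betas))):          # step 5: repeat for T iterations
        predicted_noise = model(sample, t)         # step 2: the model predicts the noise
        # Step 3: subtract the predicted noise, scaled for the current timestep.
        noise_coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        sample = (sample - noise_coef * predicted_noise) / np.sqrt(alphas[t])
        if t > 0:
            # Step 4: add back some Gaussian noise (skipped on the final step).
            sample = sample + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return sample


def placeholder_model(sample, t):
    """Hypothetical stand-in for a trained network; it treats the whole input as noise."""
    return sample


betas = np.linspace(1e-4, 0.02, 50)                # assumed schedule, T = 50
image = ddpm_sample(placeholder_model, (3, 32, 32), betas, np.random.default_rng(0))
```

Note how both the model call and the scaling factors depend on `t`, which is what "conditioned on the timestep" means in practice. With the placeholder model the output is not a cat, of course; only a trained network gives meaningful images.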
To conclude…
To generate new images using diffusion models, you have to:
- sample Gaussian noise
- pass it through the model
- subtract the predicted noise from the sample
- add back some Gaussian noise and repeat for T steps
That's it 🔥
These are this week's insights about stable diffusion.
See you next Thursday at 9:00 am CET.
Have a fantastic weekend!
Paul
Whenever you're ready, here is how I can help you:
The Full Stack 7-Steps MLOps Framework: a 7-lesson FREE course that will walk you step-by-step through how to design, implement, train, deploy, and monitor an ML batch system using MLOps good practices. It contains the source code + 2.5 hours of reading & video materials on Medium.
Machine Learning & MLOps Blog: here, I approach in-depth topics about designing and productionizing ML systems using MLOps.