# DDPM Project

This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM).

## Table of Contents
- [Introduction](#introduction)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)

## Introduction
Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that learn to generate data by reversing a diffusion process. This repository provides a comprehensive implementation of DDPM.
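
As a quick illustration of the idea, the forward process adds Gaussian noise over `T` steps, and a closed-form expression lets us jump directly to any noised step; the network is then trained to predict that noise so the process can be reversed. The snippet below is a minimal sketch only; the schedule length, beta range, and variable names are illustrative assumptions, not this repository's exact code.

```python
import torch

# Minimal sketch of the DDPM forward (noising) process.
# T and the linear beta schedule are illustrative assumptions.
T = 300
betas = torch.linspace(1e-4, 0.02, T)        # noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # cumulative products of alphas

def q_sample(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    if noise is None:
        noise = torch.randn_like(x0)
    sqrt_ab = alpha_bars[t].sqrt().view(-1, 1, 1, 1)
    sqrt_one_minus_ab = (1.0 - alpha_bars[t]).sqrt().view(-1, 1, 1, 1)
    return sqrt_ab * x0 + sqrt_one_minus_ab * noise

# During training, the network sees x_t and t and is optimized to
# predict the `noise` that was added here.
```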

## Installation
To install the necessary dependencies, run:
```bash
pip install -r requirements.txt
```

## Usage
To train the model, use the following command:
```bash
python train.py
```
To generate samples, use:
```bash
python generate.py
```

## Game
To help understand the model and its workings, we are building a cute little game in which the user plays the role of the UNet/reverse diffusion model and is tasked with denoising images whose noise is made of grids of lines.

Use [learndiffusion.vercel.app](https://learndiffusion.vercel.app) to access the primitive version of the game. You can also contribute to the game by checking out the diffusion_game branch. A model showcase will also be added, where the model's weights are downloaded from the internet and the model files are loaded into a Gradio interface for direct use/inference on Vercel. Feel free to make changes for this; an issue is open.



## Explanations and Mathematics

- Slides from the presentation :

- Notes/explanations : [HERE](slides/notes)

- A cute lab talk ppt :

- Plato's allegory : \<link to REPUBLIC>



## Resources

- Original Paper : https://arxiv.org/pdf/2006.11239

- Improvement Paper : https://arxiv.org/abs/2102.09672

- Improvement by OpenAI : https://arxiv.org/pdf/2105.05233

- Stable Diffusion Paper : https://arxiv.org/abs/2112.10752




### Papers for background

- UNET Paper for Biomedical Segmentation

- Autoencoder

- Variational Autoencoder

- Markov Hierarchical VAE

- Introductory Lectures on Diffusion Process



### Youtube videos and courses

#### Mathematics

- Outliers

- Omar Jahil



#### Pytorch Implementation

- [Deep Findr](https://www.youtube.com/watch?v=a4Yfz2FxXiY)

- [Notebook from Deep Findr](https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing)



## Pretrained Weights

Weights for the model can be found in [pretrained_weights](https://drive.google.com/drive/folders/1NiQDI3e67I9FITVnrzNPP2Az0LABRpic?usp=sharing).



For loading the pretrained weights:

```python
import torch

# SimpleUnet is the UNet class defined in this repository's code;
# import or define it before running this snippet.
model2 = SimpleUnet()

model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth"))
model2.eval()
```



For making inferences:

TODO: There are errors in the sampling function (boolean errors, etc.). Issues will be opened so others can solve them as an exercise if needed.

```python
import torch
from torchvision.utils import save_image

num_samples = 8           # Number of images to generate
image_size = (3, 32, 32)  # Example for CIFAR10
noise = torch.randn(num_samples, *image_size).to("cuda")

model2.to("cuda")

# Generate images by denoising
with torch.no_grad():
    generated_images = model2.sample(noise)

# Save the generated images in a 4-per-row grid
save_image(generated_images, "generated_images.png", nrow=4, normalize=True)
```
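
For orientation, a typical DDPM reverse-process loop looks roughly like the sketch below (Algorithm 2 of the original paper). This is an illustrative assumption, not this repository's `sample` implementation: the model signature `model(x, t)` and the reuse of the training-time `betas` schedule are assumed.

```python
import torch

@torch.no_grad()
def sample_ddpm(model, shape, betas, device="cuda"):
    """Illustrative DDPM reverse process; not this repo's exact sampler."""
    betas = betas.to(device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape, device=device)    # start from pure noise x_T
    for t in reversed(range(len(betas))):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = model(x, t_batch)               # predicted noise (assumed signature)
        coef = (1.0 - alphas[t]) / (1.0 - alpha_bars[t]).sqrt()
        mean = (x - coef * eps) / alphas[t].sqrt()
        if t > 0:
            x = mean + betas[t].sqrt() * torch.randn_like(x)   # add noise for t > 0
        else:
            x = mean
    return x
```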





## Contributing

Contributions are welcome! Please open an issue or submit a pull request.





## Future Ideas

- Make the model ONNX-compatible for training and inference on Intel GPUs (a rough export sketch follows this list)

- Build a Stable Diffusion-style Text2Img model using a CLIP implementation

- Train the current model on a much larger dataset with more generalizations and nuances
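
For the ONNX idea above, an inference-only export could start from something like the sketch below. It assumes the UNet's forward pass takes a noisy image and a timestep, relies on the `model2` loaded earlier, and the file name, input shape, and opset version are placeholder choices.

```python
import torch

# Hypothetical ONNX export for inference only; names and shapes are assumptions.
model2.eval()
dummy_image = torch.randn(1, 3, 32, 32)          # CIFAR10-sized example input
dummy_t = torch.tensor([0], dtype=torch.long)    # single example timestep
torch.onnx.export(
    model2.cpu(),
    (dummy_image, dummy_t),
    "ddpm_unet.onnx",
    input_names=["noisy_image", "timestep"],
    output_names=["predicted_noise"],
    opset_version=17,
)
```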