|
--- |
|
title: Diffusion Models - Complete DDPM Implementation |
|
emoji: 🌊 |
|
colorFrom: purple |
|
colorTo: pink |
|
sdk: pytorch |
|
app_file: "Diffusion Models.ipynb" |
|
pinned: false |
|
license: mit |
|
tags: |
|
- deep-learning |
|
- generative-ai |
|
- pytorch |
|
- diffusion-models |
|
- ddpm |
|
- denoising |
|
- generative-modeling |
|
- computer-vision |
|
- unsupervised-learning |
|
datasets: |
|
- synthetic-2d-data |
|
--- |
|
|
|
# Diffusion Models: Complete DDPM Implementation |
|
|
|
A comprehensive PyTorch implementation of Denoising Diffusion Probabilistic Models (DDPM) with detailed mathematical foundations and educational content. |
|
|
|
## Model Description |
|
|
|
This repository contains a complete implementation of a Denoising Diffusion Probabilistic Model (DDPM) trained on 2D synthetic datasets. The model learns to generate new data points by reversing a gradual noising process, removing noise one step at a time. The implementation serves as both a working model and an educational resource for understanding the mathematics and implementation of diffusion models.
|
|
|
### Architecture Details |
|
|
|
- **Model Type**: Denoising Diffusion Probabilistic Model (DDPM) |
|
- **Framework**: PyTorch |
|
- **Input**: 2D point coordinates |
|
- **Diffusion Steps**: 1000 timesteps |
|
- **Hidden Dimensions**: 256 units with SiLU activations |
|
- **Time Embedding**: 64-dimensional learned timestep representations
|
- **Total Parameters**: ~130K |
|
- **Model Size**: 1.8MB |
|
|
|
### Key Components |
|
|
|
1. **Noise Predictor Network**: Neural network that predicts noise ε_θ(x_t, t) |
|
2. **Forward Diffusion Process**: Gradually adds Gaussian noise over T steps |
|
3. **Reverse Diffusion Process**: Iteratively removes noise to generate samples |
|
4. **Time Embedding Module**: Converts timesteps into feature representations the network can condition on (a minimal sketch follows below)
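Time conditioning is the component newcomers most often ask about. Below is a minimal sketch of one standard choice: a sinusoidal timestep embedding projected through a small MLP to the 64 dimensions used here. This is an illustrative assumption; the notebook's exact module may differ, and the Quick Start sketch later in this card uses an even simpler learned embedding for brevity.

```python
import math

import torch
import torch.nn as nn

class SinusoidalTimeEmbedding(nn.Module):
    """Maps integer timesteps to 64-dim features: fixed sinusoids + a small MLP.

    Illustrative sketch only -- the notebook's module may differ in detail.
    """

    def __init__(self, dim=64):
        super().__init__()
        self.dim = dim
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, t):
        half = self.dim // 2
        # Geometrically spaced frequencies, as in the Transformer/DDPM papers
        freqs = torch.exp(-math.log(10000.0) * torch.arange(half, device=t.device) / half)
        angles = t.float()[:, None] * freqs[None, :]            # (batch, half)
        emb = torch.cat([angles.sin(), angles.cos()], dim=-1)   # (batch, dim)
        return self.mlp(emb)
```

Called with a batch of integer timesteps, this returns a `(batch, 64)` tensor that the noise predictor concatenates with the point coordinates.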
|
|
|
## Training Details |
|
|
|
- **Dataset**: Synthetic 2D point clusters |
|
- **Diffusion Steps**: 1000 |
|
- **Beta Schedule**: Linear (0.0001 to 0.02) |
|
- **Optimizer**: AdamW with cosine annealing (setup sketched after this list)
|
- **Learning Rate**: 0.001 |
|
- **Training Epochs**: 2000 |
|
- **Batch Processing**: Dynamic batching for efficient training |
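Assuming standard PyTorch components (the real model and training loop live in the notebook), the optimizer configuration above corresponds roughly to this sketch:

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 2)  # stand-in for the notebook's NoisePredictor

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=2000)

for epoch in range(2000):
    # ... one epoch of noise-prediction training goes here (see Loss Function) ...
    scheduler.step()  # cosine-anneal the learning rate across the 2000 epochs
```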
|
|
|
## Mathematical Foundation |
|
|
|
### Forward Process |
|
The forward process adds noise according to: |
|
``` |
|
q(x_t | x_{t-1}) = N(x_t; √(1-β_t) x_{t-1}, β_t I) |
|
``` |
|
|
|
With α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^t α_s, the noised sample at any timestep can be drawn directly from x_0:
|
``` |
|
x_t = √ᾱ_t x_0 + √(1-ᾱ_t) ε |
|
``` |
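This closed form means x_t can be produced in one shot rather than by t sequential noising steps. A minimal sketch, assuming the linear beta schedule listed under Training Details:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)      # linear beta schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)  # ᾱ_t = ∏_{s≤t} α_s

def q_sample(x0, t, noise=None):
    """Draw x_t ~ q(x_t | x_0) directly: x_t = √ᾱ_t·x_0 + √(1-ᾱ_t)·ε."""
    if noise is None:
        noise = torch.randn_like(x0)
    ab = alpha_bars[t].view(-1, 1)         # broadcast ᾱ_t over the 2D features
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

x0 = torch.randn(8, 2)           # a toy batch of 2D points
t = torch.randint(0, T, (8,))    # a random timestep per sample
xt = q_sample(x0, t)             # noised batch, same shape as x0
```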
|
|
|
### Reverse Process |
|
The model learns to reverse the noising process, one denoising step at a time:
|
``` |
|
p_θ(x_{t-1} | x_t) = N(x_{t-1}; μ_θ(x_t, t), Σ_θ(x_t, t)) |
|
``` |
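In the standard DDPM parameterization used here, Σ_θ is fixed (e.g. to σ_t² I with σ_t² = β_t) and the mean is expressed through the predicted noise:

```
μ_θ(x_t, t) = (1/√α_t) · (x_t - (β_t/√(1-ᾱ_t)) · ε_θ(x_t, t))
```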
|
|
|
### Loss Function |
|
The network is trained by minimizing the noise-prediction error:
|
``` |
|
L = E[||ε - ε_θ(x_t, t)||²] |
|
``` |
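Putting the pieces together, one training step draws a random timestep, noises the data with the `q_sample` helper sketched above, and regresses the predicted noise against the true noise. A minimal sketch, where `model` is the noise predictor from the Usage section:

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, x0, T=1000):
    """One DDPM step: L = E[||ε - ε_θ(x_t, t)||²] on a batch of clean points x0."""
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
    noise = torch.randn_like(x0)
    xt = q_sample(x0, t, noise)             # forward-process sample (see above)
    loss = F.mse_loss(model(xt, t), noise)  # regress predicted noise onto true ε
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```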
|
|
|
## Model Performance |
|
|
|
### Training Metrics |
|
- **Final Training Loss**: Converged to a stable low plateau (see `training_metrics.png`)
|
- **Training Time**: ~30 minutes on GPU |
|
- **Memory Usage**: <500MB GPU memory |
|
- **Convergence**: Stable training without mode collapse |
|
|
|
### Capabilities |
|
- ✅ High-quality 2D point generation |
|
- ✅ Smooth interpolation in data space |
|
- ✅ Stable training without adversarial dynamics |
|
- ✅ Mathematically grounded approach |
|
- ✅ Excellent sample diversity |
|
|
|
## Usage |
|
|
|
### Quick Start |
|
|
|
The snippet below is a self-contained minimal sketch that mirrors the architecture described above (256 hidden units, SiLU activations, 64-dimensional time embedding, T = 1000, linear betas); the notebook remains the authoritative implementation.

```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

class NoisePredictor(nn.Module):
    """Predicts the noise ε_θ(x_t, t) added to a batch of 2D points at timestep t."""

    def __init__(self, data_dim=2, hidden_dim=256, time_embed_dim=64):
        super().__init__()
        # Simple learned time embedding (sketch; the notebook's version may differ)
        self.time_embed = nn.Sequential(
            nn.Linear(1, time_embed_dim), nn.SiLU(),
            nn.Linear(time_embed_dim, time_embed_dim),
        )
        self.net = nn.Sequential(
            nn.Linear(data_dim + time_embed_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, data_dim),
        )

    def forward(self, x, t):
        t_emb = self.time_embed(t.float().view(-1, 1) / 1000.0)  # scale t to ~[0, 1]
        return self.net(torch.cat([x, t_emb], dim=-1))

class DiffusionModel:
    def __init__(self, T=1000, beta_start=0.0001, beta_end=0.02):
        self.T = T
        self.betas = torch.linspace(beta_start, beta_end, T)
        self.alphas = 1.0 - self.betas
        self.alpha_bars = torch.cumprod(self.alphas, dim=0)
        self.model = NoisePredictor()

    @torch.no_grad()
    def sample(self, n_samples=100):
        """Generate new samples from pure noise via the reverse process."""
        x = torch.randn(n_samples, 2)
        for t in reversed(range(self.T)):
            t_batch = torch.full((n_samples,), t, dtype=torch.long)
            eps = self.model(x, t_batch)
            alpha, alpha_bar = self.alphas[t], self.alpha_bars[t]
            mean = (x - (1 - alpha) / (1 - alpha_bar).sqrt() * eps) / alpha.sqrt()
            noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
            x = mean + self.betas[t].sqrt() * noise  # σ_t = √β_t
        return x

# Load trained model
model = DiffusionModel()
# Load weights: model.model.load_state_dict(torch.load('diffusion_model_complete.pth'))

# Generate new samples
samples = model.sample(n_samples=100)
plt.scatter(samples[:, 0], samples[:, 1])
plt.title("Generated 2D Points")
plt.show()
```
|
|
|
### Advanced Usage |
|
|
|
```python |
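# Note: these helper methods are provided by the notebook's full DiffusionModel
# class; the minimal Quick Start sketch above does not define them.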
|
# Visualize the diffusion process |
|
model.visualize_diffusion_process() |
|
|
|
# Monitor training progress |
|
model.plot_training_curves() |
|
|
|
# Sample with different parameters |
|
high_quality_samples = model.sample(n_samples=500, guidance_scale=1.0) |
|
``` |
|
|
|
## Visualizations Available |
|
|
|
1. **Diffusion Process**: Step-by-step noise addition and removal |
|
2. **Training Curves**: Loss evolution and learning dynamics |
|
3. **Generated Samples**: Comparison with original data distribution |
|
4. **Sampling Process**: Real-time generation visualization |
|
5. **Parameter Analysis**: Beta schedule and noise analysis |
|
|
|
## Files and Outputs |
|
|
|
- `Diffusion Models.ipynb`: Complete implementation with educational content |
|
- `diffusion_model_complete.pth`: Trained model weights |
|
- `diffusion_process.png`: Visualization of forward and reverse processes |
|
- `diffusion_results.png`: Generated samples and quality assessment |
|
- `training_metrics.png`: Comprehensive training analytics |
|
- `diffusion_logs/`: Detailed training and sampling logs |
|
|
|
## Applications |
|
|
|
This diffusion model implementation can be adapted for: |
|
|
|
- **Image Generation**: Extend to pixel-based image synthesis |
|
- **Audio Synthesis**: Apply to waveform or spectrogram generation |
|
- **3D Point Clouds**: Generate 3D shapes and objects |
|
- **Time Series**: Financial data, sensor readings, weather patterns |
|
- **Scientific Data**: Molecular structures, particle physics |
|
- **Data Augmentation**: Synthetic training data creation |
|
|
|
## Educational Value |
|
|
|
This implementation is designed as a learning resource featuring: |
|
|
|
- **Complete Mathematical Derivations**: From first principles to implementation |
|
- **Step-by-Step Explanations**: Every component explained in detail |
|
- **Visual Learning**: Rich plots and animations for understanding |
|
- **Progressive Complexity**: Build understanding gradually |
|
- **Practical Implementation**: Real working code with best practices |
|
|
|
## Research Applications |
|
|
|
The model demonstrates key concepts in: |
|
|
|
- **Generative Modeling**: Alternative to GANs and VAEs |
|
- **Probability Theory**: Markov chains and stochastic processes |
|
- **Neural Network Architecture**: Time conditioning and embeddings |
|
- **Optimization**: Stable training of generative models |
|
- **Sampling Methods**: DDPM and potential DDIM extensions |
|
|
|
## Comparison with Other Generative Models |
|
|
|
### Advantages over GANs |
|
- ✅ Stable training (no adversarial dynamics) |
|
- ✅ No mode collapse |
|
- ✅ Mathematical foundation |
|
- ✅ High-quality samples |
|
|
|
### Advantages over VAEs |
|
- ✅ Higher sample quality |
|
- ✅ No posterior collapse |
|
- ✅ Better likelihood estimates |
|
- ✅ Flexible architectures |
|
|
|
### Trade-offs |
|
- ⚠️ Slower sampling (requires multiple steps) |
|
- ⚠️ More computationally intensive |
|
- ⚠️ Memory requirements for long sequences |
|
|
|
## Citation |
|
|
|
If you use this implementation in your research or projects, please cite: |
|
|
|
```bibtex |
|
@misc{ddpm_implementation_2024, |
|
title={Complete DDPM Implementation: Educational Diffusion Models}, |
|
author={Gruhesh Kurra}, |
|
year={2024}, |
|
url={https://huggingface.co/karthik-2905/DiffusionModels} |
|
} |
|
``` |
|
|
|
## Future Extensions |
|
|
|
Planned improvements and extensions: |
|
|
|
- 🔄 **DDIM Implementation**: Faster sampling with deterministic steps |
|
- 🎨 **Conditional Generation**: Text-guided or class-conditional generation |
|
- 📊 **Alternative Schedules**: Cosine and sigmoid beta schedules |
|
- 🖼️ **Image Diffusion**: Extension to CIFAR-10 and other image datasets |
|
- 🎵 **Audio Applications**: Waveform and spectrogram generation |
|
- 🧬 **Scientific Applications**: Molecular and protein structure generation |
|
|
|
## License |
|
|
|
This project is licensed under the MIT License - see the LICENSE file for details. |
|
|
|
## Additional Resources |
|
|
|
- **GitHub Repository**: [DiffusionModels](https://github.com/GruheshKurra/DiffusionModels) |
|
- **Detailed Notebook**: Complete implementation with educational content |
|
- **Training Logs**: Comprehensive metrics and analysis |
|
|
|
## Model Card Authors |
|
|
|
**Gruhesh Kurra** - Implementation, documentation, and educational content |
|
|
|
--- |
|
|
|
**Tags**: diffusion-models, generative-ai, pytorch, ddpm, deep-learning, denoising |
|
|
|
**Model Card Last Updated**: December 2024 |