---
license: apache-2.0
language:
- en
pipeline_tag: text-to-image
tags:
- pytorch
- diffusers
- conditional-image-generation
- diffusion-models-class
datasets:
- dpdl-benchmark/caltech_birds2011
library_name: diffusers
---

# class-conditional-diffusion-cub-200

A class-conditional diffusion model trained on the CUB-200-2011 birds dataset for generating bird images.
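
## Model architecture

The usage code below relies on a `ClassConditionedUnet` class that must match the architecture used during training. That definition is not included in this card; the sketch below shows one plausible layout, where the class label is embedded and concatenated to the input channels of a diffusers `UNet2DModel` (the embedding size and block widths here are assumptions, not the trained configuration):

```python
import torch
import torch.nn as nn
from diffusers import UNet2DModel

class ClassConditionedUnet(nn.Module):
    def __init__(self, num_classes=200, class_emb_size=4, sample_size=256):
        super().__init__()
        # Learnable embedding for the bird class label
        self.class_emb = nn.Embedding(num_classes, class_emb_size)
        # UNet sees the image channels plus the class-embedding channels
        self.model = UNet2DModel(
            sample_size=sample_size,
            in_channels=3 + class_emb_size,
            out_channels=3,
            layers_per_block=2,
            block_out_channels=(64, 128, 256),
            down_block_types=("DownBlock2D", "AttnDownBlock2D", "AttnDownBlock2D"),
            up_block_types=("AttnUpBlock2D", "AttnUpBlock2D", "UpBlock2D"),
        )

    def forward(self, x, t, class_labels):
        bs, ch, h, w = x.shape
        # Broadcast the class embedding to extra feature-map channels
        class_cond = self.class_emb(class_labels)                # (bs, class_emb_size)
        class_cond = class_cond.view(bs, -1, 1, 1).expand(bs, -1, h, w)
        net_input = torch.cat((x, class_cond), dim=1)            # (bs, 3 + emb, h, w)
        # Predict the noise residual for this timestep
        return self.model(net_input, t).sample
```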
    
## Usage: `predict` function for generating images
```python
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm


def load_model(model_path, device):
    # Initialize the same model architecture as during training
    model = ClassConditionedUnet().to(device)

    # Load the trained weights (map_location handles CPU-only machines)
    model.load_state_dict(torch.load(model_path, map_location=device))

    # Set model to evaluation mode
    model.eval()

    return model


def predict(model, class_label, noise_scheduler, num_samples=8, device='cuda'):
    model.eval()  # Ensure the model is in evaluation mode

    # Prepare a batch of random noise as input
    shape = (num_samples, 3, 256, 256)  # (batch_size, channels, height, width)
    noisy_image = torch.randn(shape).to(device)

    # Ensure class_label is a tensor and properly repeated for the batch
    class_labels = torch.tensor([class_label] * num_samples, dtype=torch.long).to(device)

    # Reverse the diffusion process step by step
    # (this loop assumes 50 sampling steps; see the note after this block)
    for t in tqdm(range(49, -1, -1), desc="Reverse Diffusion Steps"):
        t_tensor = torch.tensor([t], dtype=torch.long).to(device)  # Single timestep, expanded over the batch

        # Predict the noise with the class-conditioned model and remove it from the image
        with torch.no_grad():
            noise_pred = model(noisy_image, t_tensor.expand(num_samples), class_labels)

        # Scheduler step signature: (model_output, timestep, sample)
        noisy_image = noise_scheduler.step(noise_pred, t, noisy_image).prev_sample

    # Post-process the output: rescale from [-1, 1] to [0, 1]
    generated_images = (noisy_image + 1) / 2

    return generated_images


def display_images(images, num_rows=2):
    # Arrange the batch into a grid (nrow = images per row)
    grid = torchvision.utils.make_grid(images, nrow=num_rows)
    np_grid = grid.permute(1, 2, 0).cpu().numpy()  # Convert to (H, W, C) for plotting

    # Plot the grid
    plt.figure(figsize=(12, 6))
    plt.imshow(np.clip(np_grid, 0, 1))  # Clip values to ensure a valid range
    plt.axis('off')
    plt.show()
```
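
Note: the sampling loop above steps over a fixed 50-step range, while the scheduler in the example below is created with `num_train_timesteps=1000`. If your checkpoint was trained against the full schedule, a variant that iterates over the scheduler's own timestep sequence may match it better. This is a sketch under that assumption, reusing the same model and scheduler (not part of the original card):

```python
@torch.no_grad()
def predict_full_schedule(model, class_label, noise_scheduler, num_samples=8, device='cuda'):
    model.eval()
    noisy_image = torch.randn(num_samples, 3, 256, 256, device=device)
    class_labels = torch.full((num_samples,), class_label, dtype=torch.long, device=device)

    # noise_scheduler.timesteps runs from the highest timestep down to 0
    for t in tqdm(noise_scheduler.timesteps, desc="Reverse Diffusion Steps"):
        t_batch = torch.full((num_samples,), int(t), dtype=torch.long, device=device)
        noise_pred = model(noisy_image, t_batch, class_labels)
        noisy_image = noise_scheduler.step(noise_pred, t, noisy_image).prev_sample

    return (noisy_image + 1) / 2  # Rescale from [-1, 1] to [0, 1]
```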

## Example: loading the model and generating predictions

```python
import torch
from diffusers import DDPMScheduler

model_path = "model_epoch_0.pth"  # Path to your saved checkpoint
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = load_model(model_path, device)
noise_scheduler = DDPMScheduler(num_train_timesteps=1000, beta_schedule='squaredcos_cap_v2')

class_label = 1  # Example class label; change to your desired class
generated_images = predict(model, class_label, noise_scheduler, num_samples=2, device=device)
display_images(generated_images)
```
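
To keep the outputs, one option (not part of the original card) is to write the batch to disk as an image grid with `torchvision.utils.save_image`; the filename below is just a placeholder:

```python
import torchvision

# Values are already rescaled to [0, 1], so the tensor can be saved directly.
# "generated_birds.png" is a hypothetical output path.
torchvision.utils.save_image(generated_images, "generated_birds.png", nrow=2)
```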