---
tags:
- autoencoder
- image-colorization
- pytorch
- pytorch_model_hub_mixin
license: apache-2.0
datasets:
- flwrlabs/celeba
language:
- en
metrics:
- mse
pipeline_tag: image-to-image
---

# Model Colorization Autoencoder

## Model Description

This autoencoder is designed for image colorization. It takes a single-channel grayscale image as input and outputs a three-channel color version of that image. The architecture is an encoder-decoder: the encoder compresses the input image into a latent vector, and the decoder reconstructs the image in color from that vector.

### Architecture

- **Encoder**: Three convolutional blocks, each consisting of a 3x3 convolution followed by 2x2 max pooling, a ReLU activation, and batch normalization. The channel count goes from 1 to 64, 32, and 16. The encoder ends with a flattening layer and a fully connected layer that produces a 4000-dimensional latent vector.
- **Decoder**: The decoder mirrors the encoder. A fully connected layer with a ReLU activation expands the latent vector back into a 16-channel feature map, which three transposed convolutions then upsample. The first two transposed convolutions are followed by ReLU activations and batch normalization; the final one outputs three color channels through a sigmoid activation, so output values lie in [0, 1].
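The spatial sizes implied by the layer description above can be checked with a few lines of arithmetic. This is a minimal sketch in plain Python that uses the kernel, stride, and padding values from the class definition in this card. The 360x360 input size is an assumption inferred from the `16*45*45` size of the fully connected layer; the card does not state the input resolution explicitly, and the helper names are illustrative only.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output size of max pooling along one spatial dimension."""
    return (size - kernel) // stride + 1


def conv_transpose_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output size of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_shapes(input_size=360):
    """Return the spatial size after each encoder block and each decoder layer.

    input_size=360 is an assumption, not a value stated in the model card.
    """
    size = input_size
    encoder_sizes = []
    for _ in range(3):  # three conv + max-pool blocks
        size = pool_out(conv_out(size))
        encoder_sizes.append(size)
    decoder_sizes = []
    for _ in range(3):  # three transposed convolutions
        size = conv_transpose_out(size)
        decoder_sizes.append(size)
    return encoder_sizes, decoder_sizes


print(trace_shapes(360))  # ([180, 90, 45], [90, 180, 360])
```

Under that assumption, each encoder block halves the resolution down to 45x45 (giving the `16*45*45` flattened size), and each transposed convolution doubles it, so the output matches the input resolution.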
The architecture details are as follows:

```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class ModelColorization(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        # Encoder: three conv blocks, each halving the spatial resolution.
        # The Linear input size 16*45*45 implies a 45x45 feature map after
        # three 2x poolings, e.g. a 360x360 grayscale input.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Flatten(),
            nn.Linear(16 * 45 * 45, 4000),  # 4000-dimensional latent vector
        )
        # Decoder: expand the latent vector, then double the resolution
        # three times with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.Linear(4000, 16 * 45 * 45),
            nn.ReLU(),
            nn.Unflatten(1, (16, 45, 45)),
            nn.ConvTranspose2d(16, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.ConvTranspose2d(32, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # RGB output with values in [0, 1]
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
```

### Training Details

The model was trained using PyTorch for 5 epochs. The training and validation losses observed during training were:

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1     | 0.0063        | 0.0042          |
| 2     | 0.0036        | 0.0035          |
| 3     | 0.0032        | 0.0032          |
| 4     | 0.0030        | 0.0030          |
| 5     | 0.0029        | 0.0030          |

Training loss decreased in every epoch. Validation loss fell from 0.0042 to 0.0030 and leveled off over the last two epochs.

### Usage

Because the model class inherits from `PyTorchModelHubMixin`, you can load the weights from the Hugging Face Hub through the class's own `from_pretrained` method. The `ModelColorization` class shown above must be defined or imported first:

```python
# Ensure you have the necessary dependencies installed: pip install torch torchvision huggingface_hub
# ModelColorization is the class defined in the Architecture section above.
model = ModelColorization.from_pretrained("sebastiansarasti/AutoEncoderImageColorization")
model.eval()  # inference mode: batch normalization uses its running statistics
```
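Once a model is loaded, colorizing an image comes down to shaping the input the way the network expects and converting the output back to an image. The following is a minimal sketch, not the author's pipeline. The helper names (`preprocess`, `postprocess`, `colorize`) are hypothetical, the 360x360 input size is inferred from the `16*45*45` layer size, and scaling pixel values to [0, 1] is an assumption about the training preprocessing, which this card does not document.

```python
import numpy as np
import torch
from PIL import Image

# Assumption: 360x360 input, inferred from the 16*45*45 linear layer size.
INPUT_SIZE = 360


def preprocess(image: Image.Image) -> torch.Tensor:
    """Convert a PIL image to a (1, 1, 360, 360) float tensor.

    Scaling to [0, 1] is an assumption about the training preprocessing.
    """
    gray = image.convert("L").resize((INPUT_SIZE, INPUT_SIZE))
    array = np.asarray(gray, dtype=np.float32) / 255.0
    return torch.from_numpy(array).unsqueeze(0).unsqueeze(0)


def postprocess(output: torch.Tensor) -> Image.Image:
    """Convert a (1, 3, H, W) tensor with values in [0, 1] to an RGB image."""
    hwc = output.squeeze(0).permute(1, 2, 0).contiguous().clamp(0, 1)
    array = (hwc.cpu().numpy() * 255).round().astype(np.uint8)
    return Image.fromarray(array)


def colorize(model: torch.nn.Module, image: Image.Image) -> Image.Image:
    """Run one image through the model and return the colorized result."""
    model.eval()  # batch normalization uses its running statistics
    with torch.no_grad():  # no gradients needed for inference
        return postprocess(model(preprocess(image)))
```

With a loaded model, `colorize(model, Image.open("photo.jpg")).save("colorized.png")` would write the result to disk; the file names here are placeholders. If the actual training pipeline used a different resolution or normalization, `preprocess` should be adjusted to match it.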