ianisdev committed
Commit 353a5c7 · verified · 1 Parent(s): fd3b67a

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
README.md ADDED
@@ -0,0 +1,61 @@
---
license: mit
tags:
- vqvae
- image-generation
- unsupervised-learning
- pytorch
- mnist
- generative-model
datasets:
- mnist
library_name: pytorch
model-index:
- name: VQ-VAE-MNIST
  results:
  - task:
      type: image-generation
      name: Image Generation
    dataset:
      name: MNIST
      type: image-classification
    metrics:
    - name: FID
      type: frechet-inception-distance
      value: 53.21
---

# VQ-VAE for MNIST

This is a **Vector Quantized Variational Autoencoder (VQ-VAE)** trained on the MNIST dataset using PyTorch. The model compresses and reconstructs grayscale handwritten digits and is used as part of an image augmentation and generative modeling pipeline.

## 🧠 Model Details

- **Model Type**: VQ-VAE
- **Dataset**: MNIST
- **Epochs**: 35
- **Latent Space**: Discrete (quantized codebook vectors)
- **Input Size**: 64×64 (resized and converted to RGB)
- **Reconstruction Loss**: MSE-based
- **Implementation**: Custom PyTorch; 3-layer convolutional encoder/decoder
- **FID Score**: **53.21**
- **Loss Curve**: [`loss_curve.png`](./loss_curve.png)

> This model learns compressed representations of digit images using vector quantization (see the sketch below). The reconstructions can be used for augmentation or for generative downstream tasks.

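As a reference for what the discrete bottleneck does, here is a minimal sketch of a standard VQ-VAE quantization layer (nearest-codebook lookup with a straight-through gradient, as in van den Oord et al., 2017). The class name, codebook size, and commitment weight below are illustrative assumptions; the actual layer in `models/vqvae/model.py` may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through gradient.

    Hypothetical sketch; not necessarily the exact layer used in this repo.
    """

    def __init__(self, num_embeddings=512, embedding_dim=64, beta=0.25):
        super().__init__()
        self.beta = beta  # commitment-loss weight (assumed value)
        self.codebook = nn.Embedding(num_embeddings, embedding_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_embeddings, 1.0 / num_embeddings)

    def forward(self, z):
        # z: (B, D, H, W) encoder output; flatten to (B*H*W, D) vectors.
        B, D, H, W = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, D)
        # Squared L2 distance from every vector to every codebook entry.
        dists = (flat.pow(2).sum(1, keepdim=True)
                 - 2 * flat @ self.codebook.weight.t()
                 + self.codebook.weight.pow(2).sum(1))
        idx = dists.argmin(dim=1)  # nearest code per spatial position
        q = self.codebook(idx).view(B, H, W, D).permute(0, 3, 1, 2)
        # Codebook loss pulls codes toward encoder outputs; the commitment
        # term pulls encoder outputs toward their chosen codes.
        vq_loss = F.mse_loss(q, z.detach()) + self.beta * F.mse_loss(z, q.detach())
        q = z + (q - z).detach()  # straight-through estimator for the backward pass
        return q, vq_loss, idx.view(B, H, W)
```
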
## 📁 Files

- `generator.pt`: Trained VQ-VAE model weights.
- `loss_curve.png`: Plot of the training loss across all 35 epochs.
- `fid_score.json`: Stored Fréchet Inception Distance (FID) evaluation result.
- `fid_real/` and `fid_fake/`: 1,000 real and 1,000 generated images, respectively, used for FID computation (see the example below).

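The FID score in `fid_score.json` compares the two image folders above. One way to reproduce such a comparison (an assumed tool, not necessarily the repo's own evaluation script) is the `pytorch-fid` package, which computes FID directly between two image directories; the command-line equivalent is `python -m pytorch_fid fid_real fid_fake`.

```python
# pip install pytorch-fid
from pytorch_fid.fid_score import calculate_fid_given_paths

fid = calculate_fid_given_paths(
    ["fid_real", "fid_fake"],  # the two image folders shipped with this repo
    batch_size=50,
    device="cpu",
    dims=2048,  # default InceptionV3 pool3 feature dimension
)
print(f"FID: {fid:.2f}")
```
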
## 📦 How to Use

```python
import torch
from models.vqvae.model import VQVAE  # model class from this project's codebase

model = VQVAE()
model.load_state_dict(torch.load("generator.pt", map_location="cpu"))
model.eval()  # inference mode: disables dropout/batch-norm updates
```
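
With the model loaded as above, a reconstruction pass might look like the following. The preprocessing mirrors the input format stated in the model details (64×64, RGB); the forward-pass return signature is an assumption, since custom VQ-VAEs commonly return the reconstruction together with the quantization loss.

```python
import torch
from torchvision import datasets, transforms

# Match the stated preprocessing: resize to 64x64, replicate to 3 channels.
tfm = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),
])
mnist = datasets.MNIST("data", train=False, download=True, transform=tfm)
x = mnist[0][0].unsqueeze(0)  # (1, 3, 64, 64) batch of one digit

with torch.no_grad():
    out = model(x)

# Assumption: the forward pass returns the reconstruction first,
# e.g. `recon, vq_loss = model(x)`; adjust to the actual signature.
recon = out[0] if isinstance(out, (tuple, list)) else out
```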
fid_fake/00000.png ADDED
fid_fake/00001.png ADDED
fid_fake/00002.png ADDED
fid_fake/00003.png ADDED
fid_fake/00004.png ADDED
fid_fake/00005.png ADDED
fid_fake/00006.png ADDED
fid_fake/00007.png ADDED
fid_fake/00008.png ADDED
fid_fake/00009.png ADDED
fid_fake/00010.png ADDED
fid_fake/00011.png ADDED
fid_fake/00012.png ADDED
fid_fake/00013.png ADDED
fid_fake/00014.png ADDED
fid_fake/00015.png ADDED
fid_fake/00016.png ADDED
fid_fake/00017.png ADDED
fid_fake/00018.png ADDED
fid_fake/00019.png ADDED
fid_fake/00020.png ADDED
fid_fake/00021.png ADDED
fid_fake/00022.png ADDED
fid_fake/00023.png ADDED
fid_fake/00024.png ADDED
fid_fake/00025.png ADDED
fid_fake/00026.png ADDED
fid_fake/00027.png ADDED
fid_fake/00028.png ADDED
fid_fake/00029.png ADDED
fid_fake/00030.png ADDED
fid_fake/00031.png ADDED
fid_fake/00032.png ADDED
fid_fake/00033.png ADDED
fid_fake/00034.png ADDED
fid_fake/00035.png ADDED
fid_fake/00036.png ADDED
fid_fake/00037.png ADDED
fid_fake/00038.png ADDED
fid_fake/00039.png ADDED
fid_fake/00040.png ADDED
fid_fake/00041.png ADDED
fid_fake/00042.png ADDED
fid_fake/00043.png ADDED
fid_fake/00044.png ADDED
fid_fake/00045.png ADDED
fid_fake/00046.png ADDED
fid_fake/00047.png ADDED
fid_fake/00048.png ADDED