---
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
model-index:
- name: MaskBit-Tokenizer-12bits
  results:
  - task:
      type: image-generation
    dataset:
      name: ILSVRC/imagenet-1k
      type: ILSVRC/imagenet-1k
    metrics:
    - name: rFID
      type: rFID
      value: 1.52
    - name: InceptionScore
      type: InceptionScore
      value: 184.3
    - name: LPIPS
      type: LPIPS
      value: 0.298
    - name: PSNR
      type: PSNR
      value: 21.2
    - name: SSIM
      type: SSIM
      value: 0.55
    - name: CodebookUsage
      type: CodebookUsage
      value: 1.0
---

This model is the MaskBit tokenizer with a 12-bit vocabulary. It uses a downsampling factor of 16 and is trained on ImageNet at an image resolution of 256.

You can find more details on the [project page](https://weber-mark.github.io/projects/maskbit.html) and in the [paper](https://arxiv.org/abs/2409.16211).
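
The figures above fix the shape of the token grid. The following is a minimal sketch of that arithmetic, assuming square 256×256 inputs and one 12-bit code per spatial position. The helper names (`token_grid`, `bits_to_index`, `index_to_bits`) are hypothetical and not part of the MaskBit codebase, and the most-significant-bit-first ordering is an assumption made only for illustration.

```python
def token_grid(resolution: int = 256, downsample: int = 16, bits: int = 12) -> dict:
    """Derive token-grid statistics from the resolution, downsampling factor, and bit width."""
    if resolution % downsample != 0:
        raise ValueError("resolution must be divisible by the downsampling factor")
    side = resolution // downsample          # tokens along each spatial axis
    return {
        "grid": (side, side),                # assumes a square input image
        "num_tokens": side * side,           # tokens per image
        "vocab_size": 2 ** bits,             # distinct codes expressible in `bits` bits
        "bits_per_image": side * side * bits,
    }


def bits_to_index(bits: list[int]) -> int:
    """Pack a list of 0/1 values into an integer id (MSB first; ordering is an assumption)."""
    index = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("each bit must be 0 or 1")
        index = (index << 1) | b
    return index


def index_to_bits(index: int, num_bits: int = 12) -> list[int]:
    """Unpack an integer id into `num_bits` 0/1 values (MSB first)."""
    if not 0 <= index < 2 ** num_bits:
        raise ValueError("index out of range for the given bit width")
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


if __name__ == "__main__":
    stats = token_grid()
    print(stats["grid"], stats["num_tokens"], stats["vocab_size"])  # (16, 16) 256 4096
```

Under these assumptions, a 256×256 image maps to a 16×16 grid of 256 tokens, each drawn from 4,096 possible codes.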