Shio-Koube committed
Commit 57bfae5 · verified · 1 Parent(s): e0c761d

Create README.md

  1. README.md +9 -0
README.md ADDED
@@ -0,0 +1,9 @@
+ ---
+ license: mit
+ ---
+ This is a masked autoencoder (MAE) trained on an anime dataset. The main goal is a model that is efficient for image search, retrieval, and clustering.
+
+ The model has two parts: an encoder and a decoder. The encoder encodes the full image into an 8x512 embedding and the masked image into an 8 (28x28/10) x 512 embedding. The decoder tries to reconstruct that image.
+
+ The model architecture is LocalViT small, but with 16 layers instead of 12. The decoder is a simple transformer model with a LocalViT-style MLP.
+
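
The README names image search and retrieval as the main use for the 8x512 embedding. A minimal sketch of how that could look is below. It does not load this model: the embeddings are random stand-ins, and the helper names `to_search_vector` and `top_k` are hypothetical. Flattening the 8x512 embedding into one 4096-dimensional vector and ranking by cosine similarity is an assumption made for illustration, not something the README specifies.

```python
import numpy as np


def to_search_vector(embedding: np.ndarray) -> np.ndarray:
    """Flatten one (8, 512) embedding into a unit-length vector of size 4096."""
    vec = embedding.reshape(-1).astype(np.float32)
    return vec / (np.linalg.norm(vec) + 1e-12)


def top_k(query: np.ndarray, gallery: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k gallery embeddings most similar to the query."""
    q = to_search_vector(query)
    g = np.stack([to_search_vector(e) for e in gallery])
    scores = g @ q  # cosine similarity, because every vector has unit length
    return np.argsort(-scores)[:k]


# Random stand-ins for encoder outputs (assumed shape: 8 tokens x 512 dims).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 8, 512))

# A slightly perturbed copy of gallery item 42 plays the role of a near-duplicate query.
query = gallery[42] + 0.01 * rng.normal(size=(8, 512))

nearest = top_k(query, gallery, k=3)
print(nearest)  # the first index is the near-duplicate, item 42
```

Mean-pooling the 8 tokens into a single 512-dimensional vector would be a cheaper alternative to flattening. Which option works better for this model is not stated in the README.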