---
library_name: transformers
tags:
- vision
- anime
- image-feature-extraction
---
# ViTMAE (base-sized model) pre-trained on Pixiv
ViTMAE model pre-trained on Pixiv artworks from id 20 to 100649536. The architecture is the same as [facebook/vit-mae-base](https://huggingface.co/facebook/vit-mae-base), but with a smaller patch size (14) and a larger image size (266).
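The patch and image sizes are recorded in the checkpoint's config; a quick sanity check (a minimal sketch, assuming the config follows the standard ViTMAE layout):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("zapparias/pixiv-vit-mae-base")
print(config.patch_size)  # 14
print(config.image_size)  # 266 -> (266 / 14)^2 = 361 patches per image
```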
All training was done on TPUs sponsored by [TPU Research Cloud](https://sites.research.google/trc/about/).
## Usage
```python
from transformers import AutoImageProcessor, ViTMAEForPreTraining, ViTModel

# resizes images to 266x266 pixels and normalizes them to [-1, 1]
processor = AutoImageProcessor.from_pretrained("zapparias/pixiv-vit-mae-base")

# load encoder + decoder for the masked-autoencoder pre-training objective
model = ViTMAEForPreTraining.from_pretrained("zapparias/pixiv-vit-mae-base")

# you can also load the encoder into a standard ViT model for feature extraction
model = ViTModel.from_pretrained("zapparias/pixiv-vit-mae-base", add_pooling_layer=False)
```
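For example, a minimal feature-extraction sketch using the `ViTModel` encoder loaded above (the image path is illustrative):

```python
import torch
from PIL import Image

image = Image.open("artwork.png").convert("RGB")  # any RGB image; path is illustrative
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# shape (batch, 1 + 361 patches, hidden_size); use the CLS token or mean-pool the patch tokens
features = outputs.last_hidden_state
```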