---
tags:
- clip
library_name: open_clip
pipeline_tag: zero-shot-image-classification
license: apache-2.0
---

# Model Card for llm-jp-roberta-ViT-L-14-relaion-1.5B-lr5e-4-bs8k-accum4-20241218-epoch90

# Model Details

A CLIP ViT-L/14 model trained using [OpenCLIP](https://github.com/mlfoundations/open_clip) on a Japanese translation of the English subset of [ReLAION-5B](https://huggingface.co/datasets/laion/relaion2B-en-research-safe), translated by [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it).

# How to Use
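
Below is a minimal zero-shot classification sketch using the standard OpenCLIP API. The Hub repository id is assumed from the model name and may differ; likewise, whether `open_clip.get_tokenizer` picks up the llm-jp-tokenizer automatically depends on the tokenizer config published with the checkpoint.

```python
import torch
from PIL import Image
import open_clip

# Assumed Hub id, inferred from the model name -- adjust to the actual repository.
HUB_ID = "hf-hub:llm-jp/llm-jp-roberta-ViT-L-14-relaion-1.5B-lr5e-4-bs8k-accum4-20241218-epoch90"

# Load the model, the inference image transform, and the tokenizer.
model, _, preprocess = open_clip.create_model_and_transforms(HUB_ID)
tokenizer = open_clip.get_tokenizer(HUB_ID)
model.eval()

# Any local image; prompts are Japanese captions ("a photo of a dog/cat/bird").
image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
texts = tokenizer(["犬の写真", "猫の写真", "鳥の写真"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(texts)
    # L2-normalize both embeddings, then take a temperature-scaled softmax
    # over image-text cosine similarities.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # class probabilities over the three prompts
```

Japanese prompts are used because the text encoder was trained on Japanese translations of the captions.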

# Training Details

## Model Architecture

- Text Encoder: RoBERTa base with llm-jp-tokenizer
- Image Encoder: ViT-L/14

## Training Data

We used a Japanese-translated version of the relaion2B-en-research-safe dataset; the translation was produced with gemma-2-9b-it. Because roughly 70% of the image downloads succeeded, the resulting dataset contains 1.45 billion samples, which we trained over for 9 epochs (about 13 billion samples seen in total).

# Evaluation

# Citation

BibTeX:

```
```