Update README.md
README.md
CHANGED
@@ -40,9 +40,11 @@ This version of Cephalo, lamm-mit/Cephalo-Idefics-2-vision-10b-alpha, is based o
 
 The model was trained in several stages:
 
-Step 1
-
-Step
+**Step 1**: Train https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-8b-beta by fine-tuning the HuggingFaceM4/idefics2-8b-chatty model.
+
+**Step 2**: Combine the https://huggingface.co/lamm-mit/Cephalo-Idefics-2-vision-8b-beta decoder with the last 8 layers of the HuggingFaceM4/idefics2-8b-chatty decoder.
+
+**Step 3**: Fine-tune the merged model, which now has 40 decoder layers and a total of 10b parameters.
 
 The model was trained on a combination of scientific text-image data extracted from Wikipedia and scientific papers. For further details on the base model, see: https://huggingface.co/HuggingFaceM4/idefics2-8b-chatty. More details about technical aspects of the model, training and example applications to materials science problems are provided in the paper (reference at the bottom).
 
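
The layer arithmetic in Steps 2 and 3 of the added lines can be illustrated with a small, framework-agnostic sketch. It is not the authors' merging code: `DecoderLayer` and `merge_decoders` are hypothetical names, plain Python objects stand in for real transformer blocks, and two things are assumed rather than stated in the README: that each 8b decoder has 32 layers (implied by 40 - 8), and that the 8 extra layers are appended after the fine-tuned ones.

```python
import copy
from dataclasses import dataclass


@dataclass
class DecoderLayer:
    """Stand-in for a transformer decoder block (hypothetical, not a library class)."""
    source: str  # model the layer was taken from
    index: int   # position of the layer in its original decoder


def merge_decoders(finetuned_layers, base_layers, n_extra=8):
    """Return the fine-tuned layers followed by copies of the last n_extra base layers.

    Assumption: the extra layers are appended at the end; the README only says
    the two decoders are "combined".
    """
    if not 0 < n_extra <= len(base_layers):
        raise ValueError("n_extra must be between 1 and len(base_layers)")
    # Deep-copy so that later fine-tuning of the merged stack leaves the base untouched.
    extra = [copy.deepcopy(layer) for layer in base_layers[-n_extra:]]
    return list(finetuned_layers) + extra


# Assumption: 32 layers per 8b decoder, inferred from 40 - 8.
finetuned = [DecoderLayer("Cephalo-Idefics-2-vision-8b-beta", i) for i in range(32)]
base = [DecoderLayer("idefics2-8b-chatty", i) for i in range(32)]

merged = merge_decoders(finetuned, base, n_extra=8)
print(len(merged))  # 40
```

Under these assumptions, layers 0 to 31 of the merged stack come from the fine-tuned model and layers 32 to 39 are copies of base layers 24 to 31, which would then be updated during the Step 3 fine-tuning.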