BAAI

YufengCui committed · Commit 9c6373a · verified · 1 Parent(s): d1e3547

Update README.md

Files changed (1): README.md (+4 −0)
@@ -36,6 +36,10 @@ We introduce **Emu3**, a new suite of state-of-the-art multimodal models trained
 - **Emu3** simply generates a video causally by predicting the next token in a video sequence, unlike the video diffusion model as in Sora. With a video in context, Emu3 can also naturally extend the video and predict what will happen next.
 
 
+### Model Information
+
+The **Emu3-Stage1** model contains the pre-trained weights from the first stage of Emu3's two-stage pre-training. In this first stage, **which does not use video data**, training starts from scratch with a context length of 5120 on text and image data. The resulting model supports image captioning and can generate images at a resolution of 512x512. You can use our [training scripts](https://github.com/baaivision/Emu3/tree/main/scripts) for further instruction tuning on **image generation and perception tasks**.
+
 
 #### Quickstart
 
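
The causal, next-token generation described in the diff above can be sketched as a toy autoregressive loop. This is purely illustrative: the function names and the deterministic dummy "model" below are invented for the sketch and are not Emu3's actual code or API; a real video token would come from a learned vision tokenizer and the predictor would be a transformer.

```python
# Toy illustration of causal next-token generation (NOT Emu3's real code).
# A "video" is represented as a flat sequence of discrete tokens; the model
# predicts each new token from the tokens seen so far.

def toy_next_token(context):
    # Hypothetical stand-in for a trained transformer: a deterministic
    # rule so the sketch is self-contained and runnable.
    return (sum(context) + 1) % 16

def generate(prompt_tokens, num_new_tokens):
    seq = list(prompt_tokens)
    for _ in range(num_new_tokens):
        # Causal: each prediction depends only on tokens already generated.
        seq.append(toy_next_token(seq))
    return seq

# With a "video" already in context, the same loop naturally extends it,
# which mirrors how Emu3 can continue a video and predict what happens next.
clip = [3, 1, 4]
extended = generate(clip, 4)
```

The key point the sketch captures is that extension is not a special mode: continuing an existing clip and generating from scratch use the identical next-token loop, unlike a diffusion-based video model.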