MasaakiKotera Eric2333 committed
Commit be067e6
1 Parent(s): 65056db

Add description about newer model in README.md (#2)


- Add description about newer model in README.md (991fc3f484aaad404966f65fb619de239434c901)


Co-authored-by: YichongEricZhao <[email protected]>

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -25,10 +25,11 @@ Firstly, combine the split model using the command `cat model.pt.part-* > model.
   │ ├── example_finetuning.py
   │ └── example_pretraining.py
   ├── experiments_data
-  ├── model.pt.part-aa # split binary data of the pre-trained model
-  ├── model.pt.part-ab
+  ├── model.pt.part-aa # split binary data of the *HISTORICAL* model (shorter context window, lower VRAM consumption)
+  ├── model.pt.part-ab
   ├── model.pt.part-ac
   ├── model.pt.part-ad
+  ├── model_updated.pt # *NEWER* model, with a longer context window, trained on a deduplicated dataset
   ├── model.py # defines the architecture
   ├── sampling.py # script to generate sequences
   ├── tokenization.py # prepares data
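The hunk header repeats the README's reassembly instruction for the split checkpoint. A minimal sketch of that step, using dummy part files in place of the real checkpoint chunks (the actual repo ships `model.pt.part-aa` through `model.pt.part-ad`):

```shell
# Shell globs expand in lexical order, so part-aa precedes part-ab, etc.
printf 'AAA' > model.pt.part-aa   # dummy stand-ins for the real chunks
printf 'BBB' > model.pt.part-ab
cat model.pt.part-* > model.pt    # reassemble into a single checkpoint file
cat model.pt                      # → AAABBB
```

Note that reassembly only concerns the historical `model.pt`; per this commit, the newer `model_updated.pt` is a single file and needs no such step.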