liswei committed · verified · Commit dc50571 · 1 Parent(s): 5f5686b

Update README.md

Files changed (1): README.md (+13 -3)
README.md CHANGED
@@ -28,6 +28,8 @@ We will extend the model to train on larger data sets and different base models
 We release both pre-trained base models and instruction-tuned variants with 270M and 1.1B parameters.
 Along with the models, the datasets used to train the base and instruction-tuned models are also released.
 
+In an effort to improve transparency, training checkpoints (including RNG and optimizer states) and training logs are also released on the model pages.
+
 List of released models:
 * [Taiwan-ELM-270M](https://huggingface.co/liswei/Taiwan-ELM-270M)
 * [Taiwan-ELM-1_1B](https://huggingface.co/liswei/Taiwan-ELM-1_1B)
@@ -37,10 +39,18 @@
 List of released datasets:
 * [liswei/Taiwan-Text-Excellence-2B](https://huggingface.co/datasets/liswei/Taiwan-Text-Excellence-2B)
 * [liswei/PromptPair-TW](https://huggingface.co/datasets/liswei/PromptPair-TW)
+* [liswei/wikinews-zhtw-dedup](https://huggingface.co/datasets/liswei/wikinews-zhtw-dedup)
+* [liswei/wikipedia-zhtw-dedup](https://huggingface.co/datasets/liswei/wikipedia-zhtw-dedup)
+* [liswei/coct-en-zhtw-dedup](https://huggingface.co/datasets/liswei/coct-en-zhtw-dedup)
+
+Some datasets were not used to train Taiwan ELM but are also released:
+* [liswei/common-crawl-zhtw](https://huggingface.co/datasets/liswei/common-crawl-zhtw)
+* [liswei/c4-zhtw](https://huggingface.co/datasets/liswei/c4-zhtw)
+* [liswei/rm-static-zhTW](https://huggingface.co/datasets/liswei/rm-static-zhTW)
 
 ## Usage Examples
 
-We adapt the LLaMA2 template:
+For instruction-tuned models, we adapt the [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) template:
 ```jinja2
 <s>[INST] <<SYS>>
 {{ system_prompt }}
@@ -49,9 +59,9 @@
 {{ user_message }} [/INST]
 ```
 
-The model could be load via `AutoModelForCausalLM` with `trust_remote_code=True`:
+The model can be loaded via `AutoModelForCausalLM` and `text-generation-inference` with `trust_remote_code=True`:
 ```python
-taiwanelm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)
+taiwan_elm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)
 ```
 
 We also support additional generation methods and speculative generation; please refer to [OpenELM#usage](https://huggingface.co/apple/OpenELM#usage).
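
Taken together, the usage section above amounts to: fill the LLaMA2-style template, then generate with the loaded model. Below is a minimal sketch under two assumptions not stated in the diff: the repository ships its own tokenizer, and the middle of the template elided between the hunks follows the standard LLaMA2 format (`<</SYS>>` followed by a blank line). The system and user prompts are illustrative placeholders.

```python
# Minimal sketch: load Taiwan-ELM and fill the LLaMA2-style template by hand.
# Assumptions: the repo ships a tokenizer, and the elided middle of the
# template follows the standard LLaMA2 format (<</SYS>> plus a blank line).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "liswei/Taiwan-ELM-270M"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

system_prompt = "你是一個樂於助人的繁體中文助理。"  # illustrative placeholder
user_message = "請簡單介紹台灣。"  # illustrative placeholder

# Fill the LLaMA2-style template shown in the README.
prompt = f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```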
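
For speculative generation the README defers to [OpenELM#usage](https://huggingface.co/apple/OpenELM#usage). As one hedged illustration rather than that reference recipe, recent `transformers` releases expose assisted generation through the `assistant_model` argument of `generate`, which would let the 270M checkpoint draft for the 1.1B target, assuming the two models share a tokenizer:

```python
# Hedged sketch of speculative (assisted) generation via transformers'
# assistant_model argument (not the OpenELM#usage reference recipe).
# Assumption: the 270M and 1.1B checkpoints share a tokenizer.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("liswei/Taiwan-ELM-1_1B", trust_remote_code=True)
target = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-1_1B", trust_remote_code=True)
draft = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)

inputs = tokenizer("台灣最高的山是", return_tensors="pt")
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```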