liswei/Taiwan-ELM

liswei committed on Jun 2, 2024

Commit dc50571 (verified) · 1 Parent(s): 5f5686b

Update README.md

Files changed (1)

README.md +13 -3

@@ -28,6 +28,8 @@ We will extend the model to train on larger data sets and different base models
 We release both pre-trained base models and instruction tuned variants with 270M and 1.1B parameters.
 Along with the model, datasets used to train the base and instruction-tuned models are also released.
 List of released models:
 * [Taiwan-ELM-270M](https://huggingface.co/liswei/Taiwan-ELM-270M)
 * [Taiwan-ELM-1_1B](https://huggingface.co/liswei/Taiwan-ELM-1_1B)
@@ -37,10 +39,18 @@ List of released models:
 List of released datasets:
 * [liswei/Taiwan-Text-Excellence-2B](https://huggingface.co/datasets/liswei/Taiwan-Text-Excellence-2B)
 * [liswei/PromptPair-TW](https://huggingface.co/datasets/liswei/PromptPair-TW)
 ## Usage Examples
-We adapt the LLaMA2 template:
 ```jinja2
 <s>[INST] <<SYS>>
 {{ system_prompt }}
@@ -49,9 +59,9 @@ We adapt the LLaMA2 template:
 {{ user_message }} [/INST]
 ```
-The model could be load via `AutoModelForCausalLM` with `trust_remote_code=True`:
 ```python
-taiwanelm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)
 ```
 We also support additional generation methods and speculative generation; see [OpenELM#usage](https://huggingface.co/apple/OpenELM#usage) for reference.

 We release both pre-trained base models and instruction tuned variants with 270M and 1.1B parameters.
 Along with the model, datasets used to train the base and instruction-tuned models are also released.
+To improve transparency, training checkpoints (including RNG and optimizer state) and training logs are also released on the model page.
 List of released models:
 * [Taiwan-ELM-270M](https://huggingface.co/liswei/Taiwan-ELM-270M)
 * [Taiwan-ELM-1_1B](https://huggingface.co/liswei/Taiwan-ELM-1_1B)
 List of released datasets:
 * [liswei/Taiwan-Text-Excellence-2B](https://huggingface.co/datasets/liswei/Taiwan-Text-Excellence-2B)
 * [liswei/PromptPair-TW](https://huggingface.co/datasets/liswei/PromptPair-TW)
+* [liswei/wikinews-zhtw-dedup](https://huggingface.co/datasets/liswei/wikinews-zhtw-dedup)
+* [liswei/wikipedia-zhtw-dedup](https://huggingface.co/datasets/liswei/wikipedia-zhtw-dedup)
+* [liswei/coct-en-zhtw-dedup](https://huggingface.co/datasets/liswei/coct-en-zhtw-dedup)
+Some datasets were not used to train Taiwan-ELM but are released as well:
+* [liswei/common-crawl-zhtw](https://huggingface.co/datasets/liswei/common-crawl-zhtw)
+* [liswei/c4-zhtw](https://huggingface.co/datasets/liswei/c4-zhtw)
+* [liswei/rm-static-zhTW](https://huggingface.co/datasets/liswei/rm-static-zhTW)
 ## Usage Examples
+For instruction-tuned models, we adapt the [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) template:
 ```jinja2
 <s>[INST] <<SYS>>
 {{ system_prompt }}
 {{ user_message }} [/INST]
 ```
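As a minimal sketch of how the template above might be filled in from Python: the hunk boundary in this diff cuts the template short, so the closing `<</SYS>>` line and the blank line after it are assumptions taken from the standard LLaMA2 chat format, not from the lines shown here. The helper name `build_prompt` is hypothetical.

```python
def build_prompt(user_message: str, system_prompt: str = "") -> str:
    """Fill a LLaMA2-style chat template with a system prompt and a user message.

    Assumption: the system block is closed with "<</SYS>>" followed by a blank
    line, as in the standard LLaMA2 chat format. The diff elides those lines.
    """
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n"
        "\n"
        f"{user_message} [/INST]"
    )


if __name__ == "__main__":
    # Print a filled-in prompt for inspection.
    print(build_prompt("Introduce Taipei in one sentence.", "You are a helpful assistant."))
```

Note that the template already starts with `<s>`, so when tokenizing the resulting string it may be worth checking that the tokenizer does not prepend a second beginning-of-sequence token.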
+The model can be loaded via `AutoModelForCausalLM` or `text-generation-inference` with `trust_remote_code=True`:
 ```python
+taiwan_elm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)
 ```
 We also support additional generation methods and speculative generation; see [OpenELM#usage](https://huggingface.co/apple/OpenELM#usage) for reference.
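For a slightly fuller picture of the loading step, the following is a hedged sketch of loading the model and generating text with `transformers`. It assumes that `transformers` and PyTorch are installed and that a tokenizer can be loaded from the same repository with `AutoTokenizer`; the README lines in this diff do not confirm the latter. The function name `generate_text` and the `max_new_tokens` default are illustrative choices, not part of the original README.

```python
def generate_text(prompt, model_id="liswei/Taiwan-ELM-270M", max_new_tokens=64):
    """Load the model and tokenizer, then generate a continuation of `prompt`.

    Assumptions: the tokenizer is published in the same repo as the model,
    and PyTorch is available as the backend.
    """
    # Imported lazily so that defining this function does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # Tokenize the prompt into PyTorch tensors and generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use. `trust_remote_code=True` executes model code shipped in the repository, so it should only be used with repositories you trust.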