add performance metrics, update names
README.md CHANGED
@@ -29,6 +29,13 @@ The training data for SEA-LION encompasses 980B tokens.
 - **Languages:** English, Chinese, Indonesian, Malay, Thai, Vietnamese, Filipino, Tamil, Burmese, Khmer, Lao
 - **License:** MIT License
 
+### Performance Benchmarks
+
+SEA-LION has an average performance on general tasks in English (as measured by Hugging Face's LLM Leaderboard):
+
+| Model       |  ARC  | HellaSwag | MMLU  | TruthfulQA | Average |
+|-------------|:-----:|:---------:|:-----:|:----------:|:-------:|
+| SEA-LION 7B | 39.93 |   68.51   | 26.87 |   35.09    |  42.60  |
 
 ## Training Details
 
@@ -109,7 +116,7 @@ The tokenizer type is Byte-Pair Encoding (BPE).
 Lam Wen Zhi Clarence<br>
 Leong Wei Qi<br>
 Li Yier<br>
-Liu Darius<br>
+Liu Bing Jie Darius<br>
 Lovenia Holy<br>
 Montalan Jann Railey<br>
 Ng Boon Cheong Raymond<br>
@@ -121,7 +128,7 @@ Susanto Yosephine<br>
 Tai Ngee Chia<br>
 Tan Choon Meng<br>
 Teo Jin Howe<br>
-Teo Leslie<br>
+Teo Eng Sipp Leslie<br>
 Teo Wei Yi<br>
 Tjhi William<br>
 Yeo Yeow Tong<br>