ChangranHuuu committed · Commit 536b5be · 1 Parent: 83e22f4
Changran did a pass of edits and comments: updated the model description, out-of-scope use, and training procedure.
README.md CHANGED
@@ -8,7 +8,7 @@ license: other
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-BLOOMChat is
+BLOOMChat is a 176 billion parameter multilingual chat model. It is instruction tuned from [BLOOM (176B)](https://huggingface.co/bigscience/bloom) on assistant-style conversation datasets and supports conversation, question answering, and generative answers in multiple languages.
 
 ## Model Details
 
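For readers of the updated summary, a minimal usage sketch follows. It is not part of the card: it assumes the `sambanovasystems/BLOOMChat-176B-v1` checkpoint loads through the standard `transformers` causal-LM API (with `accelerate` available for `device_map="auto"`) and that hardware with enough memory for a 176B-parameter model is available; the sampling settings are illustrative.

```python
# Minimal sketch (not from the model card): load BLOOMChat with the generic
# transformers causal-LM API and generate one reply. A 176B model needs a
# multi-GPU or offloaded setup; device_map="auto" lets accelerate place shards.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sambanovasystems/BLOOMChat-176B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Single-turn prompt in the <human>:/<bot>: style documented later in the card.
prompt = "<human>: Create an itemized list of tasks to complete to start a clothing brand\n<bot>:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```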
@@ -58,7 +58,7 @@ BLOOMChat should NOT be used for:
 - Making highly important decisions
 - Important automated pipelines
 
-This model is still in early development and can be prone to mistakes and hallucinations, there is still room for improvement. This model is intended to provide the community with a
+This model is still in early development and can be prone to mistakes and hallucinations; there is still room for improvement. It is intended to provide the community with a multilingual chat LLM baseline.
 
 ### Recommendations
 
@@ -150,15 +150,16 @@ python -m inference_server.cli --model_name sambanovasystems/BLOOMChat-176B-v1 -
 ```
 
 ```
-<human>:
+<human>: Create an itemized list of tasks to complete to start a clothing brand
 <bot>:
 ```
 
 ```
-<human>:
+<human>: 十七岁的风是什么颜色的?
 <bot>:
 ```
 
+
 </details>
 
 ---
@@ -326,21 +327,16 @@ Estos son solo algunos ejemplos de juegos que podrían interesarte según tus cr
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
-We trained BLOOMChat with
+We trained BLOOMChat on [SambaNova DataScale systems](https://sambanova.ai/products/datascale/) using SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from [BLOOM (176B)](https://huggingface.co/bigscience/bloom), an open-source multilingual LLM pretrained by the [BigScience group](https://huggingface.co/bigscience). We instruction-tuned BLOOM (176B) on OpenChatKit, with each data source subsampled to 100k examples, for one epoch, followed by three epochs over the combined OpenChatKit and Dolly 2.0 data.
+All of the code used to prepare the datasets and the scripts to run training and inference are open-sourced and freely available at [sambanova/bloomchat](https://github.com/sambanova/bloomchat/tree/main).
 
-### Prompting Style Used For Training
-```
-<human>: {input that the user wants from the bot}
-<bot>:
-```
 
+### Prompting Style Used For Training
 ```
-<human>: {
-<bot>: {
-<human>: {
-<bot>: {
-<human>: {input that the user wants from the bot}
-<bot>:
+<human>: {input1 that the user wants from the bot}
+<bot>: {response1}</s>
+<human>: {input2 that the user wants from the bot}
+<bot>: {response2}</s>
 ```
 
 ### Hyperparameters
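The prompting format shown in the last hunk can be applied programmatically. The sketch below is illustrative rather than taken from the sambanova/bloomchat repo: the `<human>:`/`<bot>:` tags and the `</s>` terminator on completed bot turns come from the card, while the helper name, the `(human, bot)` pair structure, and the newline separator between turns are assumptions.

```python
# Illustrative helper (not from the BLOOMChat repo): flatten a chat history into
# the <human>:/<bot>: training format. Completed bot turns end with </s>; the
# final "<bot>:" is left open for the model to complete. Newline separators
# between turns are an assumption, not specified in the card.
def build_bloomchat_prompt(history, user_input):
    """history: list of (human_text, bot_text) pairs from earlier turns."""
    parts = []
    for human_text, bot_text in history:
        parts.append(f"<human>: {human_text}")
        parts.append(f"<bot>: {bot_text}</s>")
    parts.append(f"<human>: {user_input}")
    parts.append("<bot>:")
    return "\n".join(parts)


print(build_bloomchat_prompt(
    history=[("What is BLOOMChat?", "A multilingual chat model tuned from BLOOM (176B).")],
    user_input="Which languages does it support?",
))
```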
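The two-phase schedule in the training-procedure paragraph (100k-example cap per OpenChatKit source for one epoch, then three epochs over combined OpenChatKit and Dolly 2.0) can also be read as data-preparation logic. The sketch below is a schematic reading of that description, not the released pipeline from sambanova/bloomchat: the source container, example format, random seed, and the reuse of the subsampled OpenChatKit data in phase 2 are all assumptions.

```python
# Schematic sketch of the described data schedule (not the released pipeline).
# Facts from the card: cap each OpenChatKit source at 100k examples, train one
# epoch on that mixture, then three epochs on combined OpenChatKit + Dolly 2.0.
# Assumption: phase 2 reuses the subsampled OpenChatKit examples.
import random


def subsample(examples, cap=100_000, seed=0):
    """Randomly keep at most `cap` examples from one data source."""
    examples = list(examples)
    if len(examples) <= cap:
        return examples
    return random.Random(seed).sample(examples, cap)


def two_phase_schedule(openchatkit_sources, dolly_examples):
    """openchatkit_sources: dict mapping source name -> list of formatted examples."""
    phase1 = [ex for source in openchatkit_sources.values() for ex in subsample(source)]
    phase2 = phase1 + list(dolly_examples)
    # Returned as (examples, num_epochs) pairs matching the card's description.
    return [(phase1, 1), (phase2, 3)]
```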