README.md
CHANGED
@@ -6,7 +6,7 @@ license: apache-2.0
 
 Official research release for the family of **XGen** models (`7B`) by Salesforce AI Research:
 
-*Title*: [Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length](https://blog.salesforceairesearch.com/xgen/)
+*Title*: [Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length](https://blog.salesforceairesearch.com/xgen-7b/)
 
 ## Models
 
@@ -16,7 +16,7 @@ Official research release for the family of **XGen** models (`7B`) by Salesforce
 * License: Apache-2.0
 
 The training data for the models are tokenized with OpenAI Tiktoken library.
-To use this model, install
+To use this model, install the package via `pip`:
 
 ```sh
 pip install tiktoken
@@ -25,6 +25,7 @@ pip install tiktoken
 The models can be used as auto-regressive samplers as follows:
 
 ```python
+import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
 tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)