<!-- Provide a quick summary of what the model is/does. -->
## 介绍

Baichuan-13B-Base为Baichuan-13B系列模型中的预训练版本,经过对齐后的模型可见[Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B-Chat)。

Baichuan-13B 是由百川智能继 [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B) 之后开发的包含 130 亿参数的开源可商用的大规模语言模型,在标准的中文和英文 benchmark上均取得同尺寸最好的效果。本次发布包含有预训练 (Baichuan-13B-Base) 和对齐 (Baichuan-13B-Chat) 两个版本。Baichuan-13B 有如下几个特点:
1. **更大尺寸、更多数据**:Baichuan-13B在[Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B) 的基础上进一步扩大参数量到130亿,并且在高质量的语料上训练了1.4万亿tokens,超过LLaMA-13B 40%,是当前开源13B尺寸下训练数据量最多的模型。支持中英双语,使用ALiBi 位置编码,上下文窗口长度为 4096。
4. **开源免费可商用**:Baichuan-13B不仅对学术研究完全开放,开发者也仅需邮件申请并获得官方商用许可后,即可以免费商用。
## Introduction

Baichuan-13B-Base is the pre-trained version in the Baichuan-13B model series; the aligned model is available as [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B-Chat).

Baichuan-13B is an open-source, commercially usable large-scale language model developed by Baichuan Intelligence, following [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B). With 13 billion parameters, it achieves the best performance in standard Chinese and English benchmarks among models of its size. This release includes two versions: pre-training (Baichuan-13B-Base) and alignment (Baichuan-13B-Chat). Baichuan-13B has the following features:
1. **Larger size, more data**: Building on [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B), Baichuan-13B expands the parameter count to 13 billion and was trained on 1.4 trillion tokens of high-quality corpora, 40% more than LLaMA-13B and the most training data of any open-source 13B model to date. It supports both Chinese and English, uses ALiBi position encoding, and has a context window of 4096 tokens (a rough illustration of ALiBi follows this list).
4. **Open-source, free, and commercially usable**: Baichuan-13B is fully open for academic research; developers may also use it commercially, free of charge, after applying by email and receiving official commercial permission.
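
Item 1 mentions ALiBi position encoding. As a rough illustration of the idea only, here is a generic sketch of how ALiBi-style attention biases are typically computed; it is not code from, and may differ from, Baichuan's actual implementation:

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Generic ALiBi sketch: a head-specific linear penalty on query-key distance."""
    # Head slopes form a geometric sequence: 2^(-8/num_heads), 2^(-16/num_heads), ...
    # (the ALiBi paper adjusts this when num_heads is not a power of two; omitted here).
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    pos = torch.arange(seq_len)
    # distance[i, j] = j - i, clamped so future (masked) positions contribute zero bias.
    distance = (pos[None, :] - pos[:, None]).clamp(max=0).float()
    # The bias grows more negative the further a key lies to the left of the query;
    # it is added to the raw attention scores before the softmax.
    return slopes[:, None, None] * distance[None, :, :]
```

Because the penalty depends only on relative distance rather than learned position embeddings, ALiBi is commonly paired with long context windows such as the 4096 tokens mentioned above.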
## How to Get Started with the Model
The following is a 1-shot inference task with Baichuan-13B-Base: given the title of a work, the model should output its author. The correct output is "夜雨寄北->李商隐" (the poem "夜雨寄北" was written by Li Shangyin).
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; trust_remote_code is needed because the model
# ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)

# 1-shot prompt: one solved example ("登鹳雀楼" is by 王之涣), then the query to complete.
inputs = tokenizer('登鹳雀楼->王之涣\n夜雨寄北->', return_tensors='pt')
inputs = inputs.to('cuda:0')

# Generate the continuation and decode it back to text.
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
The following is a 1-shot inference task with Baichuan-13B-Base: given the title of a work, the model should output its author. The correct output is "One Hundred Years of Solitude->Gabriel Garcia Marquez".
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model; trust_remote_code is needed because the model
# ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)

# 1-shot prompt: one solved example, then the query to complete.
inputs = tokenizer('Hamlet->Shakespeare\nOne Hundred Years of Solitude->', return_tensors='pt')
inputs = inputs.to('cuda:0')

# Generate the continuation and decode it back to text.
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
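
Both examples above load the checkpoint with `device_map="auto"`, which spreads the weights across the available GPUs. If GPU memory is tight, the `transformers` library also accepts a `torch_dtype` argument at load time; note that this is a generic library option, not a setting specified by this model card:

```python
import torch
from transformers import AutoModelForCausalLM

# Generic transformers option (not specific to this model card): load the
# weights in float16 to roughly halve GPU memory use.
model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
```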
## Model Details
### Model Description