s-JoL committed · Commit b336ae4 · 1 Parent(s): f9ff52f

Update README.md

Files changed (1)
1. README.md +4 -26
README.md CHANGED
@@ -10,6 +10,8 @@ inference: false
 <!-- Provide a quick summary of what the model is/does. -->
 
 ## 介绍 (Introduction)
+ Baichuan-13B-Base is the pre-trained version in the Baichuan-13B model series; the aligned model is available at [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B-Chat).
+
 Baichuan-13B is an open-source, commercially usable large-scale language model with 13 billion parameters, developed by Baichuan Intelligence as the successor to [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B). It achieves the best results of its size on standard Chinese and English benchmarks. This release includes two versions: pre-training (Baichuan-13B-Base) and alignment (Baichuan-13B-Chat). Baichuan-13B has the following features:
 
 1. **更大尺寸、更多数据 (Larger size, more data)**: Building on [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B), Baichuan-13B further expands the parameter count to 13 billion and was trained on 1.4 trillion tokens of high-quality corpora, 40% more than LLaMA-13B, making it the open-source 13B-size model trained on the most data to date. It supports both Chinese and English, uses ALiBi position encoding, and has a context window of 4,096 tokens.
@@ -18,6 +20,8 @@ Baichuan-13B 是由百川智能继 [Baichuan-7B](https://github.com/baichuan-inc
 4. **开源免费可商用 (Open-source, free, and commercially usable)**: Baichuan-13B is fully open to academic research, and developers can also use it commercially, free of charge, after applying by email and obtaining official commercial permission.
 
 ## Introduction
+ Baichuan-13B-Base is the pre-trained version in the Baichuan-13B series of models; the aligned model can be found at [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B-Chat).
+
 Baichuan-13B is an open-source, commercially usable large-scale language model developed by Baichuan Intelligence, following [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B). With 13 billion parameters, it achieves the best performance on standard Chinese and English benchmarks among models of its size. This release includes two versions: pre-training (Baichuan-13B-Base) and alignment (Baichuan-13B-Chat). Baichuan-13B has the following features:
 
 1. **Larger size, more data**: Baichuan-13B further expands the parameter count to 13 billion on the basis of [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B) and was trained on 1.4 trillion tokens of high-quality corpora, 40% more than LLaMA-13B, making it the open-source 13B-size model trained on the most data to date. It supports both Chinese and English, uses ALiBi position encoding (sketched after the diff below), and has a context window of 4,096 tokens.
@@ -26,32 +30,6 @@ Baichuan-13B is an open-source, commercially usable large-scale language model d
 4. **Open-source, free, and commercially usable**: Baichuan-13B is fully open to academic research, and developers can also use it commercially, free of charge, after applying by email and obtaining official commercial permission.
 
 
- ## How to Get Started with the Model
- 
- The following is a 1-shot inference task using Baichuan-13B-Base: given the title of a work, the model outputs its author. The correct completion is "夜雨寄北->李商隐" (the poem 夜雨寄北 was written by Li Shangyin).
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- 
- # trust_remote_code is required because the repository ships custom model code
- tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
- model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)
- inputs = tokenizer('登鹳雀楼->王之涣\n夜雨寄北->', return_tensors='pt')
- inputs = inputs.to('cuda:0')
- pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
- print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
- ```
- 
- The English counterpart is the same 1-shot inference task using Baichuan-13B-Base: given the title of a work, the model outputs its author. The correct completion is "One Hundred Years of Solitude->Gabriel Garcia Marquez".
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- 
- # trust_remote_code is required because the repository ships custom model code
- tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Base", trust_remote_code=True)
- model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Base", device_map="auto", trust_remote_code=True)
- inputs = tokenizer('Hamlet->Shakespeare\nOne Hundred Years of Solitude->', return_tensors='pt')
- inputs = inputs.to('cuda:0')
- pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
- print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
- ```
- 
 ## Model Details
 
 ### Model Description
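
The feature list in this diff says the model uses ALiBi position encoding with a 4,096-token context window. As background, ALiBi replaces learned position embeddings with a fixed, head-specific linear penalty on query-key distance, added to the pre-softmax attention scores. Below is a minimal sketch of the standard published formulation; the helper name and head count are illustrative assumptions, not Baichuan-13B's actual implementation:

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Illustrative helper for standard ALiBi, not Baichuan's own code.
    # Head slopes form the geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8);
    # this closed form is exact when the head count is a power of two.
    slopes = torch.tensor([2.0 ** (-8.0 * (i + 1) / n_heads) for i in range(n_heads)])
    positions = torch.arange(seq_len)
    # distances[i, j] = j - i; under causal masking only j <= i is visible,
    # so the bias is zero on the diagonal and grows more negative with distance.
    distances = (positions[None, :] - positions[:, None]).float()
    # Shape (n_heads, seq_len, seq_len): added to attention logits before softmax.
    return slopes[:, None, None] * distances[None, :, :]

bias = alibi_bias(n_heads=4, seq_len=5)
print(bias.shape)  # torch.Size([4, 5, 5])
```

Because the bias depends only on relative distance, ALiBi-trained models often extrapolate beyond their training length, though this README documents a 4,096-token window.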
 
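The quickstart removed by this commit illustrates the prompt pattern for 1-shot inference: one solved `work->author` pair, a newline, then an unsolved prefix ending in `->`. A sketch of the same pattern generalized to any number of demonstrations; the helper is hypothetical, shown only to make the prompt format explicit:

```python
def make_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    # Hypothetical helper: joins solved "work->author" pairs with newlines,
    # then leaves the final pair open for the model to complete.
    shots = "\n".join(f"{work}->{author}" for work, author in examples)
    return f"{shots}\n{query}->"

prompt = make_few_shot_prompt([("Hamlet", "Shakespeare")], "One Hundred Years of Solitude")
print(prompt)
# Hamlet->Shakespeare
# One Hundred Years of Solitude->
```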
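
The lines this commit adds direct readers to Baichuan-13B-Chat for the aligned model. A minimal loading sketch, assuming the Chat checkpoint follows the same `trust_remote_code` loading pattern as the Base examples in the removed quickstart; the diff shows no chat-specific API, so only the generic transformers `generate` interface is used here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the aligned checkpoint loads exactly like the Base one.
tokenizer = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan-13B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("baichuan-inc/Baichuan-13B-Chat", device_map="auto", trust_remote_code=True)

inputs = tokenizer("What is the second-highest mountain in the world?", return_tensors="pt").to(model.device)
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred[0], skip_special_tokens=True))
```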