Triangle104 committed
Commit d55bd93
1 Parent(s): 6dd1cd1

Update README.md

Files changed (1): README.md +79 -11

README.md CHANGED
@@ -17,25 +17,90 @@ This model was converted to GGUF format from [`allenai/Llama-3.1-Tulu-3-8B`](htt
Refer to the [original model card](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B) for more details on the model.

---
+ Model details:
+ -
+ Tülu3 is a leading instruction following model family, offering fully
+ open-source data, code, and recipes designed to serve as a
+ comprehensive guide for modern post-training techniques.
+ Tülu3 is designed for state-of-the-art performance on a diversity of
+ tasks in addition to chat, such as MATH, GSM8K, and IFEval.
+
+ Model description
+
+ Model type: A model trained on a mix of publicly available, synthetic and human-created datasets.
+ Language(s) (NLP): Primarily English
+ License: Llama 3.1 Community License Agreement
+ Finetuned from model: allenai/Llama-3.1-Tulu-3-8B-DPO
+
+ Model Sources
+
+ Training Repository: https://github.com/allenai/open-instruct
+ Eval Repository: https://github.com/allenai/olmes
+ Paper: https://arxiv.org/abs/2411.15124
+ Demo: https://playground.allenai.org/
+
+ Using the model
+
+ Loading with HuggingFace
+
+ To load the model with HuggingFace, use the following snippet:
+
+ from transformers import AutoModelForCausalLM
+
+ tulu_model = AutoModelForCausalLM.from_pretrained("allenai/Llama-3.1-Tulu-3-8B")
+
+ VLLM
+
+ As a Llama base model, the model can be easily served with:
+
+ vllm serve allenai/Llama-3.1-Tulu-3-8B
+
+ Note that given the long chat template of Llama, you may want to use --max_model_len=8192.
+
+ Chat template

The chat template for our models is formatted as:

- <|user|>\nHow are you doing?\n<|assistant|>\nI'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
+ <|user|>\nHow are you doing?\n<|assistant|>\nI'm just a
+ computer program, so I don't have feelings, but I'm functioning as
+ expected. How can I assist you today?<|endoftext|>
+
Or with new lines expanded:

+
<|user|>
How are you doing?
<|assistant|>
- I'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
+ I'm just a computer program, so I don't have feelings, but I'm
+ functioning as expected. How can I assist you today?<|endoftext|>
+
It is embedded within the tokenizer as well, for tokenizer.apply_chat_template.

- System prompt
-
+
+ System prompt
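For completeness, the loading snippet in the hunk above can be combined with the chat template the card says is embedded in the tokenizer. A minimal sketch, assuming the `transformers` library; the message and generation settings are illustrative, not from the commit:

```python
# Minimal sketch: load Tulu 3 and apply its embedded chat template.
# Illustrative only; the message and generation settings are arbitrary.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# apply_chat_template renders the <|user|>/<|assistant|> format shown above.
messages = [{"role": "user", "content": "How are you doing?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # end the prompt with <|assistant|>\n
    return_tensors="pt",
)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```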
@@ -44,10 +109,11 @@ In Ai2 demos, we use this system prompt by default:

You are Tulu 3, a helpful and harmless AI Assistant built by the Allen Institute for AI.

+
The model has not been trained with a specific system prompt in mind.

- Bias, Risks, and Limitations
-
+
+ Bias, Risks, and Limitations

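The demo system prompt quoted in this hunk can be supplied through the same chat template. A minimal sketch, assuming the embedded template accepts a system role (as Tülu's does):

```python
# Illustrative: render the Ai2 demo system prompt via the chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/Llama-3.1-Tulu-3-8B")
messages = [
    {"role": "system", "content": "You are Tulu 3, a helpful and harmless AI Assistant built by the Allen Institute for AI."},
    {"role": "user", "content": "How are you doing?"},
]
# tokenize=False returns the rendered prompt string for inspection.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```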
@@ -60,10 +126,13 @@ to train the base Llama 3.1 models, however it is likely to have
included a mix of Web data and technical sources like books and code.
See the Falcon 180B model card for an example of this.

+
Hyperparameters

+
PPO settings for RLVR:

+
Learning Rate: 3 × 10⁻⁷
Discount Factor (gamma): 1.0
Generalized Advantage Estimation (lambda): 0.95
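For readers unfamiliar with these PPO settings, a minimal sketch of how the discount factor (gamma) and lambda from this hunk enter generalized advantage estimation; this is illustrative, not Ai2's training code:

```python
# Illustrative generalized advantage estimation (GAE); not Ai2's training code.
# With the card's settings: gamma = 1.0 (undiscounted), lam = 0.95.
def gae_advantages(rewards, values, gamma=1.0, lam=0.95):
    """A_t = delta_t + (gamma * lam) * A_{t+1},
    where delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```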
@@ -83,8 +152,8 @@ Total Episodes: 100,000
KL penalty coefficient (beta): [0.1, 0.05, 0.03, 0.01]
Warm up ratio (omega): 0.0

- License and use
-
+
+ License and use

@@ -98,8 +167,8 @@ The models have been fine-tuned using a dataset mix with outputs
generated from third party models and are subject to additional terms:
Gemma Terms of Use and Qwen License Agreement (models were improved using Qwen 2.5).

- Citation
-
+
+ Citation

@@ -138,7 +207,6 @@ If Tülu3 or any of the related materials were helpful to your work, please cite
}

---
-
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)

 