alvarobartt (HF staff) committed
Commit 718ef20 · 1 Parent(s): aac7e40

Update README.md

Files changed (1): README.md (+22 −48)
README.md CHANGED
@@ -2,7 +2,6 @@
 model-index:
 - name: notus-7b-dpo-lora
   results: []
-license: mit
 datasets:
 - argilla/ultrafeedback-binarized-avg-rating-for-dpo
 language:
@@ -10,36 +9,37 @@ language:
 base_model: alignment-handbook/zephyr-7b-sft-full
 library_name: transformers
 pipeline_tag: text-generation
+tags:
+- dpo
+- preference
+- ultrafeedback
+license: apache-2.0
 ---
 
-# Model Card for Model ID
-
-<!-- Provide a quick summary of what the model is/does. -->
-
+# Model Card for Notus 7B
 
+Notus is going to be a collection of fine-tuned models using DPO, similar to Zephyr, but mainly focused
+on the Direct Preference Optimization (DPO) step, aiming to incorporate preference feedback into LLMs
+during fine-tuning. Notus models are intended to be used as assistants via chat-like applications, and
+are evaluated with the MT-Bench and AlpacaEval benchmarks, to be directly compared with Zephyr models
+also fine-tuned using DPO.
 
 ## Model Details
 
 ### Model Description
 
-<!-- Provide a longer summary of what this model is. -->
-
-
-
-- **Developed by:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Developed by:** Argilla, Inc. (building on previous work by the HuggingFace H4 team and MistralAI)
+- **Shared by:** Argilla, Inc.
+- **Model type:** GPT-like 7B model, DPO fine-tuned using LoRA
+- **Language(s) (NLP):** Mainly English
+- **License:** Apache 2.0 (same as Zephyr 7B SFT and Mistral 7B v0.1)
+- **Finetuned from model:** [`alignment-handbook/zephyr-7b-sft-full`](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full)
 
 ### Model Sources [optional]
 
-<!-- Provide the basic links for the model. -->
-
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
+- **Repository:** https://github.com/argilla-io/notus-7b-dpo
+- **Paper:** N/A
+- **Demo:** https://argilla-notus-chat-ui.hf.space/
 
 ## Uses
 
@@ -139,26 +139,7 @@ Use the code below to get started with the model.
 #### Summary
 
 
-
-## Model Examination [optional]
-
-<!-- Relevant interpretability work for the model goes here -->
-
-[More Information Needed]
-
-## Environmental Impact
-
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-
-## Technical Specifications [optional]
+## Technical Specifications
 
 ### Model Architecture and Objective
 
@@ -170,7 +151,7 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 #### Hardware
 
-[More Information Needed]
+8 x A100 40GB
 
 #### Software
 
@@ -206,11 +187,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 [More Information Needed]
 
-
 ## Training procedure
-
-
-### Framework versions
-
-
-- PEFT 0.6.1
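The updated card describes a model fine-tuned with Direct Preference Optimization. As a rough illustration of the objective involved (not code from the Notus repository, and with made-up log-probability values), the DPO loss for a single (chosen, rejected) preference pair can be sketched in plain Python:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))),
    where pi_* / ref_* are sequence log-probabilities under the policy
    and the frozen reference model.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably via log1p
    return math.log1p(math.exp(-logits))

# Hypothetical log-probabilities: the policy favors the chosen completion
# more than the reference does, so the loss drops below log(2) (the value
# when policy and reference agree exactly).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0, beta=0.1)
```

A full training run would compute these log-probabilities with the policy and reference models over batches, as the PEFT/LoRA setup referenced in the diff presumably does.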
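The card also notes that Notus is meant to be used as an assistant via chat-like applications. Assuming it inherits the chat format of its Zephyr SFT base model (in practice, `tokenizer.apply_chat_template` should be preferred), prompt assembly can be sketched as:

```python
def build_zephyr_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Zephyr-style chat format,
    which the base model alignment-handbook/zephyr-7b-sft-full uses.
    This is a hand-rolled sketch; use the tokenizer's chat template
    in real applications."""
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_zephyr_prompt(
    "You are a helpful assistant.",
    "What is Direct Preference Optimization?",
)
```

The trailing `<|assistant|>\n` leaves the prompt open for the model to generate the assistant turn.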