xwinxu committed on
Commit
83532b8
1 Parent(s): dad0387

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +40 -0
README.md ADDED
@@ -0,0 +1,40 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - stanfordnlp/SHP
+ - Anthropic/hh-rlhf
+ - OpenAssistant/oasst1
+ language:
+ - en
+ metrics:
+ - accuracy
+ tags:
+ - human feedback
+ - rlhf
+ - preferences
+ - alignment
+ - HALO
+ - halos
+ - dpo
+ - rl
+ ---
+
+ ![halos](https://gist.github.com/assets/29318529/fe2d8391-dbd1-4b7e-9dc4-7cb97e55bc06)
+
+ This repo contains the model checkpoints for:
+ - model family <b>llama30b</b>
+ - optimized with the loss <b>SFT+KTO</b>
+ - aligned using the SHP, Anthropic HH and Open Assistant datasets.
+
+ Please refer to our code repository, which contains instructions for training your own HALOs and links to our model cards.
+
+ If you find this repo or the technical paper useful in your research, please feel free to cite [our work](http://halos.github.io/):
+ ```
+ @misc{ethayarajh2023halos,
+ url = {http://halos.github.io/},
+ author = {Ethayarajh, Kawin and Xu, Winnie and Jurafsky, Dan and Kiela, Douwe},
+ title = {Human-Centered Loss Functions (HALOs)},
+ publisher = {Contextual AI Blog},
+ year = {2023},
+ }
+ ```