---
license: other
task_categories:
- text-generation
---

Dolphin 🐬
https://erichartford.com/dolphin

This model is based on LLaMA 1, so it is for non-commercial use only. Future versions will be trained on Llama 2 and other open models that are suitable for commercial use.

This model is uncensored. I have filtered the dataset to remove alignment and bias, which makes the model highly compliant with any request, even unethical ones. You are advised to implement your own alignment layer before exposing the model as a service. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models
You are responsible for any content you create using this model. Enjoy responsibly.

## Dataset

This dataset is an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/).

After uncensoring, deduping, and cleaning, our dataset consists of:

- 842,610 instructions of FLANv2 augmented with GPT-4 completions
- 2,625,353 instructions of FLANv2 augmented with GPT-3.5 completions

We followed the submix and system prompt distribution outlined in the Orca paper, with a few exceptions. We included all 75k of the CoT items in the FLAN-1m dataset rather than sampling them. Also, we found that many items were duplicated, so we removed duplicates, resulting in 3.5m instructions in the ChatGPT dataset.

Then we filtered out instances of alignment, refusal, avoidance, and bias, in order to produce an uncensored model on which you can layer your own personalized alignment LoRA.

We also filtered out duplicates and cleaned the data.
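
The card does not say how duplicates were detected. As a rough illustration only, the sketch below assumes exact-match deduplication over a few record fields. The function name `dedupe` and the field names `system`, `instruction`, and `response` are hypothetical and may not match the actual dataset schema or the method we used.

```python
import hashlib
import json


def dedupe(records, keys=("system", "instruction", "response")):
    """Keep the first occurrence of each record, comparing only `keys`.

    Assumption: two records are duplicates when the chosen fields match
    exactly. The field names are placeholders, not the real schema.
    """
    seen = set()
    unique = []
    for rec in records:
        # Serialize the chosen fields to one string and hash it, so the
        # `seen` set stores short digests instead of full texts.
        payload = json.dumps([rec.get(k, "") for k in keys], ensure_ascii=False)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(rec)
    return unique


sample = [
    {"system": "s", "instruction": "q1", "response": "a1"},
    {"system": "s", "instruction": "q1", "response": "a1"},  # exact repeat
    {"system": "s", "instruction": "q2", "response": "a2"},
]
deduped = dedupe(sample)
```

Exact matching misses near-duplicates such as records that differ only in whitespace or casing; normalizing the text before hashing would be one way to catch those.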

## Training
We trained with the flan5m (GPT-3.5 completions) dataset in its entirety for 3 epochs at a learning rate of 2e-5 before we stopped training to avoid overfitting.
We trained with the flan1m (GPT-4 completions) dataset in its entirety for 2.5 epochs at a learning rate of 1e-5 before we stopped training to avoid overfitting.
Training took about 600 hours on 8x H100s.
We used a prompt format similar to Vicuna's, but we added the SYSTEM: field.

Prompt format:
```
SYSTEM: {system}
USER: {prompt}
ASSISTANT:
```

Example:
```
SYSTEM: you are an expert marine biologist.
USER: Please list 10 ways that dolphins are superior to orcas.
ASSISTANT:
```
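
A minimal sketch of filling in the template above from Python. The helper name `build_prompt` is hypothetical, and the sketch assumes the three fields are separated by single newlines with no trailing space after `ASSISTANT:`, as the template shows.

```python
def build_prompt(system: str, prompt: str) -> str:
    """Fill the SYSTEM/USER/ASSISTANT template shown above."""
    # The string ends at "ASSISTANT:" so the model's completion follows it.
    return f"SYSTEM: {system}\nUSER: {prompt}\nASSISTANT:"


example = build_prompt(
    "you are an expert marine biologist.",
    "Please list 10 ways that dolphins are superior to orcas.",
)
print(example)
```

Printing `example` reproduces the three-line example prompt above.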

## Evaluation
Evaluation results are coming soon.

## Team
The core Dolphin Team includes:
- Eric "Faldore" Hartford
- Pankaj Mathur
- Rob "Rohan" O'Callahan
- Tom "TheBloke" Jobbins

## Gratitude
- Thank you to Microsoft for authoring the Orca paper and inspiring this work.
- Special thanks to WingLian, NanoBit, and Teknium for helpful advice.
- Special thanks to EdenCoder and chirper.ai for mentorship and financial sponsorship.
- Special thanks to Kilkonie for his very valued mentorship.
- Thank you to Catto.
- Thank you to all the other people in the Open Source AI community who have taught me and helped me along the way.