---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text2text-generation
library_name: unsloth
---

# Model Card for UnGPT-v1

## Model Details
- **Name:** UnGPT-v1
- **Foundation Model:** Mistral v0.3 (7B parameters)
- **Recommended Context Length:** 16k tokens
- **Fine-tuning Methodology:** LoRA-based training with the Odds Ratio Preference Optimization (ORPO) method, using a combination of ebooks and synthetic data.

## Usage Instructions
Use the Alpaca format for prompts:
```
### Instruction:
{instruction}

### Input:
{input}

### Response:
```

**Example prompts**

For the instruction, it is best not to deviate from the examples below. For the input, aim for at least 10 sentences; longer inputs also work, since the model can handle larger context sizes (thanks to the Mistral 7B v0.3 base model).

1. **Completion prompt:**
```
### Instruction:
Continue writing the story while retaining writing style. Write about 10 sentences.

### Input:
It was a dark and stormy night...

### Response:
```

2. **Fill-in-the-middle prompt:**
```
### Instruction:
Fill in the missing part of the story ({{FILL_ME}}) with about 10 sentences while retaining the writing style.

### Input:
The bus was speeding down the road, cops chasing after it.
{{FILL_ME}}
She woke up to find herself in an unfamiliar room...

### Response:
```
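
For reference, here is a minimal inference sketch using the Hugging Face `transformers` library. It assembles the completion prompt from example 1 and generates a continuation; the repository id and the generation settings (`max_new_tokens`, `temperature`) are illustrative assumptions, not settings prescribed by the model.

```python
# Minimal inference sketch (assumed transformers usage; generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "molbal/UnGPT-v1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Assemble the Alpaca-style completion prompt shown above.
prompt = (
    "### Instruction:\n"
    "Continue writing the story while retaining writing style. Write about 10 sentences.\n\n"
    "### Input:\n"
    "It was a dark and stormy night...\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=400, do_sample=True, temperature=0.8)

# Keep only the newly generated continuation, dropping the prompt tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```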

## Dataset Preparation

For dataset acquisition and cleanup, please refer to steps 1 and 2 of my text-completion example, [molbal/llm-text-completion-finetune](https://github.com/molbal/llm-text-completion-finetune/).

Chunking: texts were split into chunks on sentence boundaries, aiming for 100 sentences per example (a sketch of this logic follows the list below).
- For completion examples, 90 sentences were used as input and 10 sentences as the response.
- For fill-in-the-middle examples, 80 + 10 sentences were used as input (before and after the {{FILL_ME}} placeholder, respectively), and 10 sentences as the response.

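As a rough illustration of the chunking described above, the sketch below splits a text into 100-sentence windows and builds completion and fill-in-the-middle examples. The naive regex sentence splitter and the function names are assumptions for illustration only; the actual pipeline lives in the linked repository.

```python
# Rough sketch of the chunking logic described above (not the exact repository code).
import re

def split_sentences(text: str) -> list[str]:
    # Naive sentence-boundary split on ., ! and ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def make_completion_example(sentences: list[str]) -> dict:
    # 90 sentences of input, 10 sentences of expected continuation.
    return {
        "input": " ".join(sentences[:90]),
        "response": " ".join(sentences[90:100]),
    }

def make_fim_example(sentences: list[str]) -> dict:
    # 80 sentences before and 10 after the {{FILL_ME}} placeholder;
    # the hidden 10 sentences in between become the expected response.
    before = " ".join(sentences[:80])
    middle = " ".join(sentences[80:90])
    after = " ".join(sentences[90:100])
    return {
        "input": f"{before}\n{{{{FILL_ME}}}}\n{after}",
        "response": middle,
    }

def chunk_book(text: str):
    # Walk the book in non-overlapping 100-sentence windows.
    sentences = split_sentences(text)
    for start in range(0, len(sentences) - 100 + 1, 100):
        yield sentences[start:start + 100]
```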

The beauty of the ORPO method is that for a single prompt we can supply both a positive (chosen) and a negative (rejected) example. I wanted the model to avoid 'GPTisms', so I had gpt-4o-mini generate answers for both the completion and fill-in-the-middle tasks and added them as negative examples.

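Each training row therefore pairs one Alpaca-formatted prompt with a chosen (original ebook) continuation and a rejected (GPT-generated) one. The sketch below assumes the common `prompt`/`chosen`/`rejected` column convention used by preference trainers; the exact column names in my pipeline may differ.

```python
# Illustrative sketch of one ORPO training row (column names assume the usual
# prompt/chosen/rejected convention; the real pipeline may differ).

def build_orpo_row(instruction: str, story_input: str,
                   human_continuation: str, gpt_continuation: str) -> dict:
    prompt = (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{story_input}\n\n"
        "### Response:\n"
    )
    return {
        "prompt": prompt,
        "chosen": human_continuation,   # the original ebook text
        "rejected": gpt_continuation,   # the gpt-4o-mini answer with 'GPTisms'
    }
```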

The resulting dataset contains ~15k examples, each approximately 9,000 characters long including the input, the accepted response, and the rejected response. (Note: these are characters, not tokens.)

## Training Setup

- Fine-tuned the Mistral v0.3 foundation model using Unsloth and the ORPO trainer (a configuration sketch follows this list).
- Training configuration:
  - Batch size: 1
  - Gradient accumulation steps: 4
  - Learning rate scheduler type: linear
  - Optimizer: AdamW (8-bit)
  - Number of training epochs: 1
- Hardware
  - I used GPU-accelerated containers from the provider vast.ai (my referral link: https://cloud.vast.ai/?ref_id=123492) and ran training for ~8 hours on a single RTX 4090.
- Training costs
  - ~5€ for renting a GPU pod (+15€ in unsuccessful attempts)
  - ~5€ in OpenAI API costs for generating refusals

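For readers who want to reproduce the run, the setup looked roughly like the sketch below using Unsloth with TRL's `ORPOTrainer`. Only the batch size, gradient accumulation steps, scheduler, optimizer, and epoch count are the documented settings; the LoRA rank and alpha, sequence length, learning rate, quantization, and dataset path are assumptions filled in for illustration.

```python
# Rough training sketch (Unsloth + TRL ORPOTrainer); values marked "assumed"
# were not documented above and are illustrative only.
from unsloth import FastLanguageModel
from trl import ORPOConfig, ORPOTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistralai/Mistral-7B-Instruct-v0.3",
    max_seq_length=16384,   # matches the recommended 16k context (assumed training value)
    load_in_4bit=True,      # assumed, typical for a single RTX 4090
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                   # LoRA rank (assumed)
    lora_alpha=16,          # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical path; columns follow the prompt/chosen/rejected convention above.
dataset = load_dataset("json", data_files="orpo_dataset.jsonl", split="train")

trainer = ORPOTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=ORPOConfig(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        lr_scheduler_type="linear",
        optim="adamw_8bit",
        num_train_epochs=1,
        learning_rate=8e-6,  # assumed
        output_dir="outputs",
    ),
)
trainer.train()
```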

**Licensing and Citation**

- **License:** This model is licensed under the Apache License 2.0.
- **Citation:**
```bibtex
@misc{ungpt-v1,
  author       = {Bálint Molnár-Kaló},
  title        = {UnGPT-v1: A Fine-tuned Mistral Model for Story Continuation},
  howpublished = {\url{https://huggingface.co/models/molbal/UnGPT-v1}},
  year         = {2024}
}
```