base_model: alpindale/Mistral-7B-v0.2
---
# Mistral-7B-v0.2-OpenHermes

SFT Training Params:

- Learning Rate: 2e-4
- Batch Size: 8
- Gradient Accumulation Steps: 4
- Dataset: teknium/OpenHermes-2.5 (the 200k split carries a slight bias toward roleplay and theory-of-life prompts)
- LoRA R: 16
- LoRA Alpha: 16

Training Time: 13 hours on an A100
- **Developed by:** macadeliccc
- **License:** apache-2.0