Update README.md
README.md
CHANGED
@@ -68,6 +68,29 @@ Note that `checkpoint_0` is the base model and `checkpoint_mistral` is OpenMath-
 The performance is _not good_™, but this model could be used to quickly generate synthetic data, as the coverage is decent for this dataset. The uploaded model is checkpoint-2.6k.


+| Checkpoint | Coverage |
+|------------|-----------|
+| 1600 | 0.890244 |
+| 2200 | 0.890244 |
+| 2400 | 0.890244 |
+| **2600** | 0.878049 |
+| 1200 | 0.878049 |
+| 2800 | 0.853659 |
+| 2000 | 0.853659 |
+| 800 | 0.841463 |
+| 1000 | 0.829268 |
+| 1800 | 0.829268 |
+| 1400 | 0.817073 |
+| mistral | 0.804878 |
+| 3000 | 0.780488 |
+| 600 | 0.768293 |
+| 400 | 0.731707 |
+| 200 | 0.682927 |
+| 0 | 0.000000 |
+
+Note that from step 800 through step 2800 the fine-tuned model had better coverage than the much larger teacher model (the `mistral` row, 0.804878); the step-3000 checkpoint fell back below it (0.780488).
+
+
 People involved in creating this fine tune:
 - Coulton Theuer [[email protected]]
 - Bret Ellenbogen [[email protected]]
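
The comparison against the teacher in the coverage table above can be checked with a short script. This is a minimal sketch, not part of the original README: the `COVERAGE` dictionary, the helper names `beats_teacher` and `best_checkpoints`, and the reading of the `mistral` row as the teacher model are all assumptions made for illustration. The values are copied directly from the table.

```python
# Coverage values copied from the table above; keys are the checkpoint labels.
COVERAGE = {
    "1600": 0.890244,
    "2200": 0.890244,
    "2400": 0.890244,
    "2600": 0.878049,
    "1200": 0.878049,
    "2800": 0.853659,
    "2000": 0.853659,
    "800": 0.841463,
    "1000": 0.829268,
    "1800": 0.829268,
    "1400": 0.817073,
    "mistral": 0.804878,
    "3000": 0.780488,
    "600": 0.768293,
    "400": 0.731707,
    "200": 0.682927,
    "0": 0.000000,
}

# Assumption: the "mistral" row is the teacher model the note refers to.
TEACHER = "mistral"


def beats_teacher(coverage, teacher=TEACHER):
    """Return the training steps whose coverage exceeds the teacher's, ascending."""
    baseline = coverage[teacher]
    return sorted(
        int(label)
        for label, value in coverage.items()
        if label != teacher and value > baseline
    )


def best_checkpoints(coverage):
    """Return the labels tied for the highest coverage, sorted as strings."""
    top = max(coverage.values())
    return sorted(label for label, value in coverage.items() if value == top)


if __name__ == "__main__":
    print("Steps beating the teacher:", beats_teacher(COVERAGE))
    print("Best checkpoints:", best_checkpoints(COVERAGE))
```

Under these assumptions, the steps that beat the teacher run from 800 through 2800, and the step-3000 checkpoint does not. Three checkpoints (1600, 2200, 2400) tie for the highest coverage, slightly above the uploaded checkpoint 2600.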