Update README.md
README.md
@@ -63,11 +63,6 @@ Notes from previous model cards:
 
 
 
-Note that `checkpoint_0` is the base model and `checkpoint_mistral` is OpenMath-Mistral-7B-v0.1-hf.
-
-The performance is _not good_™, but this model could be used to quickly generate synthetic data, as the coverage is decent for this dataset. The uploaded model is checkpoint-2.6k.
-
-
 | Checkpoint | Coverage |
 |------------|-----------|
 | 1600 | 0.890244 |
@@ -88,7 +83,10 @@ The performance is _not good_™, but this model could be used to quickly genera
 | 200 | 0.682927 |
 | 0 | 0.000000 |
 
-Note that after 800 steps the fine tuned model had better coverage than the much larger teacher model.
+Note that `checkpoint_0` is the base model and `checkpoint_mistral` is OpenMath-Mistral-7B-v0.1-hf. Also note that after 800 steps the fine tuned model had better coverage than the much larger teacher model.
+
+The performance is _not good_™, but this model could be used to quickly generate synthetic data, as the coverage is decent for this dataset. The uploaded model is checkpoint-2.6k.
+
 
 People involved in creating this fine tune:
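
The card does not define "Coverage" in the table above. In OpenMath-style evaluations it typically means the fraction of problems for which at least one of several sampled solutions is correct (a pass@k-style number); the sketch below assumes that definition, and the `results` structure in it is purely illustrative.

```python
# A minimal sketch of coverage as "fraction of problems with at least one
# correct sampled solution" -- an assumed definition, not confirmed by the card.
def coverage(results: dict[str, list[bool]]) -> float:
    """`results` maps a problem id to correctness flags, one per sample."""
    solved = sum(1 for flags in results.values() if any(flags))
    return solved / len(results)

# Two of three problems have at least one correct sample -> ~0.667.
print(coverage({"p1": [False, True], "p2": [False, False], "p3": [True, True]}))
```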
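
To use the checkpoint for quick synthetic-data generation as the card suggests, a plain `transformers` sampling loop is enough. This is a minimal sketch: the repo id and the prompt format are placeholders, since the card specifies neither.

```python
# Minimal sampling sketch with Hugging Face transformers. The repo id below is
# a placeholder (assumption); substitute the actual model repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/openmath-finetune-checkpoint-2.6k"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def sample_solutions(problem: str, k: int = 4) -> list[str]:
    # Sampling (not greedy decoding) is what makes coverage meaningful:
    # any one of the k candidates may turn out to be correct.
    inputs = tokenizer(problem, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        max_new_tokens=512,
        num_return_sequences=k,
        pad_token_id=tokenizer.eos_token_id,
    )
    prompt_len = inputs["input_ids"].shape[1]
    return [
        tokenizer.decode(o[prompt_len:], skip_special_tokens=True)
        for o in outputs
    ]
```

Candidates can then be filtered (for example, by checking the final answer against a reference) before being kept as synthetic training data.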