Commit 5b94316 by cstr (1 parent: 3c9bcc2)

Update README.md

Files changed (1):
  1. README.md (+9, -10)
README.md CHANGED
@@ -13,14 +13,13 @@ tags:
  base_model: cstr/phi-3-orpo-v8_16
  ---
 
- # Uploaded model
+ # Model details
 
- - **Developed by:** cstr
- - **License:** apache-2.0
- - **Finetuned from model:** cstr/phi-3-orpo-v8_16
+ This is a quick experiment on llamafied phi-3 with only 1000 ORPO steps on an AzureML-translated German Orca binarized dataset (johannhartmann/mistralorpo), using the original phi-3 prompt template. The immediate result is not really good, but also not bad enough to discourage further experiments.
 
- This is a quick experiment with only 1000 ORPO steps (after an initial 150) on an AzureML-translated German Orca binarized dataset (johannhartmann/mistralorpo), using the original phi-3 prompt template. The immediate result is not really good, but also not bad enough to discourage further experiments.
- On English benchmarks:
+ # Benchmark results
+
+ This was an experiment on a German dataset snippet which, as expected, worsened results on English benchmarks:
 
  | Metric |Value|
  |---------------------------------|----:|
@@ -32,12 +31,12 @@ On English benchmarks:
  |Winogrande (5-shot) |70.24|
  |GSM8k (5-shot) |62.32|
 
- On German EQ-Bench (v2_de): 51.82 (an insignificant improvement over 51.41 for the original llamafied model), still with only 164/171 answers correctly parsed.
+ On German EQ-Bench (v2_de): 51.82 (insignificantly above 51.41 for the original llamafied model, but significantly better than the intermediate cstr/phi-3-orpo-v8_16, which reached 46.38 after the initial 150 test steps), still with only 164/171 answers correctly parsed.
 
- Note: We can improve the latter with only a few SFT steps, as shown with cas/phi3-mini-4k-llamafied-sft-v3 (170 correctly parsed, but then only a 39.46 score in v2_de; that model was also an experiment in changing the prompt template).
+ Note: We can improve the parsing rate, among other things, with only a few SFT steps, as shown with cas/phi3-mini-4k-llamafied-sft-v3 (170/171 correctly parsed, but then only a 39.46 score in v2_de; that model was also an experiment in changing the prompt template).
  All that was quickly done with bnb and q4 quants only, which might, in theory, significantly affect such small dense models in particular.
  But it served its purpose for both proof-of-concept experiments. It would probably be possible to improve results further, but that would take some time and compute.
 
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
+ # Training setup
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
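The training-setup line added above names Unsloth and TRL but gives no concrete configuration. Below is a minimal, hypothetical sketch of what such an ORPO run could look like, assuming TRL's `ORPOTrainer`/`ORPOConfig` and Unsloth's `FastLanguageModel`; the LoRA settings, hyperparameters, and dataset column handling are illustrative assumptions, not the exact values used for this model.

```python
# Hypothetical sketch of the ORPO setup described in the card: Unsloth for
# 4-bit loading and LoRA, TRL's ORPOTrainer for the preference step.
# Hyperparameters and column handling are illustrative, not the real config.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import ORPOConfig, ORPOTrainer

# Load the llamafied phi-3 base model named in the card, 4-bit via bitsandbytes.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="cstr/phi-3-orpo-v8_16",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# German ORPO preference data mentioned in the card. ORPOTrainer expects
# prompt/chosen/rejected columns; remap if the dataset uses different names.
dataset = load_dataset("johannhartmann/mistralorpo", split="train")

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(
        output_dir="phi-3-orpo-de",
        max_steps=1000,                  # "only 1000 ORPO steps"
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=8e-6,
        beta=0.1,                        # ORPO odds-ratio loss weighting
        max_length=4096,
        max_prompt_length=2048,
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```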
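The card also notes that evaluation was done with bnb and q4 quants only. As a usage example, here is a minimal sketch of loading a checkpoint 4-bit-quantized with bitsandbytes through transformers; the repo id shown is the base model named in the card (substitute the checkpoint you actually want to test), and the prompt formatting assumes the tokenizer ships the original phi-3 chat template.

```python
# Minimal sketch: load a checkpoint with a bitsandbytes 4-bit (nf4) config,
# mirroring the "bnb and q4 quants only" evaluation noted in the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Base model from the card; replace with the checkpoint you want to test.
model_id = "cstr/phi-3-orpo-v8_16"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Use the tokenizer's chat template (assumed to be the original phi-3 template).
messages = [{"role": "user", "content": "Wie hoch ist die Zugspitze?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```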