Commit df1c04b by UsernameJustAnother
1 parent: cf3b4f8

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -29,7 +29,7 @@ datasets:
 - **License:** apache-2.0
 - **Finetuned from model :** unsloth/Mistral-Nemo-Base-2407
 
-**Standard disclaimer:** This is me teaching myself the basics of fine-tuning, with notes extensively borrowed from https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9. Huge props to [nothingisreal](https://huggingface.co/nothingiisreal) for posting their process and making me think this was even possible for a little fish like me.
+**Standard disclaimer:** This is me teaching myself the basics of fine-tuning, with notes extensively borrowed from [MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9). Huge props to [nothingisreal](https://huggingface.co/nothingiisreal) for posting their process and making me think this was even possible for a little fish like me.
 
 The aim here is for a solid RP/storywriting model that will fit in 16GB of VRAM with a decent amount of context (> 16K).
 
@@ -40,7 +40,7 @@ The aim here is for a solid RP/storywriting model that will fit in 16GB of VRAM
 - 2K of Claude instruct, lightly curated & de-clauded
 - 2K of curated Falling through the Skies
 - 2K of curated/lightly de-ministrated C2 chat
-- Trained on a single 80GB A100 from runpod.io, with batch size of 8 (up from 2 on A100 40G), so far less steps involved.
+- Trained on a single 80GB A100 from runpod.io, with batch size of 8 (up from 2 on A100 40G), so far less steps involved. Took about 7.5hrs to run.
 - And remember kids, water is wet and fish are moist.
 
 I pulled v7 because I honestly don't think it's as good as v6, and don't want folks to get the wrong idea that it's better just because the version number is higher. Besides, nothing good ever fires on all _seven_ cylinders.
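
Note on the batch-size line in the second hunk: the claim that a batch size of 8 means far fewer steps than a batch size of 2 follows from simple arithmetic. The sketch below illustrates it under stated assumptions: one epoch, no gradient accumulation, and a made-up dataset size of 6000 examples (the README does not state the real total). The helper name `optimizer_steps` is hypothetical and not taken from the training script.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int = 1) -> int:
    """Optimizer steps per run, assuming no gradient accumulation
    and that the final partial batch is kept rather than dropped."""
    return math.ceil(num_examples / batch_size) * epochs


# Hypothetical dataset size, for illustration only.
num_examples = 6000

steps_bs2 = optimizer_steps(num_examples, batch_size=2)
steps_bs8 = optimizer_steps(num_examples, batch_size=8)

print(f"batch size 2: {steps_bs2} steps")
print(f"batch size 8: {steps_bs8} steps")
print(f"reduction factor: {steps_bs2 / steps_bs8:.1f}x")
```

Under these assumptions the step count drops by roughly the ratio of the batch sizes. Wall-clock time does not necessarily drop by the same factor, since each step processes more examples.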