|
tl;dr: This is Phi 3 Medium finetuned for roleplaying.
|
|
|
We needed more explicit moist. |
|
|
|
It failed. |
|
|
|
Training Details: |
|
- 8x H100 80GB SXM GPUs |
|
- 10 minutes training duration |
|
- A continued finetune of Cream-Phi-3-14B-v1b (now released as the official v1) |
|
|
|
Results for Roleplay Mode (i.e., not Instruct format): |
|
- Workable RP formatting with occasional mistakes. (Yep, it got worse)
|
- Long-ish and moist responses. It cooks fast.
|
- Slightly incoherent. Can go hard on moist scenes but with poor spatial and anatomical understanding. |
|
- Important: My testing is lazy and flawed. Take these notes with a grain of salt and test the GGUFs yourself before drawing conclusions.
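Roleplay Mode here means free-form chat prompting rather than the Instruct format. As a minimal sketch of what an Instruct-style prompt would look like instead — assuming this finetune keeps the base Phi-3 chat markers (`<|user|>`, `<|assistant|>`, `<|end|>`); check the tokenizer's chat template before relying on this:

```python
def phi3_prompt(turns):
    """Build a Phi-3-style chat prompt from a list of (role, text) pairs.

    Assumes the stock Phi-3 special tokens; a roleplay finetune may or
    may not respect them, which is why Roleplay Mode skips them entirely.
    """
    parts = []
    for role, text in turns:
        parts.append(f"<|{role}|>\n{text}<|end|>\n")
    parts.append("<|assistant|>\n")  # leave the assistant turn open for generation
    return "".join(parts)

prompt = phi3_prompt([("user", "You are Elara, a tavern keeper. Greet the traveler.")])
print(prompt)
```

In Roleplay Mode you would instead feed plain novel/chat-style text with no special tokens, which is the mode the results above describe.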
|
|
|
 |
|
(No eval split = no eval metrics.)
|
|
|
|
|
Axolotl Config (some fields omitted) |
|
```yaml |
|
base_model: BeaverAI/Cream-Phi-3-14B-v1b |
|
load_in_4bit: true |
|
bf16: auto |
|
fp16: |
|
tf32: false |
|
flash_attention: true |
|
|
|
sequence_len: 6144 |
|
datasets: |
|
- path: SicariusSicariiStuff/Bluemoon_Top50MB_Sorted_Fixed |
|
type: customphi3 |
|
|
|
num_epochs: 2 |
|
warmup_steps: 5 |
|
weight_decay: 0.1 |
|
|
|
adapter: lora |
|
lora_r: 32 |
|
lora_alpha: 16 |
|
lora_dropout: 0.1 |
|
lora_target_linear: true |
|
|
|
gradient_accumulation_steps: 2 |
|
micro_batch_size: 1 |
|
gradient_checkpointing: true |
|
gradient_checkpointing_kwargs: |
|
use_reentrant: true |
|
|
|
sample_packing: true |
|
pad_to_sequence_len: true |
|
|
|
optimizer: paged_adamw_8bit |
|
lr_scheduler: cosine |
|
learning_rate: 0.0001 |
|
max_grad_norm: 1.0 |
|
``` |
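For a rough sense of scale, the effective batch size and the LoRA update scaling implied by the config above can be worked out directly (GPU count taken from the Training Details section; everything else is straight from the YAML):

```python
# Values copied from the Axolotl config above.
micro_batch_size = 1
gradient_accumulation_steps = 2
num_gpus = 8  # 8x H100, per the Training Details section
sequence_len = 6144

# One optimizer step sees micro_batch * grad_accum * gpus packed sequences.
effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 16 packed sequences of up to 6144 tokens each

# Standard LoRA scales the adapter update by alpha / r.
lora_scale = 16 / 32  # lora_alpha / lora_r
print(lora_scale)  # 0.5
```

An alpha of half of r gives a conservative 0.5x scale on the adapter's contribution, which fits the "continued finetune" framing: nudge the existing v1b weights rather than overwrite them.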
|
|