# sft_gpt7b_domar_pretuned
This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-6.7b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.5598
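For reference, a cross-entropy loss of 1.5598 corresponds to an evaluation perplexity of roughly exp(1.5598) ≈ 4.76.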
## Model description
More information needed. The framework versions listed below indicate that the model was trained as a PEFT adapter on top of AI-Sweden-Models/gpt-sw3-6.7b rather than as a full fine-tune.
## Intended uses & limitations
More information needed
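Although the intended uses are not documented, the adapter can be loaded on top of the base model for text generation. The sketch below is a minimal example, assuming the adapter weights are hosted in this repository (thorirhrafn/sft_gpt7b_domar_pretuned); the prompt is a placeholder.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "AI-Sweden-Models/gpt-sw3-6.7b"
adapter_id = "thorirhrafn/sft_gpt7b_domar_pretuned"  # this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

prompt = "Domstolen finner att"  # placeholder prompt; replace with your own
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=128, do_sample=True, temperature=0.7
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```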
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged sketch mapping them onto a Trainer configuration follows the list):
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
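The snippet below sketches how these hyperparameters could map onto a Hugging Face Trainer run. Only the hyperparameters listed above come from the run itself; the dataset, output directory, tokenization settings, and LoRA configuration are not recorded in this card and are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "AI-Sweden-Models/gpt-sw3-6.7b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# The card only records that PEFT 0.8.2 was used; LoRA with these ranks is a
# common choice for 7B-class models, not the confirmed configuration.
model = get_peft_model(
    model,
    LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05),
)

args = TrainingArguments(
    output_dir="sft_gpt7b_domar_pretuned",  # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,  # total train batch size = 1 * 4 = 4
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="steps",
    eval_steps=500,   # matches the 500-step evaluation cadence in the table below
    logging_steps=500,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the default optimizer.
)

# Placeholder dataset: the actual training and evaluation data is undocumented.
data = load_dataset("json", data_files={"train": "train.jsonl", "eval": "eval.jsonl"})
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["eval"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```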
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.6679 | 0.02 | 500 | 1.6397 |
1.7258 | 0.04 | 1000 | 1.6302 |
1.6602 | 0.06 | 1500 | 1.6228 |
1.6731 | 0.08 | 2000 | 1.6173 |
1.7399 | 0.09 | 2500 | 1.6155 |
1.8291 | 0.11 | 3000 | 1.6106 |
1.7113 | 0.13 | 3500 | 1.6078 |
1.6768 | 0.15 | 4000 | 1.6037 |
1.7504 | 0.17 | 4500 | 1.6028 |
1.598 | 0.19 | 5000 | 1.6003 |
1.5689 | 0.21 | 5500 | 1.5974 |
1.6727 | 0.23 | 6000 | 1.5961 |
1.5689 | 0.25 | 6500 | 1.5952 |
1.6331 | 0.26 | 7000 | 1.5931 |
1.6459 | 0.28 | 7500 | 1.5921 |
1.6334 | 0.3 | 8000 | 1.5918 |
1.6803 | 0.32 | 8500 | 1.5894 |
1.6182 | 0.34 | 9000 | 1.5879 |
1.693 | 0.36 | 9500 | 1.5866 |
1.6276 | 0.38 | 10000 | 1.5857 |
1.612 | 0.4 | 10500 | 1.5859 |
1.6412 | 0.42 | 11000 | 1.5843 |
1.6827 | 0.43 | 11500 | 1.5824 |
1.584 | 0.45 | 12000 | 1.5826 |
1.591 | 0.47 | 12500 | 1.5817 |
1.6641 | 0.49 | 13000 | 1.5805 |
1.6555 | 0.51 | 13500 | 1.5799 |
1.689 | 0.53 | 14000 | 1.5798 |
1.6216 | 0.55 | 14500 | 1.5785 |
1.6271 | 0.57 | 15000 | 1.5780 |
1.6898 | 0.59 | 15500 | 1.5771 |
1.6752 | 0.6 | 16000 | 1.5762 |
1.5884 | 0.62 | 16500 | 1.5761 |
1.6094 | 0.64 | 17000 | 1.5755 |
1.5202 | 0.66 | 17500 | 1.5749 |
1.6506 | 0.68 | 18000 | 1.5744 |
1.6805 | 0.7 | 18500 | 1.5736 |
1.6421 | 0.72 | 19000 | 1.5732 |
1.652 | 0.74 | 19500 | 1.5731 |
1.5729 | 0.76 | 20000 | 1.5722 |
1.6231 | 0.77 | 20500 | 1.5715 |
1.6527 | 0.79 | 21000 | 1.5710 |
1.656 | 0.81 | 21500 | 1.5705 |
1.5076 | 0.83 | 22000 | 1.5708 |
1.6925 | 0.85 | 22500 | 1.5700 |
1.6761 | 0.87 | 23000 | 1.5701 |
1.6376 | 0.89 | 23500 | 1.5697 |
1.696 | 0.91 | 24000 | 1.5686 |
1.6921 | 0.93 | 24500 | 1.5688 |
1.6896 | 0.94 | 25000 | 1.5681 |
1.7896 | 0.96 | 25500 | 1.5678 |
1.6342 | 0.98 | 26000 | 1.5679 |
1.6001 | 1.0 | 26500 | 1.5679 |
1.7183 | 1.02 | 27000 | 1.5678 |
1.5685 | 1.04 | 27500 | 1.5675 |
1.5349 | 1.06 | 28000 | 1.5672 |
1.6439 | 1.08 | 28500 | 1.5677 |
1.6201 | 1.1 | 29000 | 1.5670 |
1.6209 | 1.11 | 29500 | 1.5664 |
1.5495 | 1.13 | 30000 | 1.5665 |
1.5573 | 1.15 | 30500 | 1.5661 |
1.6094 | 1.17 | 31000 | 1.5660 |
1.625 | 1.19 | 31500 | 1.5662 |
1.5404 | 1.21 | 32000 | 1.5656 |
1.547 | 1.23 | 32500 | 1.5655 |
1.5997 | 1.25 | 33000 | 1.5648 |
1.6287 | 1.27 | 33500 | 1.5651 |
1.4998 | 1.28 | 34000 | 1.5650 |
1.7069 | 1.3 | 34500 | 1.5642 |
1.5453 | 1.32 | 35000 | 1.5643 |
1.5378 | 1.34 | 35500 | 1.5640 |
1.5702 | 1.36 | 36000 | 1.5643 |
1.6593 | 1.38 | 36500 | 1.5641 |
1.4526 | 1.4 | 37000 | 1.5641 |
1.5875 | 1.42 | 37500 | 1.5635 |
1.7064 | 1.44 | 38000 | 1.5632 |
1.6517 | 1.45 | 38500 | 1.5629 |
1.5637 | 1.47 | 39000 | 1.5630 |
1.5557 | 1.49 | 39500 | 1.5632 |
1.6615 | 1.51 | 40000 | 1.5626 |
1.5869 | 1.53 | 40500 | 1.5629 |
1.6263 | 1.55 | 41000 | 1.5622 |
1.5958 | 1.57 | 41500 | 1.5624 |
1.5646 | 1.59 | 42000 | 1.5620 |
1.5605 | 1.61 | 42500 | 1.5620 |
1.5753 | 1.62 | 43000 | 1.5621 |
1.6315 | 1.64 | 43500 | 1.5618 |
1.6351 | 1.66 | 44000 | 1.5616 |
1.4516 | 1.68 | 44500 | 1.5615 |
1.6654 | 1.7 | 45000 | 1.5616 |
1.4796 | 1.72 | 45500 | 1.5613 |
1.7079 | 1.74 | 46000 | 1.5613 |
1.6877 | 1.76 | 46500 | 1.5613 |
1.5899 | 1.78 | 47000 | 1.5612 |
1.5419 | 1.79 | 47500 | 1.5609 |
1.5972 | 1.81 | 48000 | 1.5611 |
1.6402 | 1.83 | 48500 | 1.5609 |
1.6036 | 1.85 | 49000 | 1.5607 |
1.5839 | 1.87 | 49500 | 1.5607 |
1.6727 | 1.89 | 50000 | 1.5608 |
1.5385 | 1.91 | 50500 | 1.5605 |
1.5856 | 1.93 | 51000 | 1.5608 |
1.6168 | 1.95 | 51500 | 1.5604 |
1.5426 | 1.96 | 52000 | 1.5605 |
1.5768 | 1.98 | 52500 | 1.5603 |
1.519 | 2.0 | 53000 | 1.5606 |
1.615 | 2.02 | 53500 | 1.5607 |
1.6096 | 2.04 | 54000 | 1.5606 |
1.5881 | 2.06 | 54500 | 1.5604 |
1.5782 | 2.08 | 55000 | 1.5604 |
1.6988 | 2.1 | 55500 | 1.5604 |
1.6284 | 2.12 | 56000 | 1.5604 |
1.6219 | 2.13 | 56500 | 1.5605 |
1.5288 | 2.15 | 57000 | 1.5604 |
1.57 | 2.17 | 57500 | 1.5603 |
1.6524 | 2.19 | 58000 | 1.5605 |
1.5774 | 2.21 | 58500 | 1.5602 |
1.5434 | 2.23 | 59000 | 1.5601 |
1.4985 | 2.25 | 59500 | 1.5602 |
1.4937 | 2.27 | 60000 | 1.5602 |
1.5134 | 2.29 | 60500 | 1.5601 |
1.5064 | 2.3 | 61000 | 1.5601 |
1.6091 | 2.32 | 61500 | 1.5601 |
1.6257 | 2.34 | 62000 | 1.5600 |
1.6497 | 2.36 | 62500 | 1.5601 |
1.5469 | 2.38 | 63000 | 1.5599 |
1.5453 | 2.4 | 63500 | 1.5600 |
1.5256 | 2.42 | 64000 | 1.5599 |
1.5616 | 2.44 | 64500 | 1.5600 |
1.6449 | 2.46 | 65000 | 1.5600 |
1.6298 | 2.47 | 65500 | 1.5598 |
1.697 | 2.49 | 66000 | 1.5599 |
1.5351 | 2.51 | 66500 | 1.5598 |
1.5463 | 2.53 | 67000 | 1.5599 |
1.6256 | 2.55 | 67500 | 1.5598 |
1.5567 | 2.57 | 68000 | 1.5598 |
1.6036 | 2.59 | 68500 | 1.5599 |
1.5113 | 2.61 | 69000 | 1.5598 |
1.6975 | 2.63 | 69500 | 1.5598 |
1.69 | 2.64 | 70000 | 1.5599 |
1.5828 | 2.66 | 70500 | 1.5598 |
1.6462 | 2.68 | 71000 | 1.5598 |
1.5645 | 2.7 | 71500 | 1.5598 |
1.5385 | 2.72 | 72000 | 1.5599 |
1.6244 | 2.74 | 72500 | 1.5599 |
1.5805 | 2.76 | 73000 | 1.5599 |
1.6334 | 2.78 | 73500 | 1.5599 |
1.5254 | 2.8 | 74000 | 1.5598 |
1.5892 | 2.81 | 74500 | 1.5599 |
1.68 | 2.83 | 75000 | 1.5599 |
1.5866 | 2.85 | 75500 | 1.5598 |
1.5692 | 2.87 | 76000 | 1.5598 |
1.4843 | 2.89 | 76500 | 1.5598 |
1.633 | 2.91 | 77000 | 1.5598 |
1.6205 | 2.93 | 77500 | 1.5598 |
1.5802 | 2.95 | 78000 | 1.5598 |
1.5723 | 2.97 | 78500 | 1.5598 |
1.6153 | 2.98 | 79000 | 1.5598 |
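Validation loss plateaus at roughly 1.560 from around epoch 2 onward, so most of the improvement occurs during the first epoch.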
### Framework versions
- PEFT 0.8.2
- Transformers 4.38.1
- Pytorch 2.2.0+cu118
- Datasets 2.17.1
- Tokenizers 0.15.2