lemexp-afp-small-thms-deepseek-coder-1.3b-base

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1166

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
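
The original training script is not included in this card, so as a rough illustration only, the listed settings map onto `transformers.TrainingArguments` roughly as in the sketch below (the output directory and any omitted arguments are assumptions):

```python
# Illustrative sketch only: mapping the hyperparameters above onto TrainingArguments.
# The actual training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lemexp-afp-small-thms-deepseek-coder-1.3b-base",  # assumed output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",          # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                    # native AMP mixed-precision training
)
```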

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.4601        | 0.2001 | 2492  | 0.3758          |
| 0.369         | 0.4001 | 4984  | 0.3327          |
| 0.327         | 0.6002 | 7476  | 0.2894          |
| 0.2988        | 0.8003 | 9968  | 0.2848          |
| 0.2873        | 1.0003 | 12460 | 0.2698          |
| 0.2487        | 1.2004 | 14952 | 0.2669          |
| 0.2371        | 1.4004 | 17444 | 0.2430          |
| 0.2324        | 1.6005 | 19936 | 0.2373          |
| 0.2278        | 1.8006 | 22428 | 0.2372          |
| 0.2212        | 2.0006 | 24920 | 0.2190          |
| 0.1991        | 2.2007 | 27412 | 0.2131          |
| 0.1854        | 2.4008 | 29904 | 0.2092          |
| 0.1769        | 2.6008 | 32396 | 0.2034          |
| 0.1818        | 2.8009 | 34888 | 0.1904          |
| 0.1798        | 3.0010 | 37380 | 0.1980          |
| 0.1656        | 3.2010 | 39872 | 0.1820          |
| 0.1533        | 3.4011 | 42364 | 0.1724          |
| 0.1557        | 3.6012 | 44856 | 0.1687          |
| 0.1536        | 3.8012 | 47348 | 0.1721          |
| 0.1531        | 4.0013 | 49840 | 0.1634          |
| 0.1256        | 4.2013 | 52332 | 0.1529          |
| 0.1274        | 4.4014 | 54824 | 0.1471          |
| 0.1294        | 4.6015 | 57316 | 0.1433          |
| 0.1209        | 4.8015 | 59808 | 0.1375          |
| 0.1298        | 5.0016 | 62300 | 0.1308          |
| 0.1064        | 5.2017 | 64792 | 0.1269          |
| 0.1063        | 5.4017 | 67284 | 0.1262          |
| 0.1036        | 5.6018 | 69776 | 0.1213          |
| 0.1084        | 5.8019 | 72268 | 0.1166          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
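
Since this is a PEFT adapter rather than a full model, a minimal loading sketch is shown below. It assumes the adapter is published under yalhessi/lemexp-afp-small-thms-deepseek-coder-1.3b-base on top of the deepseek-ai/deepseek-coder-1.3b-base base model; the prompt format is not documented in this card, so the example prompt is a placeholder.

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# The example prompt is a placeholder; the card does not specify the prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
adapter_id = "yalhessi/lemexp-afp-small-thms-deepseek-coder-1.3b-base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("lemma ", return_tensors="pt")  # placeholder prompt
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```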