lemexp-processed-task1_min_symbols_template_small-deepseek-coder-1.3b-base

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1407

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction with Transformers' TrainingArguments follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
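
As a minimal sketch, the list above maps onto a Transformers TrainingArguments object roughly as follows. The output_dir is a hypothetical placeholder, and fp16=True is an assumption about how "Native AMP" was enabled; neither is stated in this card.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="lemexp-task1-adapter",  # assumed; not given in the card
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,  # assumed to correspond to "Native AMP" mixed precision
)
```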

Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.3044        | 0.2000 | 5030   | 0.2924          |
| 0.2643        | 0.4001 | 10060  | 0.2585          |
| 0.2396        | 0.6001 | 15090  | 0.2310          |
| 0.2277        | 0.8001 | 20120  | 0.2108          |
| 0.2103        | 1.0002 | 25150  | 0.2088          |
| 0.1942        | 1.2002 | 30180  | 0.2031          |
| 0.1846        | 1.4002 | 35210  | 0.1955          |
| 0.1814        | 1.6003 | 40240  | 0.1827          |
| 0.1688        | 1.8003 | 45270  | 0.1789          |
| 0.1772        | 2.0003 | 50300  | 0.1767          |
| 0.143         | 2.2003 | 55330  | 0.1748          |
| 0.1465        | 2.4004 | 60360  | 0.1691          |
| 0.1517        | 2.6004 | 65390  | 0.1702          |
| 0.1533        | 2.8004 | 70420  | 0.1651          |
| 0.1412        | 3.0005 | 75450  | 0.1569          |
| 0.1237        | 3.2005 | 80480  | 0.1581          |
| 0.131         | 3.4005 | 85510  | 0.1561          |
| 0.1221        | 3.6006 | 90540  | 0.1486          |
| 0.1161        | 3.8006 | 95570  | 0.1464          |
| 0.1165        | 4.0006 | 100600 | 0.1413          |
| 0.0994        | 4.2007 | 105630 | 0.1458          |
| 0.1113        | 4.4007 | 110660 | 0.1483          |
| 0.0996        | 4.6007 | 115690 | 0.1426          |
| 0.1116        | 4.8008 | 120720 | 0.1392          |
| 0.1004        | 5.0008 | 125750 | 0.1380          |
| 0.0814        | 5.2008 | 130780 | 0.1418          |
| 0.086         | 5.4009 | 135810 | 0.1406          |
| 0.0731        | 5.6009 | 140840 | 0.1416          |
| 0.0807        | 5.8009 | 145870 | 0.1407          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
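
How to use

Since this repository is a PEFT adapter for deepseek-ai/deepseek-coder-1.3b-base, a minimal inference sketch with the framework versions above would look like the following. The prompt format the fine-tuned model expects is not documented in this card, so the prompt string and generation settings here are placeholder assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
adapter_id = "yalhessi/lemexp-processed-task1_min_symbols_template_small-deepseek-coder-1.3b-base"

# Load the base model, then attach the fine-tuned adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "..."  # placeholder; the expected prompt template is not given in the card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```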