lemexp-hol-thms-by-file-deepseek-coder-1.3b-base

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2839

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
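
For reference, the sketch below shows how these settings might be expressed with Hugging Face `TrainingArguments` and a PEFT LoRA adapter. Only the hyperparameters listed above come from this card; the LoRA configuration and the commented-out dataset wiring are illustrative assumptions.

```python
# Hypothetical sketch reproducing the hyperparameters listed above.
# The LoRA settings and dataset handling are assumptions, not taken from this card.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer
from peft import LoraConfig, get_peft_model

base = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Assumed LoRA settings; the card does not state the adapter configuration.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)

args = TrainingArguments(
    output_dir="lemexp-hol-thms-by-file-deepseek-coder-1.3b-base",
    learning_rate=2e-4,               # learning_rate: 0.0002
    per_device_train_batch_size=16,   # train_batch_size: 16
    per_device_eval_batch_size=16,    # eval_batch_size: 16
    seed=42,
    optim="adamw_torch",              # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                        # Native AMP mixed-precision training
)

# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```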

Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.4174        | 0.2000 | 3777   | 0.4078          |
| 0.34          | 0.4000 | 7554   | 0.3589          |
| 0.3104        | 0.6001 | 11331  | 0.3305          |
| 0.2901        | 0.8001 | 15108  | 0.3113          |
| 0.2823        | 1.0001 | 18885  | 0.2998          |
| 0.2591        | 1.2001 | 22662  | 0.2958          |
| 0.2597        | 1.4001 | 26439  | 0.2871          |
| 0.2536        | 1.6002 | 30216  | 0.2895          |
| 0.2353        | 1.8002 | 33993  | 0.2819          |
| 0.2236        | 2.0002 | 37770  | 0.2784          |
| 0.2055        | 2.2002 | 41547  | 0.2898          |
| 0.2129        | 2.4003 | 45324  | 0.2687          |
| 0.2001        | 2.6003 | 49101  | 0.2719          |
| 0.2108        | 2.8003 | 52878  | 0.2714          |
| 0.2023        | 3.0003 | 56655  | 0.2650          |
| 0.1811        | 3.2003 | 60432  | 0.2709          |
| 0.1702        | 3.4004 | 64209  | 0.2655          |
| 0.176         | 3.6004 | 67986  | 0.2665          |
| 0.1702        | 3.8004 | 71763  | 0.2666          |
| 0.1597        | 4.0004 | 75540  | 0.2620          |
| 0.1388        | 4.2004 | 79317  | 0.2704          |
| 0.1452        | 4.4005 | 83094  | 0.2707          |
| 0.146         | 4.6005 | 86871  | 0.2705          |
| 0.1388        | 4.8005 | 90648  | 0.2622          |
| 0.1435        | 5.0005 | 94425  | 0.2649          |
| 0.1295        | 5.2006 | 98202  | 0.2794          |
| 0.1195        | 5.4006 | 101979 | 0.2780          |
| 0.124         | 5.6006 | 105756 | 0.2796          |
| 0.1183        | 5.8006 | 109533 | 0.2839          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
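
Since this model is a PEFT adapter, it can be loaded on top of the base model with the standard PEFT API. The snippet below is a minimal sketch; the prompt text and generation settings are illustrative assumptions.

```python
# Minimal sketch: load the adapter on top of the base model and generate.
# The prompt and generation settings are illustrative, not from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
adapter_id = "yalhessi/lemexp-hol-thms-by-file-deepseek-coder-1.3b-base"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("-- example prompt --", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```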
