lemexp-processed-task1_min_symbols_lemma_command_small-deepseek-coder-1.3b-base

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4329

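Judging by the framework versions listed at the end of this card, this repository holds a PEFT adapter for deepseek-ai/deepseek-coder-1.3b-base. Below is a minimal loading sketch, assuming the adapter is published under the repository id in the title, that the base tokenizer is used unchanged, and that a lemma-style prompt is appropriate (the exact prompt format is not documented here).

```python
# Hedged sketch: load the base model and attach this PEFT adapter for inference.
# The adapter repo id is taken from the card title; adjust if it differs.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
adapter_id = "yalhessi/lemexp-processed-task1_min_symbols_lemma_command_small-deepseek-coder-1.3b-base"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # apply the fine-tuned adapter

prompt = "lemma "  # hypothetical prompt; the task's input format is not documented
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
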
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP

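The hyperparameters above map onto a Hugging Face Trainer configuration. The sketch below mirrors them under that assumption; the dataset, tokenization, and PEFT/LoRA settings are not documented in this card, so they are omitted and the output path is a placeholder.

```python
# Hedged sketch: a transformers TrainingArguments setup mirroring the listed
# hyperparameters. Dataset, collator, and PEFT/LoRA configuration are not
# documented in this card and are therefore not shown.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lemexp-task1-adapter",   # placeholder path (assumption)
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",                 # AdamW (torch); the betas/epsilon above are its defaults
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                           # "Native AMP" mixed-precision training
)
```
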
Training results

Training Loss   Epoch    Step      Validation Loss
0.6364          0.2000   3683      0.6357
0.5857          0.4001   7366      0.5827
0.5682          0.6001   11049     0.5516
0.5421          0.8001   14732     0.5293
0.5142          1.0002   18415     0.5177
0.4674          1.2002   22098     0.5015
0.4615          1.4002   25781     0.5000
0.4530          1.6003   29464     0.4770
0.4506          1.8003   33147     0.4701
0.4309          2.0003   36830     0.4646
0.3829          2.2004   40513     0.4667
0.3925          2.4004   44196     0.4595
0.3858          2.6004   47879     0.4566
0.3879          2.8005   51562     0.4439
0.3764          3.0005   55245     0.4379
0.3267          3.2005   58928     0.4502
0.3346          3.4006   62611     0.4443
0.3363          3.6006   66294     0.4339
0.3321          3.8006   69977     0.4350
0.3423          4.0007   73660     0.4288
0.2789          4.2007   77343     0.4458
0.2928          4.4007   81026     0.4379
0.2963          4.6007   84709     0.4325
0.2887          4.8008   88392     0.4275
0.2949          5.0008   92075     0.4292
0.2437          5.2008   95758     0.4366
0.2424          5.4009   99441     0.4358
0.2528          5.6009   103124    0.4331
0.2477          5.8009   106807    0.4329
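
The evaluations above are logged on a fixed step interval; the spacing of the Epoch column follows from the per-epoch step count, as the small check below illustrates (the per-epoch step count is read off the row at epoch ~1.0).

```python
# Hedged sketch: one evaluation is logged every 3683 steps; with roughly
# 18,415 optimizer steps per epoch (the Step value at Epoch ~1.0), that is
# about 0.2 epoch between evaluations, matching the Epoch column spacing.
eval_interval_steps = 3683
steps_per_epoch = 18415

print(eval_interval_steps / steps_per_epoch)  # ~0.20 epoch per evaluation
```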

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
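
To check that a local environment matches the versions listed above, a quick runtime comparison could look like the following; the expected strings are copied from this card, and whether exact pins are required for the adapter to load is an assumption.

```python
# Hedged sketch: compare installed package versions against those listed above.
from importlib.metadata import PackageNotFoundError, version

expected = {
    "peft": "0.14.0",
    "transformers": "4.47.0",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}

for package, wanted in expected.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    status = "OK" if installed == wanted else "differs"
    print(f"{package}: installed {installed}, card lists {wanted} ({status})")
```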