---
license: other
base_model: meta-llama/Meta-Llama-3-8B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C017_random_sample_llama3-8b-base_pretrain_20240504_182259
    results: []
---

C017_random_sample_llama3-8b-base_pretrain_20240504_182259

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B (loaded from the local path /data/pro-align/progressalign/shared_storage/downloaded_models/llama3-8b-base) on the C017_random_sample_data dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4690
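
A minimal usage sketch, not part of the original card: the checkpoint can be loaded with the standard Transformers API. The model identifier below is a placeholder and should be replaced with the actual Hub repository id or local checkpoint directory; `device_map="auto"` assumes the `accelerate` package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the real Hub repo id or local checkpoint directory.
model_id = "path/to/C017_random_sample_llama3-8b-base_pretrain_20240504_182259"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; spreads weights across available GPUs
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```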

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an approximate TrainingArguments equivalent is sketched after the list):

  • learning_rate: 1.5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 64
  • total_eval_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 4.0
  • mixed_precision_training: Native AMP
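
For reference, the settings above map roughly onto a Hugging Face `TrainingArguments` object as sketched below. This is an illustrative assumption: the run was actually launched through LLaMA-Factory, whose config format differs, and the `output_dir` is a placeholder. "Native AMP" is rendered here as `fp16=True`, though `bf16=True` would also be consistent with the card.

```python
from transformers import TrainingArguments

# Hedged sketch: approximate single-object equivalent of the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="C017_random_sample_llama3-8b-base_pretrain",  # placeholder
    learning_rate=1.5e-5,
    per_device_train_batch_size=8,    # 8 GPUs -> total train batch size 64
    per_device_eval_batch_size=16,    # 8 GPUs -> total eval batch size 128
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    warmup_steps=20,
    num_train_epochs=4.0,
    fp16=True,                        # mixed_precision_training: Native AMP (assumption; could be bf16)
)
```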

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 2.5442        | 0.2028 | 200  | 2.5552          |
| 2.5376        | 0.4057 | 400  | 2.5096          |
| 2.4487        | 0.6085 | 600  | 2.4831          |
| 2.5324        | 0.8114 | 800  | 2.4690          |
| 2.265         | 1.0142 | 1000 | 2.4733          |
| 2.3002        | 1.2170 | 1200 | 2.4736          |
| 2.29          | 1.4199 | 1400 | 2.4734          |
| 2.2566        | 1.6227 | 1600 | 2.4725          |
| 2.3052        | 1.8256 | 1800 | 2.4721          |
| 2.2702        | 2.0284 | 2000 | 2.4734          |
| 2.2411        | 2.2312 | 2200 | 2.4746          |
| 2.2413        | 2.4341 | 2400 | 2.4749          |
| 2.216         | 2.6369 | 2600 | 2.4749          |
| 2.2696        | 2.8398 | 2800 | 2.4747          |
| 2.2455        | 3.0426 | 3000 | 2.4752          |
| 2.216         | 3.2454 | 3200 | 2.4753          |
| 2.2348        | 3.4483 | 3400 | 2.4757          |
| 2.238         | 3.6511 | 3600 | 2.4753          |
| 2.2349        | 3.8540 | 3800 | 2.4752          |
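
As an interpretation aid (not part of the original card): the best validation loss (2.4690) is reached at step 800 and later checkpoints plateau slightly higher. Assuming the reported loss is a mean per-token cross-entropy, it converts to perplexity as in this minimal sketch:

```python
import math

# Best validation loss from the table above (epoch ~0.81, step 800).
best_eval_loss = 2.4690

# Perplexity = exp(mean cross-entropy loss), assuming per-token NLL.
print(f"perplexity ~ {math.exp(best_eval_loss):.2f}")  # ~ 11.81
```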

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.3.0
  • Datasets 2.19.0
  • Tokenizers 0.19.1