Model Card for Llama-3-8B-Instruct-SkillMix

This model was SFT-ed from meta-llama/Meta-Llama-3-8B with data generated by the Seed-Dataset Agnostic version of the Instruct-SkillMix pipeline.

Training Details

We used 4000 examples from Instruct-SkillMix-SDA(k=2) (data available at PrincetonPLI/Instruct-SkillMix-SDA).

  • LR: 2e-5
    • Linear Warmup Ratio: 0.03
    • Decay: Cosine Decay to 0
  • Batch Size: 128
  • epoch: 7 / 15
  • Optimizer: AdamW
  • Sequence Length: 1024

Evaluation Details

We provide the set of generation configuration used for evaluation.

AlpacaEval

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 2048
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009

MTBench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 1024
  • temperature: 0.7
  • stop_token_ids:
    • 128001
    • 128009

WildBench

  • model_kwargs:
    • torch_dtype: 'bfloat16'
    • max_new_tokens: 4096
  • temperature: 0.9
  • top_p: 1.0
  • do_sample: True
  • stop_token_ids:
    • 128001
    • 128009

Citation

Paper: Instruct-SkillMix

@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning}, 
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774}, 
}

Contact

Simran Kaur, Princeton University

Simon Park, Princeton University

{skaur, juhyunp} 'at' princeton 'dot' edu

Downloads last month
16
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for PrincetonPLI/Llama-3-8B-Instruct-SkillMix

Finetuned
(384)
this model

Collection including PrincetonPLI/Llama-3-8B-Instruct-SkillMix