Model Card for AuroraGPT-Tulu3-SFT-0125

The AuroraGPT-7B base model instruction-tuned on the Tulu-3 SFT dataset.

Usage

This model uses a standard chat interface. Using the supplied tokenizer, you can convert a list of input messages:

messages = [
    {"role": "system", "content": <system_prompt>},
    {"role": "user", "content": <user_prompt>},
]

into a chat string with tokenizer.apply_chat_template(messages).
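
A minimal end-to-end sketch is below. It assumes the model loads through the standard transformers AutoModelForCausalLM/AutoTokenizer interface; the prompt contents and generation settings are illustrative, not prescriptive.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "argonne-private/AuroraGPT-Tulu3-SFT-0125"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize instruction tuning in one sentence."},
]

# Render the chat template and tokenize in one step; add_generation_prompt
# appends the assistant turn header so generation starts in the right place.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))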

Training Procedure

  • Learning rate = 5 × 10^-5
  • Per-GPU batch size = 1
  • Gradient accumulation steps = 6
  • Global batch size = 768 (see the configuration sketch below)
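
The numbers above imply 128 data-parallel workers (1 per-GPU × 6 accumulation × 128 = 768). A minimal sketch of how these hyperparameters could be expressed with the Hugging Face Trainer follows; the training framework and the output path are assumptions, as the card does not name them.

from transformers import TrainingArguments

# Hyperparameters taken from this card; Trainer usage itself is an assumption.
training_args = TrainingArguments(
    output_dir="auroragpt-tulu3-sft",   # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=6,
    fp16=True,                          # matches the FP16 checkpoint weights
)
# Global batch size 768 = 1 (per GPU) x 6 (accumulation) x 128 GPUs.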