Model Card for AuroraGPT-Tulu3-SFT-0125

The AuroraGPT-7B base model instruction-tuned on the Tulu-3 SFT dataset.

Usage

This model uses a standard chat interface. Using the supplied tokenizer, you can convert a list of input messages:

messages = [
    {"role": "system", "content": <system_prompt>},
    {"role": "user", "content": <user_prompt>},
]

into a chat string with tokenizer.apply_chat_template(messages).
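
A minimal end-to-end sketch is below. It assumes the model loads through the standard transformers AutoModelForCausalLM/AutoTokenizer interface; the prompt contents and generation settings are illustrative, not prescriptive.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "argonne-private/AuroraGPT-Tulu3-SFT-0125"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize instruction tuning in one sentence."},
]

# Render the chat template and tokenize in one step; add_generation_prompt
# appends the assistant turn header so generation starts in the right place.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))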

Training Procedure

  • Learning rate = 5 × 10^-5
  • Per-GPU batch size = 1
  • Gradient accumulation steps = 6
  • Global batch size = 768 (see the configuration sketch below)
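
The numbers above imply 128 data-parallel workers (1 per-GPU × 6 accumulation × 128 = 768). A minimal sketch of how these hyperparameters could be expressed with the Hugging Face Trainer follows; the training framework and the output path are assumptions, as the card does not name them.

from transformers import TrainingArguments

# Hyperparameters taken from this card; Trainer usage itself is an assumption.
training_args = TrainingArguments(
    output_dir="auroragpt-tulu3-sft",   # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=6,
    fp16=True,                          # matches the FP16 checkpoint weights
)
# Global batch size 768 = 1 (per GPU) x 6 (accumulation) x 128 GPUs.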