# SmolLM2-1.7B-UltraChat_200k
A QLoRA (Quantized Low-Rank Adaptation) fine-tune of HuggingFaceTB/SmolLM2-1.7B on the UltraChat 200k dataset.
This model serves as an exercise in LLM post-training.
## Model Details
- Developed by: Andrew Melbourne
- Model type: Language Model
- License: Apache 2.0
- Finetuned from model: HuggingFaceTB/SmolLM2-1.7B
### Model Sources
Training and inference scripts are available in the repository below.
- Repository: SmolLM2-1.7B-ultrachat_200k on Github
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("M3LBY/SmolLM2-1.7B-UltraChat_200k")
tokenizer = AutoTokenizer.from_pretrained("M3LBY/SmolLM2-1.7B-UltraChat_200k")

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": "How far away is the sun?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
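For reference, `apply_chat_template` turns the `messages` list into a single prompt string. A minimal sketch of what that rendering looks like, assuming a ChatML-style template (SmolLM2 instruct variants use this format, though the exact template shipped with this checkpoint may differ):

```python
# Sketch of ChatML-style prompt rendering, as an illustration of what
# tokenizer.apply_chat_template produces. The <|im_start|>/<|im_end|>
# markers are an assumption based on SmolLM2's instruct variants.
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [{"role": "user", "content": "How far away is the sun?"}]
print(render_chatml(messages))
```

`add_generation_prompt=True` appends an opening assistant turn, which is why the example above passes that flag to `apply_chat_template` before generation.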
## Training Details
The adapter model was trained using Supervised Fine-Tuning (SFT) with the following configuration:
- Base model: SmolLM2-1.7B
- Mixed precision: bfloat16
- Learning rate: 2e-5 with linear scheduler
- Warmup ratio: 0.1
- Training epochs: 1
- Effective batch size: 32
- Sequence length: 512 tokens
- Flash Attention 2 enabled
Training reached a final loss of 1.6965 after 6,496 steps.
Elapsed time: 2 hours 37 minutes.
Consumed ~22 Colab Compute Units, for an estimated cost of ~$2.21.
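The configuration above can be sketched in code. Note that the LoRA rank, alpha, target modules, and the split of the effective batch size between per-device batch and gradient accumulation are not stated in this card, so the values below are assumptions for illustration only:

```python
# Hedged sketch of the QLoRA/SFT setup implied by the hyperparameters listed
# above. Values marked "assumed" do not appear in the card.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, TaskType

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bfloat16 mixed precision
)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                   # assumed rank
    lora_alpha=32,                          # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)

training_args = TrainingArguments(
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
    per_device_train_batch_size=4,          # assumed split: 4 x 8 accumulation
    gradient_accumulation_steps=8,          # = effective batch size of 32
    bf16=True,
)
```

The quantization config and training arguments would then be passed to the base model loader and an SFT trainer, respectively.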
## Evaluation
[More Information Needed]
## Citation
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
### Framework Versions
- PEFT 0.14.0