Llama-3.3-Tiny-Instruct

This is a tiny, randomly initialized version of the JackFram/llama-68m model, created for testing and experimentation purposes.

Model Details

  • Base model: JackFram/llama-68m
  • Seed: 42
  • Hidden size: 768
  • Number of layers: 2
  • Number of attention heads: 12
  • Vocabulary size: 32000
  • Max position embeddings: 2048
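
An equivalent random-weight model can be built locally from these values with transformers. The sketch below assumes an MLP intermediate size of 3072, taken from the JackFram/llama-68m config rather than from the list above:

import torch
from transformers import LlamaConfig, LlamaForCausalLM

torch.manual_seed(42)  # seed listed above

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=768,
    intermediate_size=3072,   # assumed from the base model's config
    num_hidden_layers=2,
    num_attention_heads=12,
    max_position_embeddings=2048,
)
model = LlamaForCausalLM(config)  # weights are randomly initialized
print(model.num_parameters())     # ~68M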

Parameters

  • Total parameters: ~68,030,208
  • Trainable parameters: ~68,030,208
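
The count can be reproduced from the configuration above, assuming an MLP intermediate size of 3072 and untied input/output embeddings (as in the base model):

  • Embeddings: 32,000 × 768 = 24,576,000 (input) + 24,576,000 (LM head, untied)
  • Per layer: attention 4 × 768 × 768 = 2,359,296; MLP 3 × 768 × 3,072 = 7,077,888; two RMSNorm weights 1,536; total 9,438,720
  • 2 layers: 18,877,440; final RMSNorm: 768
  • Total: 49,152,000 + 18,877,440 + 768 = 68,030,208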

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

# Generate text (note: this model has random weights!)
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))

Important Notes

โš ๏ธ This model has random weights and is not trained! It's designed for:

  • Testing model loading and inference pipelines
  • Benchmarking model architecture
  • Educational purposes
  • Rapid prototyping where actual model performance isn't needed

The model will generate random/nonsensical text since it hasn't been trained on any data.
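
For pipeline tests, the outputs can still be checked structurally even though the generated text is meaningless. A minimal sketch of such a smoke test (the specific assertions are illustrative, not part of this repository):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

inputs = tokenizer("smoke test", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Shapes and finiteness are deterministic even with random weights,
# so they make useful assertions in CI-style checks.
assert logits.shape == (1, inputs["input_ids"].shape[1], 32000)
assert torch.isfinite(logits).all()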

Creation

This model was created using the upload_tiny_llama33.py script from the minimal-grpo-trainer repository.
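
In outline, a random-weight copy of an existing architecture can be created and uploaded roughly as follows. This is a hedged sketch of the general approach, not the contents of upload_tiny_llama33.py itself:

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(42)

# Reuse the base model's config and tokenizer, but initialize weights randomly
config = AutoConfig.from_pretrained("JackFram/llama-68m")
model = AutoModelForCausalLM.from_config(config)
tokenizer = AutoTokenizer.from_pretrained("JackFram/llama-68m")

# Pushing requires being logged in to the Hub (e.g. via `huggingface-cli login`)
model.push_to_hub("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer.push_to_hub("AlignmentResearch/Llama-3.3-Tiny-Instruct")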
