

	---
	license: apache-2.0
	datasets:
	- pico-lm/pretokenized-dolma
	language:
	- en
	metrics:
	- pico-lm/perplexity
	pipeline_tag: text-generation
	---

	# Pico Decoder Medium

	pico-decoder-medium is a 181M-parameter model in the `pico-decoder` suite that balances scale and analyzability. Built with [`pico-train`](https://github.com/pico-lm) and instrumented with [`pico-analyze`](https://github.com/pico-lm), it enables detailed studies of layer-wise learning behavior during language model pretraining.

	> NOTE: The `pico-decoder-medium-1` branch contains the full commit history for the training run.
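
One way to browse the commits on that branch is through the `huggingface_hub` client. The sketch below is a minimal example, not part of the Pico tooling: it assumes the repo id `pico-lm/pico-decoder-medium`, assumes `huggingface_hub` is installed, and uses a hypothetical helper name, `list_checkpoint_commits`.

```python
REPO_ID = "pico-lm/pico-decoder-medium"  # ASSUMPTION: repo id inferred from the model name
BRANCH = "pico-decoder-medium-1"         # branch named in the note above


def list_checkpoint_commits(repo_id: str = REPO_ID, revision: str = BRANCH):
    """Return (commit_id, title) pairs for the commits on the training branch.

    Hypothetical helper for illustration; requires network access.
    """
    # Imported lazily so this module loads even without the dependency.
    from huggingface_hub import HfApi

    commits = HfApi().list_repo_commits(repo_id, revision=revision)
    return [(commit.commit_id, commit.title) for commit in commits]


if __name__ == "__main__":
    # Print the first few commits returned by the Hub.
    for commit_id, title in list_checkpoint_commits()[:5]:
        print(commit_id[:7], title)
```

A returned `commit_id` can typically be passed as the `revision` argument when loading a specific checkpoint.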

	## 🔧 Model Details

	| Field             | Value                                  |
	|-------------------|----------------------------------------|
	| Architecture      | Decoder-only transformer (LLaMA-style) |
	| Parameters        | 181M                                   |
	| Layers            | 12                                     |
	| Hidden Size       | 768                                    |
	| Feed Forward Size | 3072                                   |
	| Attention Heads   | 12                                     |
	| Key/Value Heads   | 4                                      |
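
As a rough sanity check, the table values can be turned into a back-of-the-envelope parameter count. This is a sketch under stated assumptions, not the authors' accounting: it assumes a vocabulary size of 50,304 (not given in this card), untied input and output embeddings, a three-matrix SwiGLU feed-forward block, RMSNorm weights, and no bias terms. The names `DecoderConfig` and `count_parameters` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    n_layers: int = 12
    d_model: int = 768
    d_ff: int = 3072
    n_heads: int = 12
    n_kv_heads: int = 4
    vocab_size: int = 50304  # ASSUMPTION: not stated in the model card


def count_parameters(cfg: DecoderConfig) -> int:
    """Estimate parameters for a LLaMA-style decoder with grouped-query attention."""
    head_dim = cfg.d_model // cfg.n_heads   # 768 / 12 = 64
    kv_dim = cfg.n_kv_heads * head_dim      # 4 * 64 = 256

    # Query and output projections are d_model x d_model;
    # key and value projections shrink to d_model x kv_dim.
    attention = 2 * cfg.d_model * cfg.d_model + 2 * cfg.d_model * kv_dim
    # ASSUMPTION: SwiGLU feed-forward with gate, up, and down matrices.
    feed_forward = 3 * cfg.d_model * cfg.d_ff
    # Two RMSNorm weight vectors per layer.
    norms = 2 * cfg.d_model
    per_layer = attention + feed_forward + norms

    # ASSUMPTION: untied token embedding and output head.
    embeddings = 2 * cfg.vocab_size * cfg.d_model
    final_norm = cfg.d_model
    return cfg.n_layers * per_layer + embeddings + final_norm


if __name__ == "__main__":
    print(f"{count_parameters(DecoderConfig()):,}")  # 181,095,168
```

Under these assumptions the estimate lands close to the reported 181M; a different vocabulary size or tied embeddings would shift the total.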

	## 📚 Training

	- Dataset: [`pretokenized-dolma`](https://github.com/pico-lm)
	- Training steps: 200,000
	- Batch size: 1024
	- Sequence length: 2048
	- Optimizer: AdamW
	- Learning rate schedule: Linear decay with warmup
	- Compute: 16 A100-SXM4-80GB GPUs
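
The settings above imply a token budget that is easy to work out. The sketch below assumes the batch size counts sequences (not tokens) and that every sequence is filled to the full length. It also illustrates one common form of linear decay with warmup; the peak learning rate, the number of warmup steps, and decay to exactly zero are hypothetical, since the card does not state them.

```python
TRAINING_STEPS = 200_000
BATCH_SIZE = 1024        # ASSUMPTION: measured in sequences per step
SEQUENCE_LENGTH = 2048   # tokens per sequence

tokens_per_step = BATCH_SIZE * SEQUENCE_LENGTH   # 2,097,152 tokens
total_tokens = TRAINING_STEPS * tokens_per_step  # 419,430,400,000 tokens


def linear_warmup_decay(step: int, peak_lr: float, warmup_steps: int,
                        total_steps: int = TRAINING_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0.

    Illustrative only: peak_lr and warmup_steps are hypothetical inputs.
    """
    if step < warmup_steps:
        return peak_lr * (step / warmup_steps)
    remaining = max(total_steps - step, 0)
    return peak_lr * (remaining / (total_steps - warmup_steps))


if __name__ == "__main__":
    print(f"tokens per step: {tokens_per_step:,}")
    print(f"total tokens:    {total_tokens:,}")
    # Hypothetical schedule values, chosen only to show the shape.
    for step in (0, 1_250, 2_500, 100_000, 200_000):
        print(step, linear_warmup_decay(step, peak_lr=3e-4, warmup_steps=2_500))
```

With these assumptions, a full 200,000-step run processes roughly 419 billion tokens.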

	## 📈 Evaluation and Analysis

	This model supports fine-grained analysis using [pico-analyze](https://github.com/pico-lm). This tool enables researchers to understand how learning unfolds over training, even at very small scales.

	We also evaluate the model's perplexity on the [pico-paloma-tinsy](https://huggingface.co/datasets/pico-lm/pretokenized-paloma-tinsy) dataset.
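
For reference, perplexity is conventionally the exponential of the mean negative log-likelihood per token. The sketch below shows that standard definition on precomputed token log-probabilities; it is not necessarily how the `pico-lm/perplexity` metric is implemented, and the function name is hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from natural-log probabilities of each observed token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood per token.
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token is as uncertain
    # as a uniform choice among four options, so its perplexity is about 4.
    print(perplexity([math.log(0.25)] * 8))
```

Lower values mean the model assigns higher probability to the held-out text.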

	## 📄 Citation

	```bibtex
	@software{pico2025,
	  author = {Diehl Martinez, Richard},
	  title = {Pico: A Lightweight Framework for Studying Language Model Learning Dynamics},
	  year = {2025},
	  url = {https://github.com/pico-lm}
	}
	```