Chrisneverdie/OnlySportsLM_196M

Text Generation


Chrisneverdie committed on Sep 4, 2024

Commit a50bd8a · verified · 1 Parent(s): b684761

Update README.md

Files changed (1)

README.md +3 -3

README.md CHANGED

@@ -12,7 +12,7 @@ tags:
 ## Model Overview
-OnlySportsLM is a 196 million parameter language model specifically designed and trained for sports-related natural language processing tasks. It is part of the larger OnlySports collection, which aims to advance domain-specific language modeling in the sports domain.
+OnlySportsLM is a 196M language model specifically designed and trained for sports-related natural language processing tasks. It is part of the larger OnlySports collection, which aims to advance domain-specific language modeling in sports.
 ## Model Architecture
@@ -22,9 +22,9 @@ OnlySportsLM is a 196 million parameter language model specifically designed and
 ## Training
-- Dataset: OnlySports Dataset (subset of 315B tokens out of 600B total)
+- Dataset: [OnlySports Dataset](https://huggingface.co/datasets/Chrisneverdie/OnlySports_Dataset) (subset of 315B tokens out of 600B total)
 - Training setup: 8 H100 GPUs
-- Optimizer: AdamW with weight decay of 0.1
+- Optimizer: AdamW
 - Learning rate: Initially 6e-4, adjusted to 1e-4 due to observed loss spikes
 - Context length: 1024 tokens
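
The figures in the Training list can be cross-checked with some quick arithmetic. The sketch below is only an illustration: the helper `training_budget` and its key names are hypothetical, and it assumes a single pass over the 315B-token subset with every sequence filled to the full 1024-token context, which the README does not state.

```python
# Figures quoted in the README's Training section.
TOTAL_DATASET_TOKENS = 600_000_000_000  # full OnlySports Dataset (600B)
TRAINED_TOKENS = 315_000_000_000        # subset used for training (315B)
CONTEXT_LENGTH = 1024                   # tokens per sequence
PARAMS = 196_000_000                    # model size (196M)


def training_budget(trained, total, context, params):
    """Hypothetical helper: derive rough budget numbers from the README figures.

    Assumes one pass over the data and fully packed sequences (an assumption,
    not something the README confirms).
    """
    return {
        "dataset_fraction": trained / total,   # share of the dataset seen
        "sequences": trained // context,       # whole context windows
        "tokens_per_param": trained / params,  # training tokens per parameter
    }


budget = training_budget(
    TRAINED_TOKENS, TOTAL_DATASET_TOKENS, CONTEXT_LENGTH, PARAMS
)
print(f"dataset fraction: {budget['dataset_fraction']:.1%}")
print(f"full-length sequences: {budget['sequences']:,}")
print(f"tokens per parameter: {budget['tokens_per_param']:.0f}")
```

Under those assumptions, training covers 52.5% of the dataset, roughly 308 million full-length sequences, and about 1,600 tokens per parameter.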