Update README.md
README.md
CHANGED
@@ -8,9 +8,57 @@ pipeline_tag: text-generation
|
|
8 |
tags:
|
9 |
- Sports
|
10 |
---
|
|
|
11 |
|
12 |
-
|
13 |
-
In our OnlySports Benchmark, OnlySportsLM outperforms the preceding SOTA general purpose 135M/360M language model by 37.62%/34.08%.
|
14 |
|
15 |
-
|
16 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
8 |
tags:
|
9 |
- Sports
|
10 |
---
|
11 |
+
# OnlySportsLM
|
12 |
|
13 |
+
## Model Overview
|
|
|
14 |
|
15 |
+
OnlySportsLM is a 196-million-parameter language model designed and trained for sports-related natural language processing tasks. It is part of the larger OnlySports collection, which aims to advance domain-specific language modeling for sports.
|
16 |
|
17 |
+
## Model Architecture
|
18 |
+
|
19 |
+
- Base architecture: RWKV-v6
|
20 |
+
- Parameters: 196 million
|
21 |
+
- Structure: 20 layers, hidden (embedding) dimension of 640
|
22 |
+
|
23 |
+
## Training
|
24 |
+
|
25 |
+
- Dataset: OnlySports Dataset (trained on 315B of its 600B total tokens)
|
26 |
+
- Training setup: 8 H100 GPUs
|
27 |
+
- Optimizer: AdamW with weight decay of 0.1
|
28 |
+
- Learning rate: initially 6e-4, lowered to 1e-4 after loss spikes were observed
|
29 |
+
- Context length: 1024 tokens
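
To put the numbers above in proportion, here is a minimal Python sketch that derives two ratios from them: the fraction of the dataset seen and the number of training tokens per parameter. The class and field names are illustrative assumptions, not identifiers from the OnlySportsLM training code; only the numeric values come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSummary:
    # Values copied from the Model Architecture and Training lists in this README.
    # The field names are hypothetical and do not come from the training code.
    parameters: int = 196_000_000
    tokens_seen: int = 315_000_000_000
    tokens_available: int = 600_000_000_000
    context_length: int = 1024
    initial_lr: float = 6e-4
    adjusted_lr: float = 1e-4
    weight_decay: float = 0.1

    def dataset_fraction(self) -> float:
        """Share of the full dataset the model was trained on."""
        return self.tokens_seen / self.tokens_available

    def tokens_per_parameter(self) -> float:
        """Training tokens seen per model parameter."""
        return self.tokens_seen / self.parameters


summary = TrainingSummary()
print(f"dataset fraction: {summary.dataset_fraction():.1%}")
print(f"tokens per parameter: {summary.tokens_per_parameter():.0f}")
```

Under these figures the model saw a little over half of the dataset, at roughly 1,600 tokens per parameter.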
|
30 |
+
|
31 |
+
## Performance
|
32 |
+
|
33 |
+
OnlySportsLM performs strongly on sports-related tasks:
|
34 |
+
|
35 |
+
- Outperforms previous SOTA 135M/360M models by 37.62%/34.08% on the OnlySports Benchmark
|
36 |
+
- Competitive with larger models like SmolLM 1.7B and Qwen 1.5B in the sports domain
|
37 |
+
- Performance on general zero-shot tasks also improves over the course of training
|
38 |
+
|
39 |
+
For detailed performance metrics, please refer to our [technical report](https://github.com/chrischenhub/OnlySportsLM).
|
40 |
+
|
41 |
+
## Usage
|
42 |
+
|
43 |
+
You can use this model to generate sports-related content.
|
44 |
+
|
45 |
+
Download all files in this repo, then open `RWKV_v6_demo.py` to run inference.
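
One possible way to fetch the files is with `snapshot_download` from the `huggingface_hub` package. The sketch below assumes that package is installed. The helper names `download_repo` and `demo_script` and the default folder name are hypothetical, and the repo id is left for you to supply because this README does not state it.

```python
from pathlib import Path


def download_repo(repo_id: str, local_dir: str = "OnlySportsLM") -> Path:
    """Download every file of a Hub repo into local_dir and return that directory."""
    # Imported lazily so demo_script() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(local_dir)


def demo_script(local_dir: str = "OnlySportsLM") -> Path:
    """Path where the demo script named in this README is expected after download."""
    return Path(local_dir) / "RWKV_v6_demo.py"
```

After calling `download_repo("<this-repo-id>")`, the script at `demo_script()` can be opened and run with your Python interpreter. Any arguments or settings it expects are defined in the script itself.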
|
46 |
+
|
47 |
+
## Limitations
|
48 |
+
|
49 |
+
- The model is specifically trained on sports-related content and may not perform as well on general topics
|
50 |
+
- Training was stopped at 315B tokens due to resource constraints, which may limit the model's capabilities
|
51 |
+
|
52 |
+
## Related Resources
|
53 |
+
|
54 |
+
- [OnlySports Dataset](https://huggingface.co/collections/Chrisneverdie/onlysports-66b3e5cf595eb81220cc27a6)
|
55 |
+
- [Sports Text Classifier](https://huggingface.co/Chrisneverdie/OnlySports_Classifier)
|
56 |
+
- [GitHub Repository](https://github.com/chrischenhub/OnlySportsLM)
|
57 |
+
|
58 |
+
## Citation
|
59 |
+
|
60 |
+
If you use OnlySportsLM in your research, please cite our paper (citation details to be added upon publication).
|
61 |
+
|
62 |
+
## Contact
|
63 |
+
|
64 |
+
For more information or inquiries about OnlySportsLM, please visit our [GitHub repository](https://github.com/chrischenhub/OnlySportsLM) or email [email protected].
|