Chrisneverdie committed on
Commit 885550a · verified · 1 Parent(s): d4850ad

Update README.md

  1. README.md +51 -3
README.md CHANGED
@@ -8,9 +8,57 @@ pipeline_tag: text-generation
  tags:
  - Sports
  ---

- OnlySportsLM is A 196 million parameter RWKV-v6 based sports language model trained on half of the OnlySports Dataset.
- In our OnlySports Benchmark, OnlySportsLM outperforms the preceding SOTA general purpose 135M/360M language model by 37.62%/34.08%.

- Please download the model file and use RWKV-v6-demo.py for inference.
  tags:
  - Sports
  ---
+ # OnlySportsLM

+ ## Model Overview

+ OnlySportsLM is a 196 million parameter language model designed and trained for sports-related natural language processing tasks. It is part of the larger OnlySports collection, which aims to advance domain-specific language modeling for sports.

+ ## Model Architecture
+
+ - Base architecture: RWKV-v6
+ - Parameters: 196 million
+ - Structure: 20 layers, 640 dimensions
+
+ ## Training
+
+ - Dataset: OnlySports Dataset (subset of 315B tokens out of 600B total)
+ - Training setup: 8 H100 GPUs
+ - Optimizer: AdamW with weight decay of 0.1
+ - Learning rate: Initially 6e-4, adjusted to 1e-4 due to observed loss spikes
+ - Context length: 1024 tokens
+
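The learning-rate adjustment in the Training list above can be read as a simple step-dependent schedule. The following is a minimal sketch under stated assumptions, not the authors' training code: the three constants come from the list, while `switch_step` is a hypothetical parameter, since the README does not say at which step the rate was lowered.

```python
# Illustrative sketch only. Constants are taken from the Training list;
# `switch_step` is a hypothetical placeholder not given in the README.

PEAK_LR = 6e-4       # initial learning rate
REDUCED_LR = 1e-4    # learning rate used after loss spikes were observed
WEIGHT_DECAY = 0.1   # AdamW weight decay


def learning_rate(step: int, switch_step: int) -> float:
    """Return the learning rate for a given training step.

    Steps before `switch_step` use the initial rate; later steps use the
    reduced rate.
    """
    if step < 0 or switch_step < 0:
        raise ValueError("step and switch_step must be non-negative")
    return PEAK_LR if step < switch_step else REDUCED_LR


def fraction_trained(tokens_seen: float = 315e9, tokens_total: float = 600e9) -> float:
    """Share of the OnlySports Dataset consumed during training."""
    return tokens_seen / tokens_total
```

With PyTorch, values like these would typically be passed to `torch.optim.AdamW(params, lr=..., weight_decay=WEIGHT_DECAY)`. `fraction_trained()` gives 0.525 for the token counts listed above, which matches the earlier description of training on roughly half of the dataset.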
+ ## Performance
+
+ OnlySportsLM performs strongly on sports-related tasks:
+
+ - Outperforms the previous SOTA general-purpose 135M/360M models by 37.62%/34.08%, respectively, on the OnlySports Benchmark
+ - Competitive with larger models like SmolLM 1.7B and Qwen 1.5B in the sports domain
+ - Demonstrates improved performance on general zero-shot tasks throughout training
+
+ For detailed performance metrics, please refer to our [technical report](https://github.com/chrischenhub/OnlySportsLM).
+
+ ## Usage
+
+ You can use this model to generate sports-related content.
+
+ Download all files in this repo and use RWKV_v6_demo.py for inference.
+
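One way to fetch every file in the repo is the `snapshot_download` function from the `huggingface_hub` package. The sketch below rests on assumptions: `REPO_ID` is a placeholder for this model's actual repository ID, `local_dir` is an arbitrary directory name, and the demo script is assumed to sit at the top level of the repo.

```python
from pathlib import Path

# Hypothetical placeholder: substitute the actual repository ID of this model.
REPO_ID = "<user>/<model-repo>"
DEMO_SCRIPT = "RWKV_v6_demo.py"  # script name as given in the Usage section


def download_repo(repo_id: str, local_dir: str = "onlysportslm") -> Path:
    """Download all files in the repo and return the local directory."""
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


def demo_script_path(local_dir: Path) -> Path:
    """Expected location of the demo script (assumes it is in the repo root)."""
    return Path(local_dir) / DEMO_SCRIPT


# Example usage (requires network access and a real repository ID):
# target = download_repo(REPO_ID)
# print(f"Use {demo_script_path(target)} for inference.")
```

The helper only locates the script. How inference is invoked depends on the contents of `RWKV_v6_demo.py`, which are not described here.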
+ ## Limitations
+
+ - The model is specifically trained on sports-related content and may not perform as well on general topics
+ - Training was stopped at 315B tokens due to resource constraints, potentially limiting its full capabilities
+
+ ## Related Resources
+
+ - [OnlySports Dataset](https://huggingface.co/collections/Chrisneverdie/onlysports-66b3e5cf595eb81220cc27a6)
+ - [Sports Text Classifier](https://huggingface.co/Chrisneverdie/OnlySports_Classifier)
+ - [GitHub Repository](https://github.com/chrischenhub/OnlySportsLM)
+
58
+ ## Citation
59
+
60
+ If you use OnlySportsLM in your research, please cite our paper (citation details to be added upon publication).
61
+
62
+ ## Contact
63
+
64
+ For more information or inquiries about OnlySportsLM, please visit our [GitHub repository](https://github.com/chrischenhub/OnlySportsLM) or email [email protected].