---
license: cc-by-sa-4.0
datasets:
- Chrisneverdie/OnlySports_Dataset
language:
- en
pipeline_tag: text-generation
tags:
- Sports
---
# OnlySportsLM

## Model Overview

OnlySportsLM is a 196M-parameter language model designed and trained for sports-related natural language processing tasks. It is part of the OnlySports collection, which aims to advance domain-specific language modeling in sports.

## Model Architecture

- Base architecture: RWKV-v6
- Parameters: 196 million
- Structure: 20 layers, 640 dimensions

## Training

- Dataset: [OnlySports Dataset](https://huggingface.co/datasets/Chrisneverdie/OnlySports_Dataset) (subset of 315B tokens out of 600B total)
- Training setup: 8 H100 GPUs
- Optimizer: AdamW
- Learning rate: Initially 6e-4, adjusted to 1e-4 due to observed loss spikes
- Context length: 1024 tokens
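
For a sense of scale, the figures above can be turned into a quick back-of-the-envelope calculation. This is a minimal sketch that uses only the numbers listed in this section; the variable names are illustrative, and the sequence count assumes every training sequence is packed to the full context length, which is an assumption rather than something stated here.

```python
# Figures taken from the Training section of this model card.
TOKENS_TRAINED = 315_000_000_000   # tokens actually used for training
TOKENS_TOTAL = 600_000_000_000     # tokens in the full OnlySports Dataset
CONTEXT_LENGTH = 1024              # tokens per training sequence

# Share of the full dataset seen during training.
fraction_seen = TOKENS_TRAINED / TOKENS_TOTAL

# Number of full-length sequences, assuming every sequence is packed
# to exactly CONTEXT_LENGTH tokens (an assumption, not stated above).
num_sequences = TOKENS_TRAINED // CONTEXT_LENGTH

print(f"Dataset fraction seen: {fraction_seen:.1%}")
print(f"Full-length sequences: {num_sequences:,}")
```

Under these assumptions, training covered a little over half of the dataset, or roughly 308 million sequences of 1024 tokens.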

## Performance

OnlySportsLM performs strongly on sports-related tasks for its size:

- Outperforms previous SOTA 135M/360M models by 37.62%/34.08% on the OnlySports Benchmark
- Competitive with larger models like SmolLM 1.7B and Qwen 1.5B in the sports domain

![image/png](https://cdn-uploads.huggingface.co/production/uploads/656590bd40440ddcc051ade7/3_mPSjpzIngX-__cjlAqu.png)

<!-- For detailed performance metrics, please refer to our [technical report](https://github.com/chrischenhub/OnlySportsLM). -->

## Usage

You can use this model to generate sports-related text.

Download all files in this repo, then open `RWKV_v6_demo.py` to run inference.
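
One way to fetch every file in the repo is the `huggingface_hub` library's `snapshot_download` function. The sketch below is a hedged example, not the authors' official workflow: `REPO_ID` is a placeholder that you must replace with this model's actual Hub id, and the `download_repo` helper and `local_dir` default are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Placeholder (assumption): replace with this model's actual Hub repo id.
REPO_ID = "your-namespace/your-model-repo"
# Demo script named in this README.
DEMO_SCRIPT = "RWKV_v6_demo.py"


def download_repo(repo_id: str = REPO_ID, local_dir: str = "OnlySportsLM") -> Path:
    """Download every file in the model repo and return the demo script path."""
    # Imported lazily so this module loads even if huggingface_hub
    # is not installed yet (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the whole repo and returns the local folder.
    folder = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(folder) / DEMO_SCRIPT
```

Calling `download_repo()` with the correct repo id returns the local path to `RWKV_v6_demo.py`, which you can then open and run for inference.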

## Limitations

- The model is specifically trained on sports-related content and may not perform as well on general topics
- Training was stopped at 315B tokens due to resource constraints, which may limit the model's capabilities

## Related Resources

- [OnlySports Collection](https://huggingface.co/collections/Chrisneverdie/onlysports-66b3e5cf595eb81220cc27a6)
- [Sports Text Classifier](https://huggingface.co/Chrisneverdie/OnlySports_Classifier)
- [GitHub Repository](https://github.com/chrischenhub/OnlySportsLM)

## Citation

If you use OnlySportsLM in your research, please cite our [paper](https://arxiv.org/abs/2409.00286).

## Contact

For more information or inquiries about OnlySportsLM, please visit our [GitHub repository](https://github.com/chrischenhub/OnlySportsLM).