tingyuansen
committed on
Update README.md
README.md
CHANGED
@@ -31,7 +31,7 @@ AstroLLaMA-3-8B is a specialized base language model for astronomy, developed by
  - No gradient accumulation
  - BF16 format
  - Cosine decay schedule for learning rate reduction
- - Training duration: 1 epoch
+ - Training duration: 1 epoch
  - **Primary Use**: Next token prediction for astronomy-related text generation and analysis
  - **Reference**: Pan et al. 2024 [Link to be added]

@@ -78,8 +78,8 @@ Here's a performance comparison chart based upon the astronomical benchmarking Q
  | Model | Score (%) |
  |-------|-----------|
  | LLaMA-3.1-8B | 73.7 |
- | **<span style="color:red">AstroLLaMA-3-8B-Base_AIC (AstroMLab)</span>** | **<span style="color:red">71.9</span>** |
  | LLaMA-3-8B | 72.0 |
+ | **<span style="color:red">AstroLLaMA-3-8B-Base_AIC (AstroMLab)</span>** | **<span style="color:red">71.9</span>** |
  | Gemma-2-9B | 71.5 |
  | Qwen-2.5-7B | 70.4 |
  | Yi-1.5-9B | 68.4 |