tingyuansen committed
Commit bfa7b8b · verified · 1 Parent(s): 3576e8d

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -31,7 +31,7 @@ AstroLLaMA-3-8B is a specialized base language model for astronomy, developed by
  - No gradient accumulation
  - BF16 format
  - Cosine decay schedule for learning rate reduction
- - Training duration: 1 epoch (approximately 32 A100 GPU hours)
+ - Training duration: 1 epoch
  - **Primary Use**: Next token prediction for astronomy-related text generation and analysis
  - **Reference**: Pan et al. 2024 [Link to be added]

@@ -78,8 +78,8 @@ Here's a performance comparison chart based upon the astronomical benchmarking Q
  | Model | Score (%) |
  |-------|-----------|
  | LLaMA-3.1-8B | 73.7 |
- | **<span style="color:red">AstroLLaMA-3-8B-Base_AIC (AstroMLab)</span>** | **<span style="color:red">71.9</span>** |
  | LLaMA-3-8B | 72.0 |
+ | **<span style="color:red">AstroLLaMA-3-8B-Base_AIC (AstroMLab)</span>** | **<span style="color:red">71.9</span>** |
  | Gemma-2-9B | 71.5 |
  | Qwen-2.5-7B | 70.4 |
  | Yi-1.5-9B | 68.4 |