tingyuansen committed (verified)
Commit 7786a5f · Parent: 7810a9c

Update README.md

Files changed (1): README.md +3 -2
README.md CHANGED
@@ -33,7 +33,7 @@ AstroLLaMA-3-8B is a specialized base language model for astronomy, developed by
  - Cosine decay schedule for learning rate reduction
  - Training duration: 1 epoch
  - **Primary Use**: Next token prediction for astronomy-related text generation and analysis
- - **Reference**: Pan et al. 2024 [Link to be added]
+ - **Reference**: [Pan et al. 2024](https://arxiv.org/abs/2409.19750)

  ## Generating text from a prompt

@@ -73,10 +73,11 @@ print(generated_text[0]['generated_text'])

  A key limitation identified during the development of this model is that training solely on astro-ph data may not be sufficient to significantly improve performance over the base model, especially for the already highly performant LLaMA-3 series. This suggests that to achieve substantial gains, future iterations may need to incorporate a broader range of high-quality astronomical data beyond arXiv, such as textbooks, Wikipedia, and curated summaries.

- Here's a performance comparison chart based upon the astronomical benchmarking Q&A as described in [Ting et al. 2024](https://arxiv.org/abs/2407.11194), and Pan et al. 2024:
+ Here's a performance comparison chart based upon the astronomical benchmarking Q&A as described in [Ting et al. 2024](https://arxiv.org/abs/2407.11194):

  | Model | Score (%) |
  |-------|-----------|
+ | **AstroSage-LLaMA-3.1-8B (AstroMLab)** | **80.9** |
  | LLaMA-3.1-8B | 73.7 |
  | LLaMA-3-8B | 72.9 |
  | **<span style="color:green">AstroLLaMA-3-8B-Base_AIC (AstroMLab)</span>** | **<span style="color:green">71.9</span>** |
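
The second hunk header above quotes `print(generated_text[0]['generated_text'])` from the README's "Generating text from a prompt" section. As context for the change, here is a minimal sketch of how that pipeline-based generation might look; the model ID, prompt, and sampling parameters are illustrative assumptions, not the README's exact code.

```python
# Minimal, illustrative sketch (not the README's exact snippet) of the
# "Generating text from a prompt" usage referenced in the second hunk header,
# using the Hugging Face transformers text-generation pipeline.
from transformers import pipeline

# Placeholder repository ID -- substitute the actual Hugging Face model ID.
generator = pipeline("text-generation", model="AstroMLab/AstroLLaMA-3-8B-Base_AIC")

prompt = "The Sloan Digital Sky Survey has measured"
generated_text = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)

# The pipeline returns a list of dicts; the hunk header suggests the README
# accesses the generated string the same way.
print(generated_text[0]['generated_text'])
```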