tingyuansen committed on
Commit f6a45d1 · verified · 1 Parent(s): ee8578f

Update README.md

Files changed (1)
  1. README.md +19 -2
README.md CHANGED
@@ -65,9 +65,26 @@ print(response)
65
 
66
  ## Model Limitations and Biases
67
 
68
- This model is specifically trained on astronomy literature and conversation data, and may not generalize well to other domains. Users should be aware of potential biases in the training data, which may reflect historical trends and biases in astronomical research publications and the datasets used for fine-tuning.
69
 
70
- Importantly, this model has been superseded by more advanced versions. For state-of-the-art performance, we recommend using the latest models from AstroMLab.
71
 
72
  ## Ethical Considerations
73
 
 
65
 
66
  ## Model Limitations and Biases
67
 
68
+ This model is trained specifically on astronomy literature (abstracts, introductions, and conclusions) and may not generalize well to other domains. Users should be aware of potential biases in the training data, which may reflect historical trends and biases in astronomical research publications. Additionally, the regex-based extraction used to process the LaTeX source files may introduce biases or inconsistencies in the training data.
69
+
70
+ Importantly, this model has been superseded by more advanced versions. The table below compares performance on the astronomy benchmarking Q&A described in [Ting et al. 2024](https://arxiv.org/abs/2407.11194) and Pan et al. 2024.
71
+
72
+ | Model | Score (%) |
+ |-------|-----------|
+ | **AstroLLaMA-3.1-8B-Plus (AstroMLab)** | **77.2** |
+ | LLaMA-3.1-8B | 73.7 |
+ | **AstroLLaMA-2-70B (AstroMLab)** | **72.3** |
+ | Gemma-2-9B | 71.5 |
+ | Qwen-2.5-7B | 70.4 |
+ | Yi-1.5-9B | 68.4 |
+ | InternLM-2.5-7B | 64.0 |
+ | Mistral-7B-v0.3 | 63.9 |
+ | ChatGLM3-6B | 50.4 |
+ | <span style="color:red">AstroLLaMA-2-7B-AIC</span> | <span style="color:red">44.3</span> |
+ | AstroLLaMA-2-7B-Abstract | 43.5 |
85
+
86
+ As shown, the AstroLLaMA-2-7B series is outperformed by newer models. For state-of-the-art performance, we recommend using the latest models.
87
 
 
88
 
89
  ## Ethical Considerations
90