AstroMLab/astrollama-2-7b-chat_aic


tingyuansen committed on Sep 29, 2024

Commit f6a45d1 (verified) · 1 Parent(s): ee8578f

Update README.md

Files changed (1)

README.md +19 -2

@@ -65,9 +65,26 @@ print(response)
 ## Model Limitations and Biases
-This model is specifically trained on astronomy literature and conversation data, and may not generalize well to other domains. Users should be aware of potential biases in the training data, which may reflect historical trends and biases in astronomical research publications and the datasets used for fine-tuning.
-Importantly, this model has been superseded by more advanced versions. For state-of-the-art performance, we recommend using the latest models from AstroMLab.
 ## Ethical Considerations

 ## Model Limitations and Biases
+This model is specifically trained on astronomy literature (abstracts, introductions, and conclusions) and may not generalize well to other domains. Users should be aware of potential biases in the training data, which may reflect historical trends and biases in astronomical research publications. Additionally, the regex-based extraction method used for processing the LaTeX source files may introduce some biases or inconsistencies in the training data.
+Importantly, this model has been superseded by more advanced versions. The table below compares performance on the astronomy Q&A benchmark described in [Ting et al. 2024](https://arxiv.org/abs/2407.11194) and Pan et al. 2024.
+| Model | Score (%) |
+|-------|-----------|
+| **AstroLLaMA-3.1-8B-Plus (AstroMLab)** | **77.2** |
+| LLaMA-3.1-8B | 73.7 |
+| **AstroLLaMA-2-70B (AstroMLab)** | **72.3** |
+| Gemma-2-9B | 71.5 |
+| Qwen-2.5-7B | 70.4 |
+| Yi-1.5-9B | 68.4 |
+| InternLM-2.5-7B | 64.0 |
+| Mistral-7B-v0.3 | 63.9 |
+| ChatGLM3-6B | 50.4 |
+| <span style="color:red">AstroLLaMA-2-7B-AIC</span> | <span style="color:red">44.3</span> |
+| AstroLLaMA-2-7B-Abstract | 43.5 |
+As shown, the AstroLLaMA-2-7B series is outperformed by newer models. For state-of-the-art performance, we recommend using the latest models.
 ## Ethical Considerations