# AstroMLab

AstroMLab is a diverse group of researchers dedicated to advancing the application of Large Language Models (LLMs) in astronomy. Our team includes:
- Leading astronomers, astrophysicists, and cosmologists
- Natural language processing experts
- Frontier arXivists from the NASA Astrophysics Data System

## Objectives
- Develop specialized LLMs for astronomy
- Create open-source models for advanced research
- Facilitate LLM-driven end-to-end agentic research in astronomy

## Current Work

Our ongoing projects include:

- Curation of an astronomy-based benchmarking dataset
- Development of specialized astronomy LLMs
- Performance evaluation of models on astronomical tasks

## Models and Performance

We have developed several models, including AstroSage-LLaMA-3.1-8B ([de Haan et al. 2024](https://arxiv.org/abs/2411.09012)), AstroLLaMA-2-70B ([Pan et al. 2024](https://arxiv.org/abs/2409.19750)), and AstroLLaMA-3-8B ([Pan et al. 2024](https://arxiv.org/abs/2409.19750)). Our AstroSage-LLaMA-3.1-8B model has demonstrated strong performance on an astronomy question-answering benchmark ([Ting et al. 2024](https://arxiv.org/abs/2407.11194)):

| Model | Score (%) |
|-------|-----------|
| **AstroSage-LLaMA-3.1-8B (AstroMLab)** | **80.9** |
| LLaMA-3.1-8B | 73.7 |
| Phi-3.5-4B | 72.8 |
| Gemma-2-9B | 71.5 |
| LLaMA-2-70B | 70.7 |
| Qwen-2.5-7B | 70.4 |
| Yi-1.5-9B | 68.4 |
| InternLM-2.5-7B | 64.5 |
| Mistral-7B-v0.3 | 63.9 |
| ChatGLM3-6B | 50.4 |
| AstroLLaMA-2-7B (UniverseTBD) | 44.3 |

AstroSage-LLaMA-3.1-8B ([de Haan et al. 2024](https://arxiv.org/abs/2411.09012)), our lightweight model, currently achieves the highest astronomy knowledge-recall score among models in the ~8B-parameter class.

![Cost and performance trade-off in astronomical Q&A](https://cdn-uploads.huggingface.co/production/uploads/643f1ddce2ea47d170103537/ip0Bk-LZRrCArimets4H7.png)

## Support and Resources

Our research benefits from:
- Access to the Frontier supercomputer at the Oak Ridge Leadership Computing Facility
- Support from Microsoft's Accelerating Foundation Models Research (AFMR) program

## Contact

For inquiries or collaboration opportunities, please contact: [email protected]