Update README.md
README.md (changed)
@@ -69,6 +69,86 @@ outputs = tokenizer.batch_decode(outputs_id, skip_special_tokens=True)[0]
```python
print(outputs)
```

## Model Performance

### Geoscience Ability
We evaluate the performance of JiuZhou using the GeoBench benchmark.<br>
JiuZhou outperforms GPT-3.5 in objective tasks:
<p align="center">
<br>
<img src="image/objective_score.png" width="800"/>
<br>
</p>
JiuZhou also scores higher than GPT-3.5 across six criteria in subjective tasks:
<p align="center">
<br>
<img src="image/subjective_score.png" width="800"/>
<br>
</p>
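
For concreteness, the sketch below shows one common way to score an objective (multiple-choice) item with a causal LM: rank the answer letters by the probability the model assigns to each. This is not the official GeoBench harness; the checkpoint name and the example question are placeholders.

```python
# Minimal multiple-choice scoring sketch (not the official GeoBench harness).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "itpossible/JiuZhou-Instruct-v0.2"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "Which mineral has a Mohs hardness of 10?\n"
    "A. Quartz\nB. Corundum\nC. Diamond\nD. Topaz\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # logits for the next token

# Compare the logits of the four answer-letter tokens and pick the highest.
options = ["A", "B", "C", "D"]
option_ids = [tokenizer.encode(" " + o, add_special_tokens=False)[-1] for o in options]
prediction = options[int(next_token_logits[option_ids].argmax())]
print(prediction)  # expected: C
```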

### General Ability
We evaluate the performance of JiuZhou using three benchmark datasets: C-Eval, CMMLU, and MMLU.<br>
Compared to other variants of Llama and Mistral models, JiuZhou shows outstanding performance:
<p align="center">
<br>
<img src="image/general_score.png" width="800"/>
<br>
</p>

## Model Training Process

### Training Corpus
The corpus consists of 50 million general documents and 3.4 million geoscience-related documents.
<p align="center">
<br>
<img src="image/JiuZhou-Corpus.png" width="800"/>
<br>
</p>

### Training Framework
We use the JiuZhou-Framework proposed in this study.
<p align="center">
<br>
<img src="image/JiuZhou-Framework.png" width="800"/>
<br>
</p>

### Two-stage Pre-adaptation Pre-training (TSPT)
TSPT improves the efficiency of using limited geoscience data and overcomes some of the technical bottlenecks in continual pretraining for LLMs.<br>
The difference between TSPT and single-stage training algorithms:
<p align="center">
<br>
<img src="image/TSPT.png" width="800"/>
<br>
</p>
Comparison of TSPT and one-stage pre-training algorithm performance:
<p align="center">
<br>
<img src="image/TSPT_score.png" width="800"/>
<br>
</p>
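
To make the staging concrete, here is an illustrative sketch of how a two-stage data schedule could be wired up. This is our reading of the idea, not the released TSPT training code; the mixing ratios and batch construction are assumptions.

```python
# Illustrative two-stage data schedule (ratios are assumptions, not the TSPT paper's):
# stage 1 keeps a general-heavy mixture to preserve general ability, while
# stage 2 shifts to a geoscience-heavy mixture to specialize the model.
import random

def sample_batch(general_docs, domain_docs, domain_ratio, batch_size=8):
    """Draw one training batch with the requested share of domain documents."""
    n_domain = round(batch_size * domain_ratio)
    batch = random.sample(domain_docs, n_domain)
    batch += random.sample(general_docs, batch_size - n_domain)
    random.shuffle(batch)
    return batch

general_docs = [f"general_{i}" for i in range(10_000)]  # stands in for the 50M general documents
domain_docs = [f"geo_{i}" for i in range(1_000)]        # stands in for the 3.4M geoscience documents

# Stage 1: mostly general text with some geoscience mixed in (placeholder ratio).
stage1_batches = [sample_batch(general_docs, domain_docs, domain_ratio=0.2) for _ in range(100)]
# Stage 2: mostly geoscience text (placeholder ratio).
stage2_batches = [sample_batch(general_docs, domain_docs, domain_ratio=0.8) for _ in range(100)]
```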

## Model Training Code
We use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to fine-tune JiuZhou.

### Project Deployment
```bash
git clone https://github.com/THU-ESIS/JiuZhou.git
cd JiuZhou
pip install -e ".[torch,metrics]"
```

### Model Training
Pre-training:
```bash
llamafactory-cli train examples/train_lora/JiuZhou_pretrain_sft.yaml
```
Instruction-tuning:
```bash
llamafactory-cli train examples/train_lora/JiuZhou_lora_sft.yaml
```
Chat with the fine-tuned JiuZhou:
```bash
llamafactory-cli chat examples/inference/JiuZhou_lora_sft.yaml
```
Merge the instruction-tuned LoRA weights with the original JiuZhou weights:
```bash
llamafactory-cli export examples/merge_lora/JiuZhou_lora_sft.yaml
```
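Once exported, the merged checkpoint is a standard Hugging Face model directory and can be loaded directly with transformers, as in the quick-start example above. The directory name below is a placeholder; use the export path configured in examples/merge_lora/JiuZhou_lora_sft.yaml.

```python
# Load the merged checkpoint for inference (directory name is a placeholder).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_dir = "models/JiuZhou-merged"  # placeholder: the export_dir from the merge config
tokenizer = AutoTokenizer.from_pretrained(merged_dir)
model = AutoModelForCausalLM.from_pretrained(
    merged_dir, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("What is the age of the Earth?", return_tensors="pt").to(model.device)
outputs_id = model.generate(**inputs, max_new_tokens=256)
outputs = tokenizer.batch_decode(outputs_id, skip_special_tokens=True)[0]
print(outputs)
```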

## Citations
```bibtex
@article{chen2024preparedllm,