itpossible committed
Commit f8913cb · verified · 1 Parent(s): 3133908

Update README.md

Files changed (1)
  1. README.md +4 -12
README.md CHANGED
@@ -4,8 +4,9 @@
  </h1>
  </div>
 
+
  ## 🎉 News
- - [2024-12-31] **Article [JiuZhou: Open Foundation Language Models and Effective Pre-training Framework for Geoscience](https://www.tandfonline.com/doi/full/10.1080/17538947.2025.2449708) has been accepted for publication in the *International Journal of Digital Earth*. [Code and Data](https://github.com/THU-ESIS/JiuZhou).**
+ - [2024-12-31] **Article [JiuZhou: Open Foundation Language Models and Effective Pre-training Framework for Geoscience](https://www.tandfonline.com/doi/full/10.1080/17538947.2025.2449708) has been accepted for publication in the *International Journal of Digital Earth***. [Code and Data](https://github.com/THU-ESIS/JiuZhou).
  - [2024-10-11] WeChat article: [PreparedLLM: Effective Pre-pretraining Framework for Domain-specific Large Language Models](https://mp.weixin.qq.com/s/ugJQ9tbp6Y87xA3TOWteqw).
  - [2024-09-06] Released [ClimateChat](https://huggingface.co/itpossible/ClimateChat) instruct model.
  - [2024-08-31] **Article [PreparedLLM: Effective Pre-pretraining Framework for Domain-specific Large Language Models](https://www.tandfonline.com/doi/full/10.1080/20964471.2024.2396159) has been accepted for publication in the *Big Earth Data* journal**.
@@ -79,26 +80,21 @@ JiuZhou outperforms GPT-3.5 in objective tasks:
  <img src="image/objective_score.png" width="800"/>
  <br>
  </p>
-
- JiuZhou also scores higher than ClimateChat across six criteria in subjective tasks:
+ JiuZhou also scores higher than GPT-3.5 across six criteria in subjective tasks:
  <p align="center">
  <br>
  <img src="image/subjective_score.png" width="800"/>
  <br>
  </p>
-
  ### General Ability
-
- We evaluate the performance of Chinese-Mistral-7B using three benchmark datasets: C-Eval, CMMLU, and MMLU.<br>
+ We evaluate the performance of JiuZhou using three benchmark datasets: C-Eval, CMMLU, and MMLU.<br>
  Compared to other variants of Llama and Mistral models, JiuZhou shows outstanding performance:
  <p align="center">
  <br>
  <img src="image/general_score.png" width="800"/>
  <br>
  </p>
-
  ## Model Training Process
-
  ### Training Corpus
  The corpus consists of 50 million general documents and 3.4 million geoscience-related documents.
  <p align="center">
@@ -106,7 +102,6 @@ The corpus consists of 50 million general documents and 3.4 million geoscience-r
  <img src="image/JiuZhou-Corpus.png" width="800"/>
  <br>
  </p>
-
  ### Training Framework
  We use the JiuZhou-Framework proposed in this study.
  <p align="center">
@@ -114,7 +109,6 @@ We use the JiuZhou-Framework proposed in this study.
  <img src="image/JiuZhou-Framework.png" width="800"/>
  <br>
  </p>
-
  ### Two-stage Pre-adaptation Pre-training (TSPT)
  TSPT improves the efficiency of using limited geoscience data and overcomes some of the technical bottlenecks in continual pretraining for LLMs.<br>
  The difference between TSPT and single-stage training algorithms:
@@ -129,8 +123,6 @@ Comparison of TSPT and one-stage pre-training algorithm performance:
  <img src="image/TSPT_score.png" width="800"/>
  <br>
  </p>
-
-
  ## Model Training Code
  We use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to fine-tune JiuZhou.
 
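For readers who want to reproduce the fine-tuning step referenced in the last hunk, the sketch below shows one way a LLaMA-Factory LoRA SFT run can be launched. It is a minimal illustration, not the configuration used for JiuZhou: the base-model path, dataset key, chat template, and hyperparameters are placeholders, and the key names follow LLaMA-Factory's published example configs.

```python
# sketch_finetune.py - illustrative only; assumes LLaMA-Factory is installed
# (pip install llamafactory) and that the dataset key below is registered in
# LLaMA-Factory's data/dataset_info.json.
import subprocess
import yaml  # pyyaml

# Hypothetical LoRA SFT configuration; model and dataset names are placeholders.
config = {
    "model_name_or_path": "itpossible/JiuZhou-base",   # placeholder base model
    "stage": "sft",                                    # supervised fine-tuning
    "do_train": True,
    "finetuning_type": "lora",                         # parameter-efficient tuning
    "lora_target": "all",                              # attach LoRA to all linear layers
    "dataset": "your_geoscience_sft_dataset",          # placeholder dataset key
    "template": "mistral",                             # chat template for Mistral-based models
    "cutoff_len": 2048,
    "output_dir": "saves/jiuzhou-lora-sft",
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
    "learning_rate": 2.0e-5,
    "num_train_epochs": 3,
    "bf16": True,
}

# Write the config and launch training through the LLaMA-Factory CLI.
with open("jiuzhou_lora_sft.yaml", "w") as f:
    yaml.safe_dump(config, f)

subprocess.run(["llamafactory-cli", "train", "jiuzhou_lora_sft.yaml"], check=True)
```

The dataset key must be registered in LLaMA-Factory's `data/dataset_info.json` before `llamafactory-cli train` will accept it; see the LLaMA-Factory documentation for the full list of training arguments.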