Update README.md
</h1>
</div>

## 🎉 News

- [2024-12-31] **Article [JiuZhou: Open Foundation Language Models and Effective Pre-training Framework for Geoscience](https://www.tandfonline.com/doi/full/10.1080/17538947.2025.2449708) has been accepted for publication in the *International Journal of Digital Earth***. [Code and Data](https://github.com/THU-ESIS/JiuZhou).
- [2024-10-11] WeChat article: [PreparedLLM: Effective Pre-pretraining Framework for Domain-specific Large Language Models](https://mp.weixin.qq.com/s/ugJQ9tbp6Y87xA3TOWteqw).
- [2024-09-06] Released the [ClimateChat](https://huggingface.co/itpossible/ClimateChat) instruct model.
- [2024-08-31] **Article [PreparedLLM: Effective Pre-pretraining Framework for Domain-specific Large Language Models](https://www.tandfonline.com/doi/full/10.1080/20964471.2024.2396159) has been accepted for publication in the *Big Earth Data* journal**.
JiuZhou outperforms GPT-3.5 in objective tasks:
<p align="center">
<br>
<img src="image/objective_score.png" width="800"/>
<br>
</p>

JiuZhou also scores higher than ClimateChat across six criteria in subjective tasks:
<p align="center">
<br>
<img src="image/subjective_score.png" width="800"/>
<br>
</p>
### General Ability

We evaluate the performance of JiuZhou using three benchmark datasets: C-Eval, CMMLU, and MMLU.<br>
Compared to other variants of Llama and Mistral models, JiuZhou shows outstanding performance:
<p align="center">
<br>
<img src="image/general_score.png" width="800"/>
<br>
</p>
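As a rough illustration of how these three benchmarks can be run against a Hugging Face checkpoint, the sketch below uses EleutherAI's lm-evaluation-harness, which ships C-Eval, CMMLU, and MMLU tasks. The harness, the placeholder model ID, and the few-shot setting are assumptions for illustration, not necessarily the evaluation setup behind the scores reported above.

```python
# Illustrative sketch: evaluating a Hugging Face checkpoint on C-Eval, CMMLU, and MMLU
# with EleutherAI's lm-evaluation-harness (pip install lm-eval).
# The model ID and few-shot setting are placeholders, not the paper's exact setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=itpossible/ClimateChat,dtype=bfloat16",  # placeholder checkpoint
    tasks=["ceval-valid", "cmmlu", "mmlu"],  # harness task names for the three benchmarks
    num_fewshot=5,                           # a common few-shot setting; adjust as needed
    batch_size=8,
)

# Print the aggregate metrics the harness reports for each task group.
for task, metrics in results["results"].items():
    print(task, metrics)
```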
## Model Training Process

### Training Corpus
The corpus consists of 50 million general documents and 3.4 million geoscience-related documents.
<p align="center">
<br>
<img src="image/JiuZhou-Corpus.png" width="800"/>
<br>
</p>
### Training Framework
We use the JiuZhou-Framework proposed in this study.
<p align="center">
<br>
<img src="image/JiuZhou-Framework.png" width="800"/>
<br>
</p>
### Two-stage Pre-adaptation Pre-training (TSPT)
TSPT improves the efficiency of using limited geoscience data and overcomes some of the technical bottlenecks in continual pretraining for LLMs.<br>
The difference between TSPT and single-stage training algorithms:

Comparison of TSPT and one-stage pre-training algorithm performance:
<p align="center">
<br>
<img src="image/TSPT_score.png" width="800"/>
<br>
</p>
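The exact TSPT recipe is described in the JiuZhou and PreparedLLM papers. Purely as a generic sketch of staged domain-adaptive data mixing (not the authors' algorithm), the snippet below builds two pretraining mixtures with Hugging Face `datasets`: a first stage dominated by general text and a second stage that up-weights the limited geoscience corpus. File paths and mixing ratios are illustrative placeholders.

```python
# Generic two-stage data-mixing sketch (NOT the exact TSPT recipe from the paper).
# Corpus paths and sampling probabilities are placeholders for illustration.
from datasets import load_dataset, interleave_datasets

general = load_dataset("json", data_files="data/general_corpus.jsonl", split="train")
geoscience = load_dataset("json", data_files="data/geoscience_corpus.jsonl", split="train")

# Stage 1: mostly general text with a small geoscience share, easing the model into the domain.
stage1 = interleave_datasets(
    [general, geoscience],
    probabilities=[0.9, 0.1],
    seed=42,
    stopping_strategy="all_exhausted",
)

# Stage 2: up-weight the limited geoscience data for focused domain adaptation.
stage2 = interleave_datasets(
    [general, geoscience],
    probabilities=[0.3, 0.7],
    seed=42,
    stopping_strategy="all_exhausted",
)

# Each stage then feeds its own continual-pretraining run on the base model.
print(len(stage1), len(stage2))
```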
## Model Training Code
We use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to fine-tune JiuZhou.
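As a concrete starting point, the sketch below writes a minimal LoRA SFT configuration and launches it through LLaMA-Factory's `llamafactory-cli`. The base-model ID, dataset name, and hyperparameters are placeholders drawn from LLaMA-Factory's public LoRA examples, not the settings used to train JiuZhou.

```python
# Illustrative LoRA SFT launch via LLaMA-Factory (NOT the JiuZhou training settings).
# The model ID, dataset name, and hyperparameters below are placeholders.
import subprocess
import yaml  # pip install pyyaml

config = {
    "model_name_or_path": "itpossible/JiuZhou-base",  # placeholder: point this at your base model
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    "lora_target": "all",
    "dataset": "alpaca_en_demo",   # placeholder dataset registered in dataset_info.json
    "template": "default",
    "cutoff_len": 2048,
    "output_dir": "saves/jiuzhou-lora-sft",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "learning_rate": 1.0e-4,
    "num_train_epochs": 3.0,
    "lr_scheduler_type": "cosine",
    "bf16": True,
}

# Write the config and hand it to LLaMA-Factory's training CLI.
with open("jiuzhou_lora_sft.yaml", "w") as f:
    yaml.safe_dump(config, f)

subprocess.run(["llamafactory-cli", "train", "jiuzhou_lora_sft.yaml"], check=True)
```

Custom instruction data can be used by registering it in LLaMA-Factory's `data/dataset_info.json` and referencing its name under `dataset`.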