emrgnt-cmplxty committed on
Commit ea27610
1 Parent(s): 9fc8cf5

Update README.md

Files changed (1)
  1. README.md +33 -9
README.md CHANGED
@@ -1,12 +1,36 @@
  ---
- license: llama2
- ---

- Training is currently still underway, but this is the first epoch of a 32k context fine-tuning run of Mistral-7b over the following datasets:
 
- - emrgnt-cmplxty/sciphi-textbooks-are-all-you-need
- - open-phi/rag-textbook-instruct-full
- - open-phi/programming_books_llama
- - open-phi/textbooks
- - Open-Orca/SlimOrca
- - WizardLM/WizardLM_evol_instruct_70k
 
+ To include citations in your README, you can follow the steps below:
+
+ 1. Add a "References" or "Citations" section at the end of your README.
+ 2. List each citation under this section in the format you've provided.
+
  ---
 
 
+ # SciPhi-Mistral-7B-32k Model Card
+
+ **License:** llama2
+
+ SciPhi-Mistral-7B-32k is a Large Language Model (LLM) fine-tuned from Mistral-7B-v0.1. It was fine-tuned for four epochs on more than 1 billion tokens of regular instruction-tuning data and synthetic textbooks, with the objective of improving the model's scientific reasoning abilities.
+
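+ A minimal usage sketch with the Hugging Face `transformers` library is shown below. It is illustrative only: the repo id is a hypothetical placeholder for wherever this model is hosted, and the generation settings are not an officially recommended configuration.
+
+ ```python
+ # Minimal usage sketch (assumptions: hypothetical repo id, illustrative generation settings).
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "SciPhi-Mistral-7B-32k"  # placeholder; use the actual Hub id of this repository
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,  # half precision keeps the 7B model within a single large GPU
+     device_map="auto",
+ )
+
+ prompt = "Explain Rayleigh scattering and why the sky appears blue."
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+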
+ ## Model Architecture
+
+ Base Model: Mistral-7B-v0.1
+
+ **Architecture Features** (see the config sketch after this list):
+ - Transformer-based model
+ - Grouped-Query Attention
+ - Sliding-Window Attention
+ - Byte-fallback BPE tokenizer
+
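+ As a rough sketch of where these features surface in the `transformers` config (an illustration, not part of the original card; this model's released config.json is authoritative, in particular for the 32k context length):
+
+ ```python
+ # Sketch: reading architecture-related fields from the base model's config.
+ # Assumption: we inspect mistralai/Mistral-7B-v0.1; consult the fine-tuned
+ # model's own config.json for its final values (e.g. the 32k position limit).
+ from transformers import AutoConfig
+
+ config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
+
+ # Grouped-Query Attention: fewer key/value heads than query heads.
+ print("attention heads:", config.num_attention_heads)
+ print("key/value heads:", config.num_key_value_heads)
+
+ # Sliding-Window Attention: each token attends only within a local window.
+ print("sliding window:", config.sliding_window)
+
+ # Maximum sequence length the position settings are configured for.
+ print("max positions:", config.max_position_embeddings)
+ ```
+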
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+
+ ## References
+
+ 1. Lian, W., Goodson, B., Wang, G., Pentland, E., Cook, A., Vong, C., & Teknium. (2023). MistralOrca: Mistral-7B Model Instruct-tuned on Filtered OpenOrcaV1 GPT-4 Dataset. *HuggingFace repository*. [Link](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
+ 2. Mukherjee, S., Mitra, A., Jawahar, G., Agarwal, S., Palangi, H., & Awadallah, A. (2023). Orca: Progressive Learning from Complex Explanation Traces of GPT-4. *arXiv preprint arXiv:2306.02707*.
+ 3. Longpre, S., Hou, L., Vu, T., Webson, A., Chung, H. W., Tay, Y., Zhou, D., Le, Q. V., Zoph, B., Wei, J., & Roberts, A. (2023). The Flan Collection: Designing Data and Methods for Effective Instruction Tuning. *arXiv preprint arXiv:2301.13688*.
+ 4. Mistral AI. (2023). Model Card for Mistral-7B-v0.1: a pretrained generative text model with 7 billion parameters, using Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer. *HuggingFace repository*. [Link](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+
+ ## Acknowledgements
+
+ Thank you to the [AI Alignment Lab](https://huggingface.co/Alignment-Lab-AI), [vikp](https://huggingface.co/vikp), [jph00](https://huggingface.co/jph00), and others who contributed to this work.