rubenroy committed on
Commit fba8acf · verified · 1 Parent(s): 08be82d

Update README.md

Files changed (1):
1. README.md +113 -8

README.md CHANGED
@@ -1,22 +1,127 @@
- base_model: unsloth/mistral-nemo-instruct-2407-bnb-4bit
- - mistral
- # Uploaded model
- - **Developed by:** rubenroy
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/mistral-nemo-instruct-2407-bnb-4bit
- This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
---
base_model: mistralai/Mistral-Nemo-Instruct-2407
tags:
- text-generation-inference
- transformers
- unsloth
- trl
- gammacorpus
- geneva
- chat
- mistral
- conversational
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
- ru
- zh
- ja
datasets:
- rubenroy/GammaCorpus-v2-1m
pipeline_tag: text-generation
library_name: transformers
---

![Geneva Banner](https://cdn.ruben-roy.com/AI/Geneva/img/banner-12B-1m.png)

# Geneva 12B GammaCorpus v2-1m
*A Mistral NeMo model fine-tuned on the GammaCorpus dataset*

## Overview
Geneva 12B GammaCorpus v2-1m is a fine-tune of Mistral's **Mistral NeMo Instruct 2407** model. Geneva is designed to outperform other models of a similar size while also showcasing the [GammaCorpus v2-1m](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-1m) dataset.

## Model Details
- **Base Model:** [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
- **Parameters:** 12B
- **Layers:** 40
- **Dim:** 5,120
- **Head dim:** 128
- **Hidden dim:** 14,336
- **Activation function:** SwiGLU
- **Number of heads:** 32
- **Number of KV heads:** 8 (GQA)
- **Vocabulary size:** 2^17 ≈ 128k
- **Rotary embeddings:** theta = 1M
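
The head counts above imply a few quantities worth spelling out. As an illustrative sketch (the variable names and the bf16 assumption are mine, not from the model card), the grouped-query attention sharing ratio and a rough per-token KV-cache footprint follow directly from these hyperparameters:

```python
# Back-of-the-envelope arithmetic from the hyperparameters above.
# Assumes a bf16 KV cache (2 bytes per element); illustrative only.

n_layers = 40
head_dim = 128
n_heads = 32
n_kv_heads = 8
bytes_per_elem = 2  # bf16

# With GQA, groups of query heads share one KV head.
queries_per_kv_head = n_heads // n_kv_heads
print(queries_per_kv_head)  # 4

# Per-token KV cache: keys + values, across all layers and KV heads.
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
print(kv_bytes_per_token // 1024, "KiB per token")  # 160 KiB per token
```

Sharing each KV head across 4 query heads is what keeps the cache at 160 KiB per token; full multi-head attention with 32 KV heads would need 4x as much.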

## Training Details

Geneva-12B-GCv2-1m was fine-tuned with the [Unsloth](https://unsloth.ai/) framework on a single A100 GPU for roughly 40 minutes, training for **60 epochs**.

## Usage

### Requirements

Install the development version of Transformers:

```bash
pip install git+https://github.com/huggingface/transformers.git
```

### Quickstart

If you want to use Hugging Face `transformers` to generate text, you can do something like this:

```python
from transformers import pipeline

prompt = "How tall is the Eiffel tower?"

messages = [
    {"role": "system", "content": "You are a helpful assistant named Geneva, built on the Mistral NeMo model developed by Mistral AI, and fine-tuned by Ruben Roy."},
    {"role": "user", "content": prompt},
]

infer = pipeline("text-generation", model="rubenroy/Geneva-12B-GCv2-1m", max_new_tokens=128)

infer(messages)
```
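
Under the hood, the pipeline applies the model's chat template to the messages list before generation. As a rough illustration only (the helper below is hypothetical and only approximates the general Mistral instruct format; the tokenizer's `apply_chat_template` is the authoritative implementation), the messages are flattened into a single tagged prompt:

```python
# Hypothetical sketch approximating the Mistral-style instruct format.
# The tokenizer's apply_chat_template is the authoritative implementation.

def format_mistral_chat(messages):
    """Flatten chat messages into a single [INST]-tagged prompt string."""
    system = ""
    parts = []
    for msg in messages:
        if msg["role"] == "system":
            # Mistral templates typically fold the system prompt into
            # the first user turn instead of using a dedicated role.
            system = msg["content"]
        elif msg["role"] == "user":
            content = f"{system}\n\n{msg['content']}" if system else msg["content"]
            system = ""
            parts.append(f"[INST] {content} [/INST]")
        elif msg["role"] == "assistant":
            parts.append(f"{msg['content']}</s>")
    return "<s>" + "".join(parts)

prompt = format_mistral_chat([
    {"role": "system", "content": "You are Geneva."},
    {"role": "user", "content": "How tall is the Eiffel tower?"},
])
print(prompt)
```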

## About GammaCorpus

This model, and all Geneva models, are trained with GammaCorpus. GammaCorpus is a dataset on Hugging Face filled with structured and filtered multi-turn conversations.
GammaCorpus has four versions, each available in several sizes:

### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED

GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60

### GammaCorpus v2
- 10k
- 50k
- 100k
- 500k
- **1m <-- This is the version of GammaCorpus v2 that the Geneva model you are using was trained on.**
- 5m

GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df

### GammaCorpus CoT
- Math 170k

GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f

### GammaCorpus QA
- Fact 450k

GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7

The full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).

## Known Limitations

- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model may still generate some biased answers.

## Licence

The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.