ibalampanis committed on
Commit 7e3cc96 · verified · 1 Parent(s): 97805a3

Update README.md

  1. README.md +47 -59
README.md CHANGED
@@ -5,102 +5,90 @@ model_name: Meltemi-7B-Instruct-v1
5
  pipeline_tag: text-generation
6
  quantized_by: SPAHE
7
  tags:
8
- - finetuned
9
  ---
 
10
  <!-- markdownlint-disable MD041 -->
11
 
12
  # Meltemi 7B Instruct v1 - GGUF
 
13
  - Original model: [Meltemi 7B Instruct v1](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1)
14
 
15
  <!-- description start -->
 
16
  ## Description
17
 
18
- This repo contains GGUF format model files for [ilsp's Meltemi 7B Instruct v1](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1).
19
 
20
  <!-- description end -->
21
- <!-- README_GGUF.md-about-gguf start -->
22
- ### About GGUF
23
-
24
- GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
25
-
26
- Here is an incomplete list of clients and libraries that are known to support GGUF:
27
-
28
- * [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
29
- * [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
30
- * [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
31
- * [GPT4All](https://gpt4all.io/index.html), a free and open source local running GUI, supporting Windows, Linux and macOS with full GPU accel.
32
- * [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
33
- * [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
34
- * [Faraday.dev](https://faraday.dev/), an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
35
- * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
36
- * [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
37
- * [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server. Note, as of time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.
38
-
39
-
40
- <!-- compatibility_gguf start -->
41
- ## Compatibility
42
-
43
- These quantised GGUFv2 files are compatible with llama.cpp from August 27th onwards, as of commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221)
44
 
45
  <!-- README_GGUF.md-provided-files start -->
 
46
  ## Provided files
47
 
48
- | Name | Quant method | Bits/Floats | Size | Max RAM required | Use case |
49
- | ---- | ---- | ---- | ---- | ---- | ----- |
50
- | [meltemi-7B-instruct-v1_q8_0.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7B-instruct-v1_q8_0.gguf) | Q8_0 | 5 | 7.40 GB| 7.30 GB | very low quality loss - recommended |
51
- | [meltemi-7B-instruct-v1_f16.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7B-instruct-v1_f16.gguf) | F16 | 16 | 13.90 GB| 14.20 GB | very large, extremely low quality loss |
52
- | [meltemi-7B-instruct-v1_f32.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7B-instruct-v1_f32.gguf) | F32 | 32 | 27.80 GB| 29.30 GB | very large, extremely low quality loss - not recommended |
53
 
54
- **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
55
 
56
  <!-- README_GGUF.md-provided-files end -->
57
 
58
  <!-- README_GGUF.md-how-to-download start -->
59
- ## How to download GGUF files
60
 
61
- **Note for manual downloaders:** You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file.
 
 
 
 
62
 
63
- The following clients/libraries will automatically download models for you, providing a list of available models to choose from:
64
 
65
- * LM Studio
66
- * LoLLMS Web UI
67
- * Faraday.dev
68
 
69
- ### On the command line, including multiple files at once
70
 
71
- I recommend using the `huggingface-hub` Python library:
 
 
72
 
73
  ```shell
74
- pip3 install huggingface-hub
75
  ```
76
 
77
- Then you can download any individual model file to the current directory, at high speed, with a command like this:
78
 
79
  ```shell
80
- huggingface-cli download SPAHE/Meltemi-7B-Instruct-v1-GGUF meltemi-7B-instruct-v1_q8_0.gguf --local-dir . --local-dir-use-symlinks False
81
  ```
82
 
 
 
 
 
83
  <!-- original-model-card start -->
 
84
  # Original model card: ilsp's Meltemi 7B Instruct v1
85
 
86
  # Meltemi Instruct Large Language Model for the Greek language
87
 
88
  We present Meltemi-7B-Instruct-v1 Large Language Model (LLM), an instruct fine-tuned version of [Meltemi-7B-v1](https://huggingface.co/ilsp/Meltemi-7B-v1).
89
 
90
-
91
  # Model Information
92
 
93
  - Vocabulary extension of the Mistral-7b tokenizer with Greek tokens
94
  - 8192 context length
95
  - Fine-tuned with 100k Greek machine translated instructions extracted from:
96
- * [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) (only subsets with permissive licenses)
97
- * [Evol-Instruct](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
98
- * [Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
99
- * A hand-crafted Greek dataset with multi-turn examples steering the instruction-tuned model towards safe and harmless responses
100
  - Our SFT procedure is based on the [Hugging Face finetuning recipes](https://github.com/huggingface/alignment-handbook)
101
 
102
-
103
  # Instruction format
 
104
  The prompt format is the same as the [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) format and can be
105
  utilized through the tokenizer's [chat template](https://huggingface.co/docs/transformers/main/chat_templating) functionality as follows:
106
 
@@ -164,25 +152,25 @@ print(tokenizer.batch_decode(outputs)[0])
164
 
165
  The evaluation suite we created includes 6 test sets. The suite is integrated with [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness).
166
 
167
- Our evaluation suite includes:
168
- * Four machine-translated versions ([ARC Greek](https://huggingface.co/datasets/ilsp/arc_greek), [Truthful QA Greek](https://huggingface.co/datasets/ilsp/truthful_qa_greek), [HellaSwag Greek](https://huggingface.co/datasets/ilsp/hellaswag_greek), [MMLU Greek](https://huggingface.co/datasets/ilsp/mmlu_greek)) of established English benchmarks for language understanding and reasoning ([ARC Challenge](https://arxiv.org/abs/1803.05457), [Truthful QA](https://arxiv.org/abs/2109.07958), [Hellaswag](https://arxiv.org/abs/1905.07830), [MMLU](https://arxiv.org/abs/2009.03300)).
169
- * An existing benchmark for question answering in Greek ([Belebele](https://arxiv.org/abs/2308.16884))
170
- * A novel benchmark created by the ILSP team for medical question answering based on the medical exams of [DOATAP](https://www.doatap.gr) ([Medical MCQA](https://huggingface.co/datasets/ilsp/medical_mcqa_greek)).
171
 
172
- Our evaluation for Meltemi-7b is performed in a few-shot setting, consistent with the settings in the [Open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). We can see that our training enhances performance across all Greek test sets by a **+14.9%** average improvement. The results for the Greek test sets are shown in the following table:
 
 
173
 
174
- | | Medical MCQA EL (15-shot) | Belebele EL (5-shot) | HellaSwag EL (10-shot) | ARC-Challenge EL (25-shot) | TruthfulQA MC2 EL (0-shot) | MMLU EL (5-shot) | Average |
175
- |----------------|----------------|-------------|--------------|------------------|-------------------|---------|---------|
176
- | Mistral 7B | 29.8% | 45.0% | 36.5% | 27.1% | 45.8% | 35% | 36.5% |
177
- | Meltemi 7B | 41.0% | 63.6% | 61.6% | 43.2% | 52.1% | 47% | 51.4% |
178
 
 
 
 
 
179
 
180
  # Ethical Considerations
181
 
182
  This model has not been aligned with human preferences, and therefore might generate misleading, harmful, and toxic content.
183
 
184
-
185
  # Acknowledgements
186
 
187
- The ILSP team utilized Amazon’s cloud computing services, which were made available via GRNET under the [OCRE Cloud framework](https://www.ocre-project.eu/), providing Amazon Web Services for the Greek Academic and Research Community.
 
188
  <!-- original-model-card end -->
 
5
  pipeline_tag: text-generation
6
  quantized_by: SPAHE
7
  tags:
8
+ - finetuned
9
  ---
10
+
11
  <!-- markdownlint-disable MD041 -->
12
 
13
  # Meltemi 7B Instruct v1 - GGUF
14
+
15
  - Original model: [Meltemi 7B Instruct v1](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1)
16
 
17
  <!-- description start -->
18
+
19
  ## Description
20
 
21
+ This repository contains GGUF format model files for [ilsp's Meltemi 7B Instruct v1](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1), optimized for different performance and storage requirements. Each model variant has been carefully quantized or preserved in floating-point format to suit varying demands for quality, speed, and memory usage.
22
 
23
  <!-- description end -->
24
 
25
  <!-- README_GGUF.md-provided-files start -->
26
+
27
  ## Provided files
28
 
29
+ | Name | Quantization Method | Precision (Bits) | File Size | Max RAM Required | Use Case |
30
+ | --------------------------------------------------------------------------------------------------------------------------------------- | ------------------- | ---------------- | --------- | ---------------- | ------------------------------------------------------------- |
31
+ | [meltemi-7b-instruct-v1_q8_0.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_q8_0.gguf) | Q8_0 | 8 | 7.40 GB | 7.30 GB | Low quality loss - recommended |
32
+ | [meltemi-7b-instruct-v1_f16.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f16.gguf) | F16 | 16 | 13.90 GB | 14.20 GB | Very large, extremely low quality loss - recommended |
33
+ | [meltemi-7b-instruct-v1_f32.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f32.gguf) | F32 | 32 | 27.80 GB | 29.30 GB | Largest file, extremely low quality loss - not recommended |
34
 
35
+ **Note**: The above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
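
As a rough cross-check of the sizes in the table, the weight data scales with parameter count times bits per weight. The sketch below is illustrative only: the parameter count of about 7.48 billion is an assumption (it is not stated in this README), the sizes are assumed to be reported in GiB, and file metadata is ignored. Q8_0 is treated as 8.5 effective bits per weight because each block of 32 int8 values also stores one 16-bit scale.

```python
# Hypothetical parameter count for a 7B-class model with an extended vocabulary.
# This is an assumption for illustration, not a figure from the model card.
ASSUMED_PARAMS = 7.48e9

# Effective bits per weight. Q8_0 packs 32 int8 weights plus one fp16 scale
# per block (34 bytes for 32 weights), hence 8.5 rather than 8.
BITS_PER_WEIGHT = {"Q8_0": 8.5, "F16": 16.0, "F32": 32.0}


def estimated_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weight data in GiB, ignoring file metadata."""
    return n_params * bits_per_weight / 8 / 2**30


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimated_size_gib(ASSUMED_PARAMS, bits):.2f} GiB")
```

Under these assumptions the estimates land close to the listed file sizes, which suggests the table reports binary gigabytes. Actual memory use will be higher once the context buffer is allocated.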
36
 
37
  <!-- README_GGUF.md-provided-files end -->
38
 
39
  <!-- README_GGUF.md-how-to-download start -->
 
40
 
41
+ ## How to Download GGUF Files
42
+
43
+ ### For Manual Downloaders
44
+
45
+ It is recommended not to clone the entire repository due to the large file sizes and multiple quantization formats available. Most users will benefit from selecting and downloading a single, specific model file that best suits their requirements.
46
 
47
+ ### Automated Download via Client Libraries
48
 
49
+ For convenience, the following clients and libraries can automate the download process and offer a selection of available models:
 
 
50
 
51
+ - **LM Studio**: Provides an integrated environment for downloading and utilizing models directly.
52
 
53
+ ### Downloading with Command Line
54
+
55
+ The `huggingface-hub` Python library simplifies the process of downloading specific model files. Install the library with:
56
 
57
  ```shell
58
+ pip install huggingface-hub
59
  ```
60
 
61
+ To download a model file directly to your current directory, execute:
62
 
63
  ```shell
64
+ huggingface-cli download SPAHE/Meltemi-7B-Instruct-v1-GGUF meltemi-7b-instruct-v1_q8_0.gguf --local-dir .
65
  ```
66
 
67
+ This command downloads only the specified GGUF file, so the other, larger variants in the repository are not fetched.
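
The same single-file download can be done from Python. This is a minimal sketch using `hf_hub_download` from the `huggingface_hub` package. The wrapper name `download_gguf` and its defaults are illustrative, and the file name is taken from the table above rather than verified against the repository.

```python
from pathlib import Path

REPO_ID = "SPAHE/Meltemi-7B-Instruct-v1-GGUF"
# File name as listed in the "Provided files" table (assumed to match the repo).
FILENAME = "meltemi-7b-instruct-v1_q8_0.gguf"


def download_gguf(repo_id: str = REPO_ID,
                  filename: str = FILENAME,
                  local_dir: str = ".") -> Path:
    """Download one file from the Hub and return its local path."""
    # Imported here so this module can be loaded even when the
    # huggingface_hub package is not installed yet.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir)
    return Path(path)
```

Calling `download_gguf()` fetches the Q8_0 file into the current directory. Pass a different `filename` to fetch the F16 or F32 variant instead.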
68
+
69
+ <!-- README_GGUF.md-how-to-download end -->
70
+
71
  <!-- original-model-card start -->
72
+
73
  # Original model card: ilsp's Meltemi 7B Instruct v1
74
 
75
  # Meltemi Instruct Large Language Model for the Greek language
76
 
77
  We present Meltemi-7B-Instruct-v1 Large Language Model (LLM), an instruct fine-tuned version of [Meltemi-7B-v1](https://huggingface.co/ilsp/Meltemi-7B-v1).
78
 
 
79
  # Model Information
80
 
81
  - Vocabulary extension of the Mistral-7b tokenizer with Greek tokens
82
  - 8192 context length
83
  - Fine-tuned with 100k Greek machine translated instructions extracted from:
84
+ - [Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) (only subsets with permissive licenses)
85
+ - [Evol-Instruct](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
86
+ - [Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
87
+ - A hand-crafted Greek dataset with multi-turn examples steering the instruction-tuned model towards safe and harmless responses
88
  - Our SFT procedure is based on the [Hugging Face finetuning recipes](https://github.com/huggingface/alignment-handbook)
89
 
 
90
  # Instruction format
91
+
92
  The prompt format is the same as the [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) format and can be
93
  utilized through the tokenizer's [chat template](https://huggingface.co/docs/transformers/main/chat_templating) functionality as follows:
94
 
 
152
 
153
  The evaluation suite we created includes 6 test sets. The suite is integrated with [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness).
154
 
155
+ Our evaluation suite includes:
 
 
 
156
 
157
+ - Four machine-translated versions ([ARC Greek](https://huggingface.co/datasets/ilsp/arc_greek), [Truthful QA Greek](https://huggingface.co/datasets/ilsp/truthful_qa_greek), [HellaSwag Greek](https://huggingface.co/datasets/ilsp/hellaswag_greek), [MMLU Greek](https://huggingface.co/datasets/ilsp/mmlu_greek)) of established English benchmarks for language understanding and reasoning ([ARC Challenge](https://arxiv.org/abs/1803.05457), [Truthful QA](https://arxiv.org/abs/2109.07958), [Hellaswag](https://arxiv.org/abs/1905.07830), [MMLU](https://arxiv.org/abs/2009.03300)).
158
+ - An existing benchmark for question answering in Greek ([Belebele](https://arxiv.org/abs/2308.16884))
159
+ - A novel benchmark created by the ILSP team for medical question answering based on the medical exams of [DOATAP](https://www.doatap.gr) ([Medical MCQA](https://huggingface.co/datasets/ilsp/medical_mcqa_greek)).
160
 
161
+ Our evaluation for Meltemi-7b is performed in a few-shot setting, consistent with the settings in the [Open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). We can see that our training enhances performance across all Greek test sets by a **+14.9%** average improvement. The results for the Greek test sets are shown in the following table:
 
 
 
162
 
163
+ | | Medical MCQA EL (15-shot) | Belebele EL (5-shot) | HellaSwag EL (10-shot) | ARC-Challenge EL (25-shot) | TruthfulQA MC2 EL (0-shot) | MMLU EL (5-shot) | Average |
164
+ | ---------- | ------------------------- | -------------------- | ---------------------- | -------------------------- | -------------------------- | ---------------- | ------- |
165
+ | Mistral 7B | 29.8% | 45.0% | 36.5% | 27.1% | 45.8% | 35% | 36.5% |
166
+ | Meltemi 7B | 41.0% | 63.6% | 61.6% | 43.2% | 52.1% | 47% | 51.4% |
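
The Average column and the quoted improvement can be reproduced from the per-task scores. The short check below copies the numbers from the table and shows that the +14.9 figure is the difference between the two averages, in percentage points. The variable names are illustrative.

```python
# Scores copied from the table above (percent accuracy per Greek test set),
# in column order: Medical MCQA, Belebele, HellaSwag, ARC-Challenge,
# TruthfulQA MC2, MMLU.
scores = {
    "Mistral 7B": [29.8, 45.0, 36.5, 27.1, 45.8, 35.0],
    "Meltemi 7B": [41.0, 63.6, 61.6, 43.2, 52.1, 47.0],
}


def mean(values):
    """Unweighted mean across the test sets."""
    return sum(values) / len(values)


averages = {name: mean(vals) for name, vals in scores.items()}

# Absolute gain: a difference of two percentages, i.e. percentage points.
gain_points = averages["Meltemi 7B"] - averages["Mistral 7B"]

for name, avg in averages.items():
    print(f"{name}: {avg:.1f}%")
print(f"Gain: +{gain_points:.1f} percentage points")
```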
167
 
168
  # Ethical Considerations
169
 
170
  This model has not been aligned with human preferences, and therefore might generate misleading, harmful, and toxic content.
171
 
 
172
  # Acknowledgements
173
 
174
+ The ILSP team utilized Amazon’s cloud computing services, which were made available via GRNET under the [OCRE Cloud framework](https://www.ocre-project.eu/), providing Amazon Web Services for the Greek Academic and Research Community.
175
+
176
  <!-- original-model-card end -->