Add Phi-3 model card, metadata, original summary, and basic quant info
README.md
ADDED
---
license: mit
license_link: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/LICENSE
library: llama.cpp
library_link: https://github.com/ggerganov/llama.cpp
language:
- en
pipeline_tag: text-generation
tags:
- nlp
- code
- gguf
---

## Model Summary

The Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets.
These datasets include both synthetic data and filtered, publicly available website data, with an emphasis on high-quality, reasoning-dense content.
The model belongs to the Phi-3 family; the Mini version comes in two variants, [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), denoting the context length (in tokens) each can support.

After initial training, the model underwent a post-training process involving supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures.
When evaluated against benchmarks testing common sense, language understanding, mathematics, coding, long context, and logical reasoning, Phi-3 Mini-128K-Instruct demonstrated robust, state-of-the-art performance among models with fewer than 13 billion parameters.
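
Because the model is instruction-tuned, prompts are expected in Phi-3's chat format. Below is a minimal sketch in Python of assembling such a prompt by hand; the `<|user|>`/`<|end|>`/`<|assistant|>` tags follow the upstream Phi-3 model card, and runtimes that apply the chat template embedded in the GGUF metadata can do this automatically.

```python
# Sketch: hand-building a Phi-3 instruct prompt.
# Tag strings follow the upstream Phi-3 model card; most GGUF runtimes
# can instead apply the chat template stored in the file's metadata.
def build_phi3_prompt(user_message: str) -> str:
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

print(build_phi3_prompt("Explain GGUF quantization in one sentence."))
```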

Resources and Technical Documentation:

+ [Phi-3 Microsoft Blog](https://aka.ms/phi3blog-april)
+ [Phi-3 Technical Report](https://aka.ms/phi3-tech-report)

## Quantized Model Files

Phi-3-Mini-128K-Instruct is available in several GGUF formats, catering to different computational needs:

- **ggml-model-q4_0.gguf**: 4-bit quantization, offering a compact size of 2.1 GB for efficient inference.
- **ggml-model-q8_0.gguf**: 8-bit quantization, providing robust performance with a file size of 3.8 GB.
- **ggml-model-f16.gguf**: standard 16-bit floating-point format, with a larger file size of 7.2 GB for enhanced precision.

These formats, ranging from 4-bit to 16-bit, accommodate various computational environments, from resource-constrained devices to high-end servers.
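
For a quick local test, any GGUF-compatible runtime can load these files. The following is a minimal sketch using the `llama-cpp-python` bindings for llama.cpp; the local file path, context size, and sampling settings are assumptions, and the llama.cpp CLI works just as well.

```python
# Sketch: running the 4-bit quant with llama-cpp-python
# (pip install llama-cpp-python). Assumes ggml-model-q4_0.gguf has been
# downloaded from this repository into the working directory.
from llama_cpp import Llama

llm = Llama(
    model_path="ggml-model-q4_0.gguf",
    n_ctx=4096,  # the model supports up to 128K tokens; a smaller window saves RAM
)

prompt = "<|user|>\nWrite a haiku about quantization.<|end|>\n<|assistant|>\n"
output = llm(prompt, max_tokens=128, stop=["<|end|>"])
print(output["choices"][0]["text"])
```

The q4_0 file is the usual starting point on constrained hardware; q8_0 and f16 trade memory for fidelity, as noted above.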