Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


Phi-3-Context-Obedient-RAG - GGUF
- Model creator: https://huggingface.co/TroyDoesAI/
- Original model: https://huggingface.co/TroyDoesAI/Phi-3-Context-Obedient-RAG/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Phi-3-Context-Obedient-RAG.Q2_K.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q2_K.gguf) | Q2_K | 1.32GB |
| [Phi-3-Context-Obedient-RAG.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.IQ3_XS.gguf) | IQ3_XS | 1.51GB |
| [Phi-3-Context-Obedient-RAG.IQ3_S.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.IQ3_S.gguf) | IQ3_S | 1.57GB |
| [Phi-3-Context-Obedient-RAG.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q3_K_S.gguf) | Q3_K_S | 1.57GB |
| [Phi-3-Context-Obedient-RAG.IQ3_M.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.IQ3_M.gguf) | IQ3_M | 1.73GB |
| [Phi-3-Context-Obedient-RAG.Q3_K.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q3_K.gguf) | Q3_K | 1.82GB |
| [Phi-3-Context-Obedient-RAG.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q3_K_M.gguf) | Q3_K_M | 1.82GB |
| [Phi-3-Context-Obedient-RAG.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q3_K_L.gguf) | Q3_K_L | 1.94GB |
| [Phi-3-Context-Obedient-RAG.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.IQ4_XS.gguf) | IQ4_XS | 1.93GB |
| [Phi-3-Context-Obedient-RAG.Q4_0.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q4_0.gguf) | Q4_0 | 2.03GB |
| [Phi-3-Context-Obedient-RAG.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.IQ4_NL.gguf) | IQ4_NL | 2.04GB |
| [Phi-3-Context-Obedient-RAG.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q4_K_S.gguf) | Q4_K_S | 2.04GB |
| [Phi-3-Context-Obedient-RAG.Q4_K.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q4_K.gguf) | Q4_K | 2.23GB |
| [Phi-3-Context-Obedient-RAG.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q4_K_M.gguf) | Q4_K_M | 2.23GB |
| [Phi-3-Context-Obedient-RAG.Q4_1.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q4_1.gguf) | Q4_1 | 2.24GB |
| [Phi-3-Context-Obedient-RAG.Q5_0.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q5_0.gguf) | Q5_0 | 2.46GB |
| [Phi-3-Context-Obedient-RAG.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q5_K_S.gguf) | Q5_K_S | 2.46GB |
| [Phi-3-Context-Obedient-RAG.Q5_K.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q5_K.gguf) | Q5_K | 2.62GB |
| [Phi-3-Context-Obedient-RAG.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q5_K_M.gguf) | Q5_K_M | 2.62GB |
| [Phi-3-Context-Obedient-RAG.Q5_1.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q5_1.gguf) | Q5_1 | 2.68GB |
| [Phi-3-Context-Obedient-RAG.Q6_K.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q6_K.gguf) | Q6_K | 2.92GB |
| [Phi-3-Context-Obedient-RAG.Q8_0.gguf](https://huggingface.co/RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf/blob/main/Phi-3-Context-Obedient-RAG.Q8_0.gguf) | Q8_0 | 3.78GB |
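To try one of these files locally, something like the following Python sketch should work. This is not part of the original card: it assumes the `huggingface_hub` and `llama-cpp-python` packages, and the exact `Llama` constructor options may differ between versions, so check their documentation.

```python
# Hypothetical sketch: fetch one quant from this repo and run it with
# llama-cpp-python. The repo id and filenames come from the table above.

REPO_ID = "RichardErkhov/TroyDoesAI_-_Phi-3-Context-Obedient-RAG-gguf"

def gguf_filename(quant: str) -> str:
    """Map a quant method from the table (e.g. "Q4_K_M") to its filename."""
    return f"Phi-3-Context-Obedient-RAG.{quant}.gguf"

def generate(prompt: str, quant: str = "Q4_K_M", max_tokens: int = 256) -> str:
    # Imports are local so the filename helper works without these packages.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    from llama_cpp import Llama                  # pip install llama-cpp-python

    path = hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
    llm = Llama(model_path=path, n_ctx=4096)
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]
```

As a rule of thumb, smaller quants (Q2_K, the IQ3 family) trade answer quality for memory, while Q4_K_M is a common middle ground.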


Original model description:
---
license: cc-by-sa-4.0
---

Base Model: microsoft/Phi-3-mini-128k-instruct

Overview
This model is meant to enhance adherence to provided context (e.g., for RAG applications) and reduce hallucinations, inspired by the airoboros context-obedient question-answer format.

---
license: cc-by-4.0
---

# Contextual DPO

## Overview

The format for a contextual prompt is as follows:
```
BEGININPUT
BEGINCONTEXT
[key0: value0]
[key1: value1]
... other metadata ...
ENDCONTEXT
[insert your text blocks here]
ENDINPUT
[add as many other blocks, in the exact same format]
BEGININSTRUCTION
[insert your instruction(s). The model was tuned with single questions, paragraph format, lists, etc.]
ENDINSTRUCTION
```

I know it's a bit verbose and annoying, but after much trial and error, using these explicit delimiters helps the model understand where to find the relevant information and how to associate specific sources with it.
- `BEGININPUT` - denotes a new input block
- `BEGINCONTEXT` - denotes the block of context (metadata key/value pairs) to associate with the current input block
- `ENDCONTEXT` - denotes the end of the metadata block for the current input
- [text] - insert whatever text you want for the input block, as many paragraphs as can fit in the context
- `ENDINPUT` - denotes the end of the current input block
- [repeat as many input blocks in this format as you want]
- `BEGININSTRUCTION` - denotes the start of the instruction(s) to respond to for all of the input blocks above
- [instruction(s)]
- `ENDINSTRUCTION` - denotes the end of the instruction set

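The delimiters above can be assembled programmatically. A minimal Python helper (hypothetical, not part of the original card; only the delimiter strings come from the format above) might look like:

```python
def context_block(text: str, **metadata: str) -> str:
    """Wrap one text block and its metadata in the delimiter format above."""
    meta = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return f"BEGININPUT\nBEGINCONTEXT\n{meta}\nENDCONTEXT\n{text}\nENDINPUT"

def contextual_prompt(blocks: list, instruction: str) -> str:
    """Join any number of input blocks with a single instruction section."""
    return "\n".join(blocks) + f"\nBEGININSTRUCTION\n{instruction}\nENDINSTRUCTION"
```

For example, `contextual_prompt([context_block("The sky is teal.", url="https://web.site/1")], "What color is the sky?")` yields a complete single-block prompt.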
Here's a trivial but important example to prove the point:
```
BEGININPUT
BEGINCONTEXT
date: 2021-01-01
url: https://web.site/123
ENDCONTEXT
In a shocking turn of events, blueberries are now green, but will be sticking with the same name.
ENDINPUT
BEGININSTRUCTION
What color are blueberries? Source?
ENDINSTRUCTION
```

And the expected response:
```
Blueberries are now green.
Source:
date: 2021-01-01
url: https://web.site/123
```
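On the consuming side, a response in this shape can be post-processed to recover the cited metadata. A hypothetical Python sketch (not part of the original card; it assumes the `Source:` block is always `key: value` lines, as in the example):

```python
import re

def extract_source(response: str) -> dict:
    """Collect the key: value lines that follow a "Source:" marker."""
    match = re.search(r"Source:\s*\n((?:\w+:[^\n]+\n?)+)", response)
    if not match:
        return {}
    pairs = (line.split(":", 1) for line in match.group(1).strip().splitlines())
    return {key.strip(): value.strip() for key, value in pairs}
```

Given the expected response above, this returns `{"date": "2021-01-01", "url": "https://web.site/123"}`, and an empty dict when the model cites nothing.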

### References in response

As shown in the example, the dataset includes many cases where source details are included in the response when the question asks for a source, citation, or reference.

Why do this? Well, the R in RAG seems to be the weakest link in the chain. Retrieval accuracy, depending on many factors including the overall dataset size, can be quite low. This accuracy increases when retrieving more documents, but then you have the issue of actually using the retrieved documents in prompts. If you use one prompt per document (or document chunk), you know exactly which document the answer came from, so there's no issue. If, however, you include multiple chunks in a single prompt, it's useful to include the specific reference chunk(s) used to generate the response, rather than naively including references to all of the chunks included in the prompt.

For example, suppose I have two documents:
```
url: http://foo.bar/1
Strawberries are tasty.

url: http://bar.foo/2
The cat is blue.
```

If the question being asked is `What color is the cat?`, I would only expect the 2nd document to be referenced in the response, as the other link is irrelevant.
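Continuing that two-document example, the full prompt for the cat question would be built like this (a self-contained sketch, not part of the original card; only the delimiters are fixed by the format):

```python
# Two retrieved chunks, each with its own url metadata.
docs = [
    ("http://foo.bar/1", "Strawberries are tasty."),
    ("http://bar.foo/2", "The cat is blue."),
]

# One BEGININPUT block per chunk, then a single instruction section.
blocks = [
    f"BEGININPUT\nBEGINCONTEXT\nurl: {url}\nENDCONTEXT\n{text}\nENDINPUT"
    for url, text in docs
]
prompt = (
    "\n".join(blocks)
    + "\nBEGININSTRUCTION\nWhat color is the cat? Source?\nENDINSTRUCTION"
)
print(prompt)
```

With a citation-style instruction like `Source?`, the model is expected to echo only the url of the block it actually used (here, the second one), not every url in the prompt.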