doberst committed (verified) · commit a1ec97f · 1 parent: 70dfea6

Upload 2 files

Files changed (2):
1. README.md (+37 −3)
2. config.json
---
license: apache-2.0
inference: false
---

# BLING-QWEN-NANO-TOOL

**bling-qwen-nano-tool** is a RAG-finetuned version of Qwen2-0.5B for fact-based, context-grounded question answering, packaged with 4_K_M GGUF quantization to provide a very fast, very small inference implementation for use on CPUs.

To pull the model via API:

```python
from huggingface_hub import snapshot_download

snapshot_download("llmware/bling-qwen-nano-tool", local_dir="/path/on/your/machine/", local_dir_use_symlinks=False)
```
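Once the snapshot completes, the quantized model is a `.gguf` file somewhere under `local_dir`. A minimal sketch of locating it, assuming only the `.gguf` extension (the exact filename shown in the demo is a placeholder, not the repo's actual file name):

```python
from pathlib import Path
import tempfile

def find_gguf_files(local_dir):
    # Recursively collect the names of any .gguf model files under the download directory.
    return sorted(p.name for p in Path(local_dir).rglob("*.gguf"))

# Demo against a scratch directory standing in for /path/on/your/machine/.
tmp = tempfile.mkdtemp()
(Path(tmp) / "bling-qwen-nano-tool.gguf").touch()  # placeholder file name, assumed
print(find_gguf_files(tmp))
```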

Load in your favorite GGUF inference engine, or try with llmware as follows:

```python
from llmware.models import ModelCatalog

# query is your question string; text_sample is the source passage to answer from
model = ModelCatalog().load_model("bling-qwen-nano-tool")
response = model.inference(query, add_context=text_sample)
```

Note: please review [**config.json**](https://huggingface.co/llmware/bling-qwen-nano-tool/blob/main/config.json) in the repository for prompt wrapping information, details on the model, and the full test set.
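As a rough illustration of how such prompt-wrapping metadata gets applied, here is a minimal sketch; the `prompt_wrapper` key and the `<human>`/`<bot>` template below are assumptions for demonstration, not values confirmed from the repo's config.json:

```python
import json

# Hypothetical excerpt of config.json -- the key names and wrapper format are
# illustrative assumptions; the real file in the repo is the source of truth.
sample_config = json.loads("""
{
  "model_name": "bling-qwen-nano-tool",
  "prompt_wrapper": "<human>: {context}\\n{question}\\n<bot>:"
}
""")

def build_prompt(config, context, question):
    # Substitute the context passage and the question into the wrapper template.
    return config["prompt_wrapper"].format(context=context, question=question)

prompt = build_prompt(sample_config, "The invoice total is $500.", "What is the total amount?")
print(prompt)
```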

### Model Description

- **Developed by:** llmware
- **Model type:** GGUF
- **Language(s) (NLP):** English
- **License:** Apache 2.0

## Model Card Contact

Darren Oberst & llmware team