# GGML converted version of Databricks' Dolly-V2 models
## Description
Dolly is trained on ~15k instruction/response fine-tuning records from `databricks-dolly-15k`, a dataset generated by Databricks employees across the capability domains of the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
## Converted Models
| Name | Based on | Type | Container | GGML Version |
|---|---|---|---|---|
| dolly-v2-12b-f16.bin | databricks/dolly-v2-12b | F16 | GGML | V3 |
| dolly-v2-12b-q4_0.bin | databricks/dolly-v2-12b | Q4_0 | GGML | V3 |
| dolly-v2-12b-q4_0-ggjt.bin | databricks/dolly-v2-12b | Q4_0 | GGJT | V3 |
| dolly-v2-3b-f16.bin | databricks/dolly-v2-3b | F16 | GGML | V3 |
| dolly-v2-3b-q4_0.bin | databricks/dolly-v2-3b | Q4_0 | GGML | V3 |
| dolly-v2-3b-q4_0-ggjt.bin | databricks/dolly-v2-3b | Q4_0 | GGJT | V3 |
| dolly-v2-3b-q5_1.bin | databricks/dolly-v2-3b | Q5_1 | GGML | V3 |
| dolly-v2-3b-q5_1-ggjt.bin | databricks/dolly-v2-3b | Q5_1 | GGJT | V3 |
| dolly-v2-7b-f16.bin | databricks/dolly-v2-7b | F16 | GGML | V3 |
| dolly-v2-7b-q4_0.bin | databricks/dolly-v2-7b | Q4_0 | GGML | V3 |
| dolly-v2-7b-q4_0-ggjt.bin | databricks/dolly-v2-7b | Q4_0 | GGJT | V3 |
| dolly-v2-7b-q5_1.bin | databricks/dolly-v2-7b | Q5_1 | GGML | V3 |
| dolly-v2-7b-q5_1-ggjt.bin | databricks/dolly-v2-7b | Q5_1 | GGJT | V3 |
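The Type column determines roughly how large each file is and how much memory inference needs. As a hedged sketch (the bits-per-weight figures assume the standard GGML block layouts, and the estimate ignores the file header and vocabulary), you can ballpark a file's size from the parameter count:

```python
# Rough file-size estimate for the quantization types in the table above.
# Bits per weight assume the standard GGML block layouts:
#   F16  : 16 bits/weight
#   Q4_0 : 32 weights per 18-byte block (fp16 scale + 4-bit quants) = 4.5 bits
#   Q5_1 : 32 weights per 24-byte block (scale, min, 5-bit quants)  = 6.0 bits
BITS_PER_WEIGHT = {"f16": 16.0, "q4_0": 4.5, "q5_1": 6.0}

def estimated_size_gib(n_params: float, quant: str) -> float:
    """Approximate tensor-data size in GiB (headers/vocab not included)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1024**3

# The 12B model at Q4_0 works out to roughly 6.3 GiB of tensor data.
print(f"{estimated_size_gib(12e9, 'q4_0'):.1f} GiB")
```

Actual files are somewhat larger because of metadata and non-quantized tensors, but this is a reasonable lower bound when choosing a variant for your hardware.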
## Usage
### Python via llm-rs

#### Installation

Via pip:

```shell
pip install llm-rs
```
#### Run inference
```python
from llm_rs import AutoModel

# Load the model; use any model file from the list above as `model_file`
model = AutoModel.from_pretrained(
    "rustformers/dolly-v2-ggml",
    model_file="dolly-v2-12b-q4_0-ggjt.bin",
)

# Generate text
print(model.generate("The meaning of life is"))
```
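Dolly-V2 was fine-tuned on an instruction/response prompt template (documented in the upstream databricks/dolly-v2 model cards), so wrapping your instruction in that template usually produces better answers than a bare completion prompt. The helper function below is illustrative, not part of llm-rs:

```python
# Template reproduced from the databricks/dolly-v2 model card.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in Dolly's instruction/response template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

print(build_prompt("Explain GGML quantization in one sentence."))
```

Pass the result of `build_prompt(...)` to `model.generate(...)` in place of the raw prompt.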
### Rust via rustformers/llm

#### Installation

```shell
git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release
```
#### Run inference

```shell
cargo run --release -- gptneox infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"
```
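If you are unsure which container format a downloaded file actually uses (the GGML vs. GGJT distinction from the table above), the first four bytes of the file identify it. This is a hedged sketch based on the magic values used by the ggml/llama.cpp loaders; `container_type` is a hypothetical helper, not part of either project:

```python
import struct

# GGML-family files begin with a little-endian uint32 magic number.
MAGICS = {
    0x67676D6C: "GGML (unversioned)",
    0x67676D66: "GGMF (versioned)",
    0x67676A74: "GGJT (versioned, mmap-friendly)",
}

def container_type(path: str) -> str:
    """Read the leading magic number and name the container format."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    return MAGICS.get(magic, f"unknown (0x{magic:08x})")
```

The GGJT variants are laid out so the tensor data can be memory-mapped, which speeds up model loading.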