Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,88 @@
---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- granite-3.1-8b-instruct
---
Quantizations of https://huggingface.co/ibm-granite/granite-3.1-8b-instruct

### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [ollama](https://github.com/ollama/ollama)
* [jan](https://github.com/janhq/jan)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [GPT4All](https://github.com/nomic-ai/gpt4all)
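
The clients listed above load GGUF files directly. As a minimal sketch of doing the same from Python, the snippet below assumes the `llama-cpp-python` bindings for llama.cpp are installed (`pip install llama-cpp-python`) and that one of the quantized files has already been downloaded. The file name in the usage comment, the `n_ctx` value, and the helper names `build_messages` and `chat_with_gguf` are hypothetical and not part of this repository.

```python
from pathlib import Path


def build_messages(user_prompt, system_prompt=None):
    """Return an OpenAI-style message list for a chat completion call."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def chat_with_gguf(gguf_path, user_prompt, max_tokens=100, n_ctx=4096):
    """Run one chat turn against a local GGUF file (illustrative helper)."""
    path = Path(gguf_path)
    if not path.is_file():
        raise FileNotFoundError(f"GGUF file not found: {path}")

    # Imported lazily so build_messages works without llama-cpp-python.
    from llama_cpp import Llama

    # n_ctx=4096 is an assumed context size; raise it for long-context use.
    llm = Llama(model_path=str(path), n_ctx=n_ctx, verbose=False)
    result = llm.create_chat_completion(
        messages=build_messages(user_prompt),
        max_tokens=max_tokens,
    )
    return result["choices"][0]["message"]["content"]


# Usage (hypothetical file name; substitute the quantization you downloaded):
# print(chat_with_gguf("granite-3.1-8b-instruct-Q4_K_M.gguf",
#                      "Please list one IBM Research laboratory located in the United States."))
```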
---

# From original readme

Granite-3.1-8B-Instruct is an 8B-parameter long-context instruct model finetuned from Granite-3.1-8B-Base using a combination of open-source instruction datasets with permissive licenses and internally collected synthetic datasets tailored for solving long-context problems. The model was developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging.

- **Developers:** Granite Team, IBM
- **GitHub Repository:** [ibm-granite/granite-3.1-language-models](https://github.com/ibm-granite/granite-3.1-language-models)
- **Website:** [Granite Docs](https://www.ibm.com/granite/docs/)
- **Paper:** [Granite 3.1 Language Models (coming soon)](https://huggingface.co/collections/ibm-granite/granite-31-language-models-6751dbbf2f3389bec5c6f02d)
- **Release Date:** December 18th, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

**Supported Languages:**
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.1 models for languages beyond these 12 languages.

**Intended Use:**
The model is designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications.

*Capabilities*
* Summarization
* Text classification
* Text extraction
* Question-answering
* Retrieval Augmented Generation (RAG)
* Code-related tasks
* Function-calling tasks
* Multilingual dialog use cases
* Long-context tasks including long document/meeting summarization, long document QA, etc.

**Generation:**
This is a simple example of how to use the Granite-3.1-8B-Instruct model.

Install the following libraries:

```shell
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
```
Then, copy the snippet from the section that is relevant for your use case.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-3.1-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
# render the conversation with the model's chat template
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text and move it to the model's device
# ("auto" is a device_map value, not a valid target for .to())
input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
# generate output tokens
output = model.generate(**input_tokens,
                        max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output)
```
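
For decoder-only models, `generate` returns the prompt token ids followed by the newly generated ids, so the decoded text above repeats the prompt. The following is a minimal sketch of trimming the prompt before decoding. It uses plain Python lists and toy ids so it runs without a model; `strip_prompt` is a hypothetical helper, not part of `transformers`.

```python
def strip_prompt(sequences, prompt_length):
    """Drop the first `prompt_length` token ids from each generated sequence."""
    if prompt_length < 0:
        raise ValueError("prompt_length must be non-negative")
    return [list(seq[prompt_length:]) for seq in sequences]


# Toy example: a 4-token prompt followed by 2 generated tokens.
generated = [[101, 7, 8, 9, 42, 43]]
completion_ids = strip_prompt(generated, prompt_length=4)
print(completion_ids)  # [[42, 43]]
```

With the tensors from the example above, the equivalent slice would be `output[:, input_tokens["input_ids"].shape[1]:]`, and passing `skip_special_tokens=True` to `tokenizer.batch_decode` should also remove the chat-template control tokens from the printed text.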