---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- granite-3.1-8b-instruct
---
Quantizations of https://huggingface.co/ibm-granite/granite-3.1-8b-instruct
### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [ollama](https://github.com/ollama/ollama)
* [jan](https://github.com/janhq/jan)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [GPT4All](https://github.com/nomic-ai/gpt4all)
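
For example, here is a rough sketch of downloading one of these quants and running it with llama.cpp from the command line. The repository id and the `.gguf` filename below are assumptions; substitute the actual names shown in this repo's file list.

```shell
# Sketch only: fetch a quant and run it with llama.cpp's CLI.
# Repo id and filename are placeholders -- use the real ones from this repo's file list.
huggingface-cli download duyntnet/granite-3.1-8b-instruct-imatrix-GGUF \
  granite-3.1-8b-instruct-Q4_K_M.gguf --local-dir .
./llama-cli -m granite-3.1-8b-instruct-Q4_K_M.gguf \
  -p "Please list one IBM Research laboratory located in the United States." -n 128
```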
---
# From the original readme
Granite-3.1-8B-Instruct is an 8B-parameter long-context instruct model finetuned from Granite-3.1-8B-Base using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets tailored for solving long-context problems. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging.
- **Developers:** Granite Team, IBM
- **GitHub Repository:** [ibm-granite/granite-3.1-language-models](https://github.com/ibm-granite/granite-3.1-language-models)
- **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
- **Paper:** [Granite 3.1 Language Models (coming soon)](https://huggingface.co/collections/ibm-granite/granite-31-language-models-6751dbbf2f3389bec5c6f02d)
- **Release Date**: December 18th, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
**Supported Languages:**
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.1 models for languages beyond these 12.
**Intended Use:**
The model is designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications.
*Capabilities*
* Summarization
* Text classification
* Text extraction
* Question-answering
* Retrieval Augmented Generation (RAG)
* Code related tasks
* Function-calling tasks
* Multilingual dialog use cases
* Long-context tasks including long document/meeting summarization, long document QA, etc.
**Generation:**
This is a simple example of how to use the Granite-3.1-8B-Instruct model.
Install the following libraries:
```shell
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
```
Then, copy the snippet from the section that is relevant for your use case.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "auto"
model_path = "ibm-granite/granite-3.1-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
{ "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt").to(device)
# generate output tokens
output = model.generate(**input_tokens,
max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output)
```