Triangle104 committed on
Commit
b3ae419
1 Parent(s): 992b399

Update README.md

  1. README.md +94 -0
README.md CHANGED
@@ -15,6 +15,100 @@ base_model: ibm-granite/granite-3.1-8b-instruct
  This model was converted to GGUF format from [`ibm-granite/granite-3.1-8b-instruct`](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) for more details on the model.

+ ---
+ Model details:
+ -
+ Granite-3.1-8B-Instruct is an 8B-parameter long-context instruct model
+ finetuned from Granite-3.1-8B-Base using a combination of open-source
+ instruction datasets with permissive licenses and internally collected
+ synthetic datasets tailored for solving long-context problems. The
+ model was developed using a diverse set of techniques with a structured
+ chat format, including supervised finetuning, model alignment using
+ reinforcement learning, and model merging.
+
+ - Developers: Granite Team, IBM
+ - GitHub Repository: ibm-granite/granite-3.1-language-models
+ - Website: Granite Docs
+ - Paper: Granite 3.1 Language Models (coming soon)
+ - Release Date: December 18th, 2024
+ - License: Apache 2.0
+
+
+ Supported Languages:
+ English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech,
+ Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.1
+ models for languages beyond these 12.
+
+
+ Intended Use:
+ The model is designed to respond to general instructions and can be used
+ to build AI assistants for multiple domains, including business
+ applications.
+
+
+ Capabilities:
+
+ - Summarization
+ - Text classification
+ - Text extraction
+ - Question-answering
+ - Retrieval Augmented Generation (RAG)
+ - Code-related tasks
+ - Function-calling tasks
+ - Multilingual dialog use cases
+ - Long-context tasks including long document/meeting summarization, long document QA, etc.
+
+
+ Generation:
+ This is a simple example of how to use the Granite-3.1-8B-Instruct model.
+
+
+ Install the following libraries:
+
+
+ pip install torch torchvision torchaudio
+ pip install accelerate
+ pip install transformers
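
Before running the generation snippet, it can help to confirm that the three libraries are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of any of these packages.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Packages named in the install instructions above.
    required = ["torch", "accelerate", "transformers"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```
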
+
+
+
+ Then, copy the snippet from the section that is relevant for your use case.
+
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "auto"  # used as device_map so the weights are placed automatically
+ model_path = "ibm-granite/granite-3.1-8b-instruct"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+ # change input text as desired
+ chat = [
+     { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
+ ]
+ chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ # tokenize the text and move it to the model's device
+ # ("auto" is a device_map value, not a valid torch device for .to())
+ input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
+ # generate output tokens
+ output = model.generate(**input_tokens,
+                         max_new_tokens=100)
+ # decode output tokens into text
+ output = tokenizer.batch_decode(output)
+ # print output
+ print(output)
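
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded text printed above includes the prompt. The sketch below shows one way to isolate the continuation. It uses plain Python lists as stand-ins for token-ID tensors; `strip_prompt` is a hypothetical helper and the token IDs are made up for illustration.

```python
def strip_prompt(prompt_ids, output_ids):
    """Return only the tokens that follow the prompt in a generated sequence."""
    if output_ids[:len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt")
    return output_ids[len(prompt_ids):]


# Made-up IDs standing in for the tokenized chat prompt.
prompt_ids = [101, 7592, 2088]
# Shape of what generate returns: the prompt followed by the continuation.
output_ids = prompt_ids + [2023, 2003, 102]

new_ids = strip_prompt(prompt_ids, output_ids)
print(new_ids)
```

With the tensors from the snippet above, the equivalent slice would be `output[:, input_tokens["input_ids"].shape[1]:]`, taken before the call to `batch_decode`. Passing `skip_special_tokens=True` to `batch_decode` additionally drops special tokens from the decoded text.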
+
+
+
+ Model Architecture:
+
+ Granite-3.1-8B-Instruct is based on a decoder-only dense transformer
+ architecture. Core components of this architecture are grouped-query
+ attention (GQA) and rotary position embeddings (RoPE), an MLP with SwiGLU,
+ RMSNorm, and shared input/output embeddings.
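
To make two of these components concrete, here is a toy, pure-Python sketch of RMSNorm and a SwiGLU-gated MLP. It is not IBM's implementation: the vector sizes, weights, `eps` value, and the absence of bias terms are assumptions chosen for illustration, and the code is not numerically hardened.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a per-element weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_mlp(x, w_gate, w_up, w_down):
    """Gated MLP: down( silu(gate(x)) * up(x) ), with elementwise gating."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Toy dimensions: model width 2, hidden width 3 (made-up weights).
x = [3.0, 4.0]
normed = rms_norm(x, weight=[1.0, 1.0])
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [0.0, 1.0]]
w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
out = swiglu_mlp(normed, w_gate, w_up, w_down)
print(normed, out)
```

After normalization the vector has a mean square of approximately 1, which is the property RMSNorm enforces without subtracting the mean as LayerNorm does.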
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)