pszemraj committed · Commit f5e8289 · 1 Parent(s): eb38f49

Update README.md
---
license: mit
datasets:
- databricks/databricks-dolly-15k
language:
- en
pipeline_tag: text-generation
tags:
- dolly
- dolly-v2
- instruct
- sharded
inference: False
---

# dolly-v2-12b: sharded checkpoint

This is a sharded checkpoint (with ~4 GB shards) of the `databricks/dolly-v2-12b` model. Refer to the [original model](https://huggingface.co/databricks/dolly-v2-12b) for all details.

- Sharding enables low-RAM loading, e.g. on Google Colab :)

## Basic Usage

Install `transformers`, `accelerate`, and `bitsandbytes`:

```bash
pip install -U -q transformers bitsandbytes accelerate
```

Load the model in 8-bit, then [run inference](https://huggingface.co/docs/transformers/generation_strategies#contrastive-search):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ethzanalytics/dolly-v2-12b-sharded"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# load_in_8bit quantizes the weights via bitsandbytes to cut memory use;
# device_map="auto" spreads layers across available GPU(s) and CPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
```
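
Once the model is loaded, you will likely get better results by wrapping your instruction in the prompt format dolly-v2 was instruction-tuned on. The template below is a sketch reproduced from the original `databricks/dolly-v2-12b` model card's training setup; treat the exact wording as an assumption and check the upstream card if outputs look off. The `build_prompt` helper is illustrative, not part of any library:

```python
# Sketch of the dolly-v2 instruction prompt format (assumed from the
# original databricks model card); build_prompt is a hypothetical helper.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the dolly-v2 prompt format."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Explain what a sharded checkpoint is.")
print(prompt)
```

Pass the resulting string through `tokenizer(...)` and `model.generate(...)` as usual; for contrastive search (linked above), supply `penalty_alpha` and `top_k` to `generate`.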