Question Answering · PEFT · English

Commit bb8ceb9 · 1 Parent(s): 51d6404
Marcus Cedric R. Idia committed: Update README.md

Files changed (1): README.md (+79 -13)
README.md CHANGED
@@ -9,20 +9,86 @@ language:
  - en
 pipeline_tag: question-answering
 ---
- ## Training procedure
-
- The following `bitsandbytes` quantization config was used during training:
- - load_in_8bit: False
- - load_in_4bit: True
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: nf4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float16
-
- ### Framework versions
-
- - PEFT 0.5.0.dev0
+ # Archimedes Model
+
+ This README provides instructions for running the Archimedes conversational AI assistant locally.
+
+ ## Requirements
+
+ - Python 3.8+
+ - [Transformers](https://huggingface.co/docs/transformers/installation)
+ - [PEFT](https://github.com/huggingface/peft)
+ - PyTorch
+ - Access to the Llama 2 model files or a cloned public model
+
+ Install requirements:
+
+ ```
+ pip install huggingface_hub
+ pip install -q -U trl transformers accelerate git+https://github.com/huggingface/peft.git
+ pip install -q datasets bitsandbytes einops wandb
+ ```
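Note that the `meta-llama/Llama-2-13b-chat-hf` checkpoint used below is gated on the Hugging Face Hub, so you will typically need to authenticate before the weights can be downloaded. A minimal sketch, assuming you have been granted access to Llama 2 and created a Hub access token:

```python
# Sketch only: log in to the Hugging Face Hub so the gated Llama 2 weights can be downloaded.
from huggingface_hub import login

login(token="hf_...")  # or run `huggingface-cli login` in a terminal
```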
+
+ ## Usage
+
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+ # Base Llama 2 chat model
+ model_name = "meta-llama/Llama-2-13b-chat-hf"
+
+ # 4-bit quantization configuration
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.float16,
+ )
+
+ # Load the quantized base model
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     quantization_config=bnb_config,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ # Load the trained Archimedes LoRA adapter on top of the base model
+ model = PeftModel.from_pretrained(model, "harpyerr/archimedes-300s-7b-chat")
+
+ # Load tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ tokenizer.pad_token = tokenizer.eos_token
+
+ # Define prompt
+ text = "Can you tell me who made Space-X?"
+ prompt = "You are a helpful assistant. Please provide an informative response. \n\n" + text
+
+ # Generate response
+ device = "cuda:0"
+ inputs = tokenizer(prompt, return_tensors="pt").to(device)
+ outputs = model.generate(**inputs, max_new_tokens=100)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ This loads the Llama 2 base model with 4-bit quantization, applies the trained Archimedes LoRA adapter, constructs a prompt, and generates a response.
+
+ See the [docs](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModelForCausalLM) for more details.
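The prompt above is built as a plain string. Llama 2 chat checkpoints usually respond better when the input follows their chat format, so the snippet below, a small sketch that is not part of the original instructions, continues from the Usage example and assumes a transformers version that provides `tokenizer.apply_chat_template` (and that the tokenizer ships the Llama 2 chat template):

```python
# Sketch only: continues from the Usage example above (model and tokenizer already loaded).
# Assumes tokenizer.apply_chat_template is available and a chat template is defined.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Please provide an informative response."},
    {"role": "user", "content": "Can you tell me who made Space-X?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("cuda:0")
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```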
+
+ ## Training
+
+ The model is a LoRA adapter fine-tuned with PEFT on top of `meta-llama/Llama-2-13b-chat-hf` using 4-bit `bitsandbytes` quantization. See the [model card](https://huggingface.co/USERNAME/archimedes) for details.
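The exact fine-tuning recipe is not documented here. As a rough sketch of how such a QLoRA setup is typically prepared with `peft`, the quantization settings below mirror the training-procedure section removed in this commit (4-bit NF4, no double quantization, float16 compute); the LoRA hyperparameters and target modules are illustrative placeholders, not the values actually used for Archimedes:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantization settings taken from the removed training-procedure section:
# 4-bit NF4, no double quantization, float16 compute dtype.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Illustrative LoRA hyperparameters; the actual Archimedes values are not documented.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```

From here, a standard causal-language-modeling training loop (for example with `trl`'s `SFTTrainer`, which the install command above already pulls in) can be run over the fine-tuning data.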
+
+ ## License
+
+ Archimedes is released under the Apache 2.0 license.
+
+ ## Citation
+
+ Coming soon!
+
+ Please ⭐ if this repository was helpful!