Sandiago21 committed on
Commit 6e1cb11
Parent(s): 325d3f4
update readme with instructions on how to load and test the model

README.md CHANGED
@@ -20,7 +20,7 @@ This repository contains a LLaMA-13B further fine-tuned model on conversations a
 
 ## Model Details
 
-Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder.
+Anyone can use (ask prompts) and play with the model using the pre-existing Jupyter Notebook in the **noteboooks** folder. The Jupyter Notebook contains example code to load the model and ask prompts to it as well as example prompts to get you started.
 
 ### Model Description
 
@@ -102,7 +102,7 @@ Use the code below to get started with the model.
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
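The hunk above only shows fragments of the README's loading snippet (the imports, the corrected MODEL_NAME, and the PeftConfig call). As a rough sketch of how those pieces presumably fit together, assuming the repository holds a standard PEFT adapter checkpoint whose config records the base model, the loading step would look roughly like this; the dtype and device_map choices are illustrative assumptions, not values taken from the README:

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"

# The adapter config records which base checkpoint it was trained on.
config = PeftConfig.from_pretrained(MODEL_NAME)

# Load the base LLaMA-13B weights (half precision / auto device placement are assumptions).
model = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the fine-tuned adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, MODEL_NAME)
model.eval()
```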
@@ -159,8 +159,8 @@ print(response)
 import torch
 from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM
 
-MODEL_NAME = "Sandiago21/llama-
-BASE_MODEL = "decapoda-research/llama-
+MODEL_NAME = "Sandiago21/llama-13b-hf-prompt-answering"
+BASE_MODEL = "decapoda-research/llama-13b-hf"
 
 config = PeftConfig.from_pretrained(MODEL_NAME)
 
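This second snippet imports GenerationConfig, and the surrounding hunk headers show a print(response) call, so the prompting step presumably looks something like the sketch below. It continues from the loading sketch above; the prompt text and all generation parameters are illustrative assumptions rather than values taken from the README:

```python
import torch
from transformers import GenerationConfig

# Assumes `model` and `tokenizer` are the objects loaded in the previous sketch.
prompt = "What is the capital of Greece?"  # example prompt, not from the README
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    num_beams=4,
    max_new_tokens=128,
)

with torch.no_grad():
    output = model.generate(
        input_ids=inputs["input_ids"],
        generation_config=generation_config,
    )

response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```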
@@ -213,6 +213,29 @@ print(response)
 
 ## Training Details
 
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 2
+- mixed_precision_training: Native AMP
+
+### Framework versions
+
+- Transformers 4.28.1
+- Pytorch 2.0.0+cu117
+- Datasets 2.12.0
+- Tokenizers 0.12.1
 
 ### Training Data
 
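The added hyperparameter list reads like the auto-generated summary of a transformers Trainer run. If so, it would correspond roughly to a TrainingArguments setup like the sketch below; only the values listed in the diff are taken from it, while the output directory and the choice of optimizer name are assumptions for illustration:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed in the diff; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="llama-13b-hf-prompt-answering",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999), eps=1e-8 (library defaults)
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=2,
    fp16=True,                       # "Native AMP" mixed-precision training
)
```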
@@ -226,4 +249,4 @@ The decapoda-research/llama-13b-hf model was further trained and finetuned on qu
 
 ## Model Architecture and Objective
 
-The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
+The model is based on decapoda-research/llama-13b-hf model and finetuned adapters on top of the main model on conversations and question answering data.
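The closing hunk describes the architecture as fine-tuned adapters on top of the decapoda-research/llama-13b-hf base model, which together with the PeftConfig usage earlier suggests a PEFT-style (likely LoRA) adapter. Purely as an illustration of that setup, and with every hyperparameter below being an assumption rather than something stated in the diff, the adapter would have been created along these lines:

```python
from peft import LoraConfig, get_peft_model
from transformers import LlamaForCausalLM

# Hypothetical adapter configuration; r, alpha, dropout, and target modules
# are illustrative guesses, not values taken from this repository.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

base_model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-13b-hf")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```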