aorogat committed
Commit bf88737 · verified · 1 Parent(s): fcfe6c5

Update README.md

Files changed (1)
  1. README.md +45 -75
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
  datasets:
3
- - USERNAME/QueryBridge
4
  ---
5
 
6
  # Model Overview
@@ -28,7 +28,7 @@ The tagged questions in the QueryBridge dataset are designed to train language m
28
  | `<ref>`| **References**: Tags in questions that refer back to previously mentioned entities or concepts. These can indicate cycles or self-references in queries. Example: In "Who is the CEO of the company founded by himself?", the word 'himself' is tagged as `<ref>himself</ref>`. |
29
 
30
 
31
- ## How to use the model?
32
  To use the model, you can run it with TorchTune commands. I have provided the necessary Python code to automate the process. Follow these steps to get started:
33
 
34
  <details>
@@ -168,50 +168,58 @@ python command.py
168
  </details>
169
 
170
 
171
- ## How we finetuned the model?
172
 
173
 
174
  <details>
175
  <summary>Steps</summary>
176
-
177
- ### Model Configuration
178
- See https://pytorch.org/torchtune/stable/tutorials/e2e_flow.html to know how to use torchtune.
179
 
180
- To finetune the model:
181
- - Download the model:
182
- tune download \
183
  meta-llama/Meta-Llama-3-8B \
184
  --output-dir /home/YOUR_USERNAME/Meta-Llama-3-8B \
185
  --hf-token <ACCESS TOKEN>
186
-
187
- - Prepare the config file.
188
 
189
- ### Download config file
190
- Run the command:
 
 
191
  tune cp llama3/8B_lora_single_device custom_config.yaml
 
 
192
 
193
- Update the file as follows:
194
- <details>
195
- <summary>Configuration File</summary>
196
  ```yaml
197
  # Config for single device LoRA finetuning in lora_finetune_single_device.py
198
  # using a Llama3 8B model
199
  #
200
- # This config assumes that you've run the following command before launching
201
- # this run:
202
  # tune download meta-llama/Meta-Llama-3-8B --output-dir /tmp/Meta-Llama-3-8B --hf-token <HF_TOKEN>
203
  #
204
- # To launch on a single device, run the following command from root:
205
  # tune run lora_finetune_single_device --config llama3/8B_lora_single_device
206
  #
207
- # You can add specific overrides through the command line. For example
208
- # to override the checkpointer directory while launching training
209
- # you can run:
210
  # tune run lora_finetune_single_device --config llama3/8B_lora_single_device checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
211
  #
212
- # This config works only for training on single device.
213
 
214
- \# Model Arguments
215
  model:
216
  _component_: torchtune.models.llama3.lora_llama3_8b
217
  lora_attn_modules: ['q_proj', 'v_proj']
@@ -220,7 +228,7 @@ model:
220
  lora_rank: 8
221
  lora_alpha: 16
222
 
223
- \# Tokenizer
224
  tokenizer:
225
  _component_: torchtune.models.llama3.llama3_tokenizer
226
  path: /home/YOUR_USERNAME/Meta-Llama-3-8B/original/tokenizer.model
@@ -236,7 +244,7 @@ checkpointer:
236
  model_type: LLAMA3
237
  resume_from_checkpoint: False
238
 
239
- \# Dataset and Sampler
240
  dataset:
241
  _component_: torchtune.datasets.instruct_dataset
242
  split: train
@@ -247,7 +255,7 @@ seed: null
247
  shuffle: True
248
  batch_size: 1
249
 
250
- \# Optimizer and Scheduler
251
  optimizer:
252
  _component_: torch.optim.AdamW
253
  weight_decay: 0.01
@@ -259,75 +267,37 @@ lr_scheduler:
259
  loss:
260
  _component_: torch.nn.CrossEntropyLoss
261
 
262
- \# Training
263
  epochs: 1
264
  max_steps_per_epoch: null
265
  gradient_accumulation_steps: 64
266
  compile: False
267
 
268
- \# Logging
269
  output_dir: /home/YOUR_USERNAME/lora_finetune_output
270
  metric_logger:
271
  _component_: torchtune.utils.metric_logging.DiskLogger
272
  log_dir: ${output_dir}
273
  log_every_n_steps: null
274
 
275
- \# Environment
276
  device: cuda
277
  dtype: bf16
278
  enable_activation_checkpointing: True
279
 
280
- \# Profiler (disabled)
281
  profiler:
282
  _component_: torchtune.utils.profiler
283
  enabled: False
284
  ```
285
- </summary>
286
- </details>
287
-
288
- Run the finetune: tune run lora_finetune_single_device --config /home/YOUR_USERNAME/.../custom_config.yaml
289
- Inference Configuration
290
- Copy the generation config: tune cp generation ./custom_generation_config.yaml
291
-
292
- Update the file:
293
- ```yaml
294
- # Config for running the InferenceRecipe in generate.py to generate output from an LLM
295
- #
296
- # To launch, run the following command from root torchtune directory:
297
- # tune run generate --config generation
298
 
299
- # Model arguments
300
- model:
301
- _component_: torchtune.models.llama3.llama3_8b
302
 
303
- checkpointer:
304
- _component_: torchtune.utils.FullModelMetaCheckpointer
305
-
306
- checkpoint_dir: /home/YOUR_USERNAME/Meta-Llama-3-8B/
307
- checkpoint_files: [
308
- meta_model_0.pt
309
- ]
310
- output_dir: /home/YOUR_USERNAME/Meta-Llama-3-8B/
311
- model_type: LLAMA3
312
-
313
- device: cuda
314
- dtype: bf16
315
-
316
- seed: 1234
317
-
318
- # Tokenizer arguments
319
- tokenizer:
320
- _component_: torchtune.models.llama3.llama3_tokenizer
321
- path: /home/YOUR_USERNAME/Meta-Llama-3-8B/original/tokenizer.model
322
-
323
- # Generation arguments; defaults taken from gpt-fast
324
- prompt: "### Instruction: \nYou are a powerful model trained to convert questions to tagged questions. Use the tags as follows: \n<qt> to surround question keywords like 'What', 'Who', 'Which', 'How many', 'Return' or any word that represents requests. \n<o> to surround entities as an object like person name, place name, etc. It must be a noun or a noun phrase. \n<s> to surround entities as a subject like person name, place name, etc. The difference between <s> and <o>, <s> only appear in yes/no questions as in the training data you saw before. \n<cc> to surround coordinating conjunctions that connect two or more phrases like 'and', 'or', 'nor', etc. \n<p> to surround predicates that may be an entity attribute or a relationship between two entities. It can be a verb phrase or a noun phrase. The question must contain at least one predicate. \n<off> for offset in questions asking for the second, third, etc. For example, the question 'What is the second largest country?', <off> will be located as follows. 'What is the <off>second</off> largest country?' \n<t> to surround entity types like person, place, etc. \n<op> to surround operators that compare quantities or values, like 'greater than', 'more than', etc. \n<ref> to indicate a reference within the question that requires a cycle to refer back to an entity (e.g., 'Who is the CEO of a company founded by himself?' where 'himself' would be tagged as <ref>himself</ref>). \nInput: Which films directed by a director died in 2014 and starring both Julia Roberts and Richard Gere?\nResponse:"
325
- max_new_tokens: 100
326
- temperature: 0.6 # 0.8 and 0.6 are popular values to try
327
- top_k: 1
328
-
329
- quantizer: null
330
  ```
331
 
332
- Run the generation: tune run generate --config /home/YOUR_USERNAME/.../custom_generation_config.yaml
 
333
  </details>
 
1
  ---
2
  datasets:
3
+ - aorogat/QueryBridge
4
  ---
5
 
6
  # Model Overview
 
28
  | `<ref>`| **References**: Tags in questions that refer back to previously mentioned entities or concepts. These can indicate cycles or self-references in queries. Example: In "Who is the CEO of the company founded by himself?", the word 'himself' is tagged as `<ref>himself</ref>`. |
29
 
30
 
31
+ # How to Use the Model
32
  To use the model, you can run it with TorchTune commands. I have provided the necessary Python code to automate the process. Follow these steps to get started:
33
 
34
  <details>
 
168
  </details>
169
 
170
 
171
+ # How We Fine-Tuned the Model
172
+
173
+ We fine-tuned the `Meta-Llama-3-8B` model in two key steps: preparing the dataset and executing the fine-tuning process.
174
+
175
+ ### Prepare the Dataset
176
+
177
+ For this fine-tuning, we used the [QueryBridge dataset](https://huggingface.co/datasets/aorogat/QueryBridge), specifically the pairs of questions and their corresponding tagged questions. Before this data can be used, it must be converted into instruct prompts suitable for fine-tuning the model. The converted prompts are available at [this link](https://huggingface.co/datasets/aorogat/Questions_to_Tagged_Questions_Prompts); download them and save them in the directory `/home/YOUR_USERNAME/data`.
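If you need a concrete way to fetch the prompt files, a minimal sketch using the Hugging Face CLI is shown below (the CLI is our assumption; any method that places the files under `/home/YOUR_USERNAME/data` works):

```bash
# Sketch: download the instruct-prompt dataset into the directory expected later by the config.
# Assumes the Hugging Face CLI is installed: pip install -U "huggingface_hub[cli]"
huggingface-cli download aorogat/Questions_to_Tagged_Questions_Prompts \
  --repo-type dataset \
  --local-dir /home/YOUR_USERNAME/data
```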
178
+
179
+ ### Fine-Tune the Model
180
+
181
+ To fine-tune the `Meta-Llama-3-8B` model, we leveraged [Torchtune](https://pytorch.org/torchtune/stable/index.html). Follow these steps to complete the process:
182
 
183
 
184
  <details>
185
  <summary>Steps</summary>
 
 
 
186
 
187
+
188
+ ### Step 1: Download the Model
189
+ Begin by downloading the model with the following command. Replace `<ACCESS TOKEN>` with your actual Hugging Face token and adjust the output directory as needed:
190
+
191
+ ```bash
192
+ tune download \
193
  meta-llama/Meta-Llama-3-8B \
194
  --output-dir /home/YOUR_USERNAME/Meta-Llama-3-8B \
195
  --hf-token <ACCESS TOKEN>
196
+ ```
 
197
 
198
+ ### Step 2: Prepare the Configuration File
199
+ Next, you need to set up a configuration file. Start by downloading the default configuration:
200
+
201
+ ```bash
202
  tune cp llama3/8B_lora_single_device custom_config.yaml
203
+ ```
204
+ Then, open `custom_config.yaml` and update it as follows:
205
 
 
 
 
206
  ```yaml
207
  # Config for single device LoRA finetuning in lora_finetune_single_device.py
208
  # using a Llama3 8B model
209
  #
210
+ # Ensure the model is downloaded using the following command before launching:
 
211
  # tune download meta-llama/Meta-Llama-3-8B --output-dir /tmp/Meta-Llama-3-8B --hf-token <HF_TOKEN>
212
  #
213
+ # To launch on a single device, run this command from the root directory:
214
  # tune run lora_finetune_single_device --config llama3/8B_lora_single_device
215
  #
216
+ # You can add specific overrides through the command line. For example,
217
+ # to override the checkpointer directory, use:
 
218
  # tune run lora_finetune_single_device --config llama3/8B_lora_single_device checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
219
  #
220
+ # This config is for training on a single device.
221
 
222
+ # Model Arguments
223
  model:
224
  _component_: torchtune.models.llama3.lora_llama3_8b
225
  lora_attn_modules: ['q_proj', 'v_proj']
 
228
  lora_rank: 8
229
  lora_alpha: 16
230
 
231
+ # Tokenizer
232
  tokenizer:
233
  _component_: torchtune.models.llama3.llama3_tokenizer
234
  path: /home/YOUR_USERNAME/Meta-Llama-3-8B/original/tokenizer.model
 
244
  model_type: LLAMA3
245
  resume_from_checkpoint: False
246
 
247
+ # Dataset and Sampler
248
  dataset:
249
  _component_: torchtune.datasets.instruct_dataset
250
  split: train
 
255
  shuffle: True
256
  batch_size: 1
257
 
258
+ # Optimizer and Scheduler
259
  optimizer:
260
  _component_: torch.optim.AdamW
261
  weight_decay: 0.01
 
267
  loss:
268
  _component_: torch.nn.CrossEntropyLoss
269
 
270
+ # Training
271
  epochs: 1
272
  max_steps_per_epoch: null
273
  gradient_accumulation_steps: 64
274
  compile: False
275
 
276
+ # Logging
277
  output_dir: /home/YOUR_USERNAME/lora_finetune_output
278
  metric_logger:
279
  _component_: torchtune.utils.metric_logging.DiskLogger
280
  log_dir: ${output_dir}
281
  log_every_n_steps: null
282
 
283
+ # Environment
284
  device: cuda
285
  dtype: bf16
286
  enable_activation_checkpointing: True
287
 
288
+ # Profiler (disabled)
289
  profiler:
290
  _component_: torchtune.utils.profiler
291
  enabled: False
292
  ```
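As noted in the comments at the top of this config, individual fields can also be overridden on the command line instead of editing `custom_config.yaml`. A brief sketch (the values shown are placeholders):

```bash
# Sketch: override selected config fields at launch time using torchtune's key=value syntax.
tune run lora_finetune_single_device --config custom_config.yaml \
  checkpointer.checkpoint_dir=/home/YOUR_USERNAME/Meta-Llama-3-8B \
  epochs=1
```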
 
 
 
 
 
 
 
 
 
 
 
 
 
293
 
294
+ ### Step 3: Run the Fine-Tuning Process
295
+ After configuring the file, you can start the fine-tuning process with the following command:
 
296
 
297
+ ```bash
298
+ tune run lora_finetune_single_device --config /home/YOUR_USERNAME/.../custom_config.yaml
299
  ```
300
 
301
+ The new model can be found in the `/home/YOUR_USERNAME/Meta-Llama-3-8B/` directory.
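As a quick sanity check (a sketch; exact file names depend on your torchtune version and checkpointer settings), you can confirm that a Meta-format checkpoint such as `meta_model_0.pt` was written:

```bash
# Sketch: list the output directory; a fine-tuned Meta-format checkpoint (e.g., meta_model_0.pt) should be present.
ls -lh /home/YOUR_USERNAME/Meta-Llama-3-8B/
```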
302
+
303
  </details>