frankmorales2020 committed
Commit 4b47e25 · Parent(s): 0f028c0
Update README.md
README.md CHANGED
@@ -50,11 +50,8 @@ The following hyperparameters were used during training:
 - num_epochs: 3
 
 from transformers import TrainingArguments
-
 args = TrainingArguments(
-output_dir="Mistral-7B-text-to-sql-flash-attention-2-dataeval",
-
-
+    output_dir="Mistral-7B-text-to-sql-flash-attention-2-dataeval",
     num_train_epochs=3, # number of training epochs
     per_device_train_batch_size=3, # batch size per device during training
     gradient_accumulation_steps=8, #2 # number of steps before performing a backward/update pass
@@ -74,7 +71,6 @@ args = TrainingArguments(
     hub_token=access_token_write, # Add this line
     load_best_model_at_end=True,
     logging_dir="/content/gdrive/MyDrive/model/Mistral-7B-text-to-sql-flash-attention-2-dataeval/logs",
-
     evaluation_strategy="steps",
     eval_steps=10,
     save_strategy="steps",
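For reference, the arguments visible in the two hunks assemble into the configuration below. This is a minimal sketch, not the repo's exact training script: the `access_token_write` value and the `save_steps` argument are assumptions added so the snippet runs standalone, and any arguments falling outside the diff hunks are omitted.

```python
from transformers import TrainingArguments

# Placeholder (assumption): a Hugging Face write token; the real script defines this elsewhere.
access_token_write = "hf_..."

args = TrainingArguments(
    output_dir="Mistral-7B-text-to-sql-flash-attention-2-dataeval",
    num_train_epochs=3,             # number of training epochs
    per_device_train_batch_size=3,  # batch size per device during training
    gradient_accumulation_steps=8,  # steps accumulated before each backward/update pass
    logging_dir="/content/gdrive/MyDrive/model/Mistral-7B-text-to-sql-flash-attention-2-dataeval/logs",
    hub_token=access_token_write,   # used when pushing checkpoints to the Hub
    load_best_model_at_end=True,    # requires matching eval/save strategies
    evaluation_strategy="steps",    # renamed to `eval_strategy` in newer transformers releases
    eval_steps=10,                  # evaluate every 10 optimizer steps
    save_strategy="steps",
    save_steps=10,                  # assumption: save on the eval cadence so best-model tracking works
)
```

With `per_device_train_batch_size=3` and `gradient_accumulation_steps=8`, the effective batch size is 24 examples per optimizer step on each device.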