---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
datasets:
- generator
base_model: mistralai/Mistral-7B-Instruct-v0.1
model-index:
- name: Mistral-7B-text-to-sql-without-flash-attention-2
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# Mistral-7B-text-to-sql-without-flash-attention-2
This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), trained on the [b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context) dataset (reported to the Trainer as the `generator` dataset).
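
The card does not include the training preprocessing code. As a rough illustration only, a prompt in the style of the testing examples below could be built from a b-mc2/sql-create-context row roughly as sketched here; the `format_example` helper and the inclusion of the table context are assumptions, not the author's actual script:

```python
from datasets import load_dataset

# Hypothetical helper: formats one b-mc2/sql-create-context row into the
# "Instruct: ... Output:" prompt style used in the testing examples below.
def format_example(row):
    return (
        "Instruct: generate a SQL query.\n"
        f"{row['question']}\n"
        f"Context: {row['context']}\n"
        "Output:\n"
        f"{row['answer']}"
    )

dataset = load_dataset("b-mc2/sql-create-context", split="train")
print(format_example(dataset[0]))
```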
## Model description
More information needed
### Testing results

Load the adapter and build a text-generation pipeline:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline

peft_model_id = "frankmorales2020/Mistral-7B-text-to-sql-without-flash-attention-2"

# Loads the base model and applies the LoRA adapter
model = AutoPeftModelForCausalLM.from_pretrained(
    peft_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
```
**Case 1: SQL-generation prompt (the format used for b-mc2/sql-create-context)**

```python
prompt = 'What was the first album Beyoncé released as a solo artist?'
prompt = f"Instruct: generate a SQL query.\n{prompt}\nOutput:\n"  # prompt template for b-mc2/sql-create-context
outputs = pipe(prompt, max_new_tokens=1024, do_sample=True, temperature=0.9, top_k=50, top_p=0.1,
               eos_token_id=pipe.tokenizer.eos_token_id, pad_token_id=pipe.tokenizer.eos_token_id)
print('Question: %s' % prompt)
print(f"Generated Answer:\n{outputs[0]['generated_text'][len(prompt):].strip()}")
```

Output:

```
Question: Instruct: generate a SQL query.
What was the first album Beyoncé released as a solo artist?
Output:
Generated Answer:
SELECT first_album FROM table_name_82 WHERE solo_artist = "beyoncé"
```
**Case 2: general question-answering prompt**

```python
prompt = 'What was the first album Beyoncé released as a solo artist?'
prompt = f"Instruct: Answer the following question.\n{prompt}\nOutput:\n"
outputs = pipe(prompt, max_new_tokens=1024, do_sample=True, temperature=0.9, top_k=50, top_p=0.1,
               eos_token_id=pipe.tokenizer.eos_token_id, pad_token_id=pipe.tokenizer.eos_token_id)
print('Question: %s' % prompt)
print(f"Generated Answer:\n{outputs[0]['generated_text'][len(prompt):].strip()}")
```

Output:

```
Question: Instruct: Answer the following question.
What was the first album Beyoncé released as a solo artist?
Output:
Generated Answer:
The first album Beyoncé released as a solo artist was "Dangerously in Love".
```
**Case 3: SQL-generation prompt without the trailing "Output:" marker**

```python
prompt = 'What was the first album Beyoncé released as a solo artist?'
prompt = f"Instruct: generate a SQL query.\n{prompt}\n\n"  # prompt template for b-mc2/sql-create-context, no "Output:"
outputs = pipe(prompt, max_new_tokens=1024, do_sample=True, temperature=0.9, top_k=50, top_p=0.1,
               eos_token_id=pipe.tokenizer.eos_token_id, pad_token_id=pipe.tokenizer.eos_token_id)
print('Question: %s' % prompt)
print(f"Generated Answer:\n{outputs[0]['generated_text'][len(prompt):].strip()}")
```

Output (the model wraps the query in a fenced `sql` block here):

````
Question: Instruct: generate a SQL query.
What was the first album Beyoncé released as a solo artist?
Generated Answer:
```sql
SELECT first_album FROM table_name_84 WHERE solo_artist = "beyoncé"
````
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 6
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3
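
The training script itself is not included in the card. As a rough sketch only, assuming Hugging Face `TrainingArguments` (consistent with the `trl`/`sft` tags; the actual script may differ), the hyperparameters above map to something like:

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters listed above; not the author's actual script.
args = TrainingArguments(
    output_dir="Mistral-7B-text-to-sql-without-flash-attention-2",
    num_train_epochs=3,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 3 * 2 = 6
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    optim="adamw_torch",             # Adam with betas=(0.9, 0.999) and epsilon=1e-08
    seed=42,
)
```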
### Training results
### Framework versions
- PEFT 0.10.0
- Transformers 4.39.1
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2