---
library_name: peft
base_model: mistralai/Mixtral-8x7B-v0.1
license: apache-2.0
datasets:
- knowrohit07/know_sql
language:
- en
---
<img src="project-9.png" width="50%" height="50%" >
## SQL-Converter Mixtral 8x7B v0.1
**Convert Natural Language to SQL**
### Overview
Mixtral-8x7B-sql-ft-v1 is a PEFT adapter fine-tuned on top of Mixtral 8x7B that converts natural-language questions into SQL queries.
### Base Model
mistralai/Mixtral-8x7B-v0.1
### Fine-Tuning
- **Dataset**: 5,000 natural-language/SQL pairs drawn from the knowrohit07/know_sql dataset.
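To inspect the training data yourself, here is a minimal sketch using the `datasets` library (split and column names are not documented here, so check the dataset card before assuming a schema):
```python
from datasets import load_dataset

# Pull the fine-tuning corpus from the Hugging Face Hub.
ds = load_dataset("knowrohit07/know_sql")

# Inspect the available splits and columns before relying on them.
print(ds)
```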
### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "sharadsin/Mixtral-8x7B-sql-ft-v1"

# Load the base model in 4-bit NF4 with double quantization to reduce memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    base_model_id, add_bos_token=True, trust_remote_code=True
)

# Attach the fine-tuned SQL adapter to the quantized base model.
ft_model = PeftModel.from_pretrained(base_model, adapter_id)
eval_prompt= """SYSTEM: Use the following contextual information to concisely answer the question.
USER: CREATE TABLE EmployeeInfo (EmpID INTEGER, EmpFname VARCHAR, EmpLname VARCHAR, Department VARCHAR, Project VARCHAR,Address VARCHAR, DOB DATE, Gender CHAR)
===
Write a query to fetch details of employees whose EmpLname ends with an alphabet 'A' and contains five alphabets?
ASSISTANT:"""
model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")

ft_model.eval()
with torch.inference_mode():
    # Contrastive search (top_k + penalty_alpha) keeps the output focused.
    output_ids = ft_model.generate(
        **model_input,
        max_new_tokens=70,
        top_k=4,
        penalty_alpha=0.6,
        repetition_penalty=1.15,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```
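The adapter expects the prompt format shown above: a SYSTEM instruction, the table schema after `USER:`, a `===` separator, the question, and a trailing `ASSISTANT:`. A small helper that wraps this format may be convenient; the function name and the prompt-stripping logic below are my own sketch, not part of this model card:
```python
def generate_sql(schema: str, question: str, max_new_tokens: int = 70) -> str:
    """Build the SYSTEM/USER/ASSISTANT prompt and return only the completion."""
    prompt = (
        "SYSTEM: Use the following contextual information to concisely answer the question.\n"
        f"USER: {schema}\n===\n{question}\nASSISTANT:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.inference_mode():
        out = ft_model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            top_k=4,
            penalty_alpha=0.6,
            repetition_penalty=1.15,
        )
    # Drop the echoed prompt tokens and decode only the new ones.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```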
### Limitations
- Accuracy degrades on very complex queries.
- May continue generating extraneous text after the SQL answer; see the trimming sketch below.
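Because the model can keep generating past the end of the query, a simple post-processing step is to cut the output at the first semicolon. This heuristic is a suggestion of mine, not part of the model card:
```python
def trim_to_sql(completion: str) -> str:
    """Keep everything up to and including the first semicolon, if present."""
    end = completion.find(";")
    return completion[: end + 1] if end != -1 else completion.strip()
```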
### Framework versions
- PEFT 0.7.1