	---
	language:
	- en
	- it
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- ruslanmv
	- llama
	- trl
	- sft
	---
	# Meta-Llama 3.1 8B Text-to-SQL GPTQ Model

	This repository provides a 4-bit GPTQ-quantized version of the 8-billion-parameter Meta-Llama 3.1 model, fine-tuned for text-to-SQL tasks. GPTQ quantization reduces the memory footprint and makes inference more efficient. Below you'll find instructions to install the dependencies, load the model, and generate SQL queries.

	## Model Details

	- Model Size: 8B
	- Quantization: GPTQ (4-bit)
	- Languages Supported: English, Italian
	- Task: Text-to-SQL generation
	- License: Apache 2.0
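
As a rough guide to what 4-bit quantization saves, the weight footprint can be estimated from the parameter count and the bits stored per weight. The sketch below is back-of-the-envelope arithmetic only: it assumes exactly 8 billion parameters and ignores GPTQ's per-group scales and zero points, the KV cache, and activations, so real memory use will be somewhat higher. The function name `approx_weight_gb` is illustrative and not part of any library.

```python
def approx_weight_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: "8B" taken as exactly 8 billion parameters

fp16_gb = approx_weight_gb(NUM_PARAMS, 16)  # unquantized half precision
int4_gb = approx_weight_gb(NUM_PARAMS, 4)   # 4-bit GPTQ weights

print(f"FP16 weights:  ~{fp16_gb:.0f} GB")
print(f"4-bit weights: ~{int4_gb:.0f} GB")
```

Under these assumptions the quantized weights take about a quarter of the FP16 size.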

	## Installation Requirements

	Before using the model, ensure that you have the following dependencies installed. We recommend using the same versions to avoid any compatibility issues.

	```bash
	# In a notebook (Colab, Jupyter), prefix each command with "!"

	# Install the required PyTorch version built for CUDA 12.1
	pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

	# Install AutoGPTQ for quantized model handling
	pip install auto-gptq --no-build-isolation

	# Install Optimum for model optimization
	pip install optimum
	```

	After installing the dependencies, restart your runtime (for example, the notebook kernel) so that the newly installed packages are picked up.
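
To confirm that the installation succeeded, it can help to list the installed versions. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are the distribution names from the pip commands above.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as used in the pip commands above
report = installed_versions(
    ["torch", "torchvision", "torchaudio", "auto-gptq", "optimum"]
)
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If `torch` reports a version other than 2.2.1, the pinned install may have been overridden by a later package.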

	## Loading the Model

	To load the quantized Meta-Llama 3.1 model and use it for text-to-SQL tasks, use the following Python code:

	```python
	from transformers import AutoTokenizer, pipeline
	from auto_gptq import AutoGPTQForCausalLM
	import torch

	# Define the Alpaca-style prompt template
	alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

	### Instruction:
	{}

	### Input:
	{}

	### Response:
	"""

	# Model directory and tokenizer
	quantized_model_dir = "meta-llama-8b-quantized-4bit" # Path where quantized model is saved
	tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)

	# Load the quantized model
	model = AutoGPTQForCausalLM.from_quantized(
	    quantized_model_dir,
	    device_map="auto",  # Automatically map the model to the available device (GPU or CPU)
	    torch_dtype=torch.float16,  # Ensure FP16 for efficiency
	    use_safetensors=True,  # Set to True if the model was saved in safetensors format
	)

	# Set up the text generation pipeline without specifying the device
	# (named text_generator so it does not shadow the imported pipeline function)
	text_generator = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)

	# Function to generate SQL query from input text using the Alpaca prompt
	def generate_sql(input_text):
	    # Format the prompt
	    prompt = alpaca_prompt.format(
	        "Provide the SQL query",
	        input_text,
	    )

	    # Generate the response; max_new_tokens limits only the generated tokens,
	    # whereas max_length would also count the prompt tokens
	    generated_text = text_generator(
	        prompt,
	        max_new_tokens=200,
	        eos_token_id=tokenizer.eos_token_id,
	    )[0]["generated_text"]

	    # Clean the output by removing the prompt and any extra newlines
	    cleaned_output = generated_text.replace(prompt, '').strip()

	    return cleaned_output

	# Example usage
	italian_input = "Seleziona tutte le colonne della tabella table1 dove la colonna anni è uguale a 2020"
	sql_query = generate_sql(italian_input)
	print(sql_query)
	```
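
`generate_sql` removes the prompt with an exact string `replace`. If the decoded text differs from the prompt even slightly (for example in whitespace), that replacement will not match and the prompt stays in the output. One possible alternative, sketched here under the assumption that the output always contains the `### Response:` marker from the template, is to keep only the text after the last marker. The helper name `extract_response` is hypothetical.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the last response marker, stripped of whitespace."""
    # rpartition returns ("", "", text) when the marker is absent,
    # so in that case the whole text is returned.
    _, _, tail = generated_text.rpartition(RESPONSE_MARKER)
    return tail.strip()


# Hand-written sample standing in for real model output
sample = (
    "### Instruction:\nProvide the SQL query\n\n"
    "### Input:\nSeleziona tutte le colonne della tabella table1\n\n"
    "### Response:\nSELECT * FROM table1;\n"
)
print(extract_response(sample))
```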

	## Example Usage

	The example script shows how to generate SQL queries from natural language text. Simply provide a request in Italian or English, and the model will generate an appropriate SQL query.

	Example input:

	```python
	italian_input = "Seleziona tutte le colonne della tabella table1 dove la colonna anni è uguale a 2020"
	sql_query = generate_sql(italian_input)
	print(sql_query)
	```

	Example output:

	```sql
	SELECT * FROM table1 WHERE anni = 2020;
	```
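
Generated SQL is not guaranteed to be valid, so it can be worth running it against a throwaway database before using it anywhere else. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The schema and rows are invented for illustration (only `table1` and `anni` come from the example above), and the helper name `run_on_sample_db` is hypothetical.

```python
import sqlite3


def run_on_sample_db(sql_query: str):
    """Execute a query against a throwaway in-memory table and return the rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema matching the example: table1 with an "anni" column
        conn.execute("CREATE TABLE table1 (id INTEGER, anni INTEGER)")
        conn.executemany(
            "INSERT INTO table1 VALUES (?, ?)",
            [(1, 2019), (2, 2020), (3, 2020)],
        )
        return conn.execute(sql_query).fetchall()
    finally:
        conn.close()


rows = run_on_sample_db("SELECT * FROM table1 WHERE anni = 2020;")
print(rows)
```

A malformed query raises `sqlite3.Error`, which gives a cheap syntax check. SQLite's dialect differs from other databases, so a query that passes here may still need adjusting for the target system.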

	## Model Tags

	- text-generation-inference
	- transformers
	- llama
	- trl
	- sft

	## License

	This model is released under the [Apache License 2.0](LICENSE).