---
datasets:
- Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
language:
- en
pipeline_tag: text-generation
tags:
- absa
- qlora
---

# llama-2-7b-absa-semeval-2016

## Model Details

- **Model Name:** Alpaca69B/llama-2-7b-absa-semeval-2016
- **Base Model:** NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuned On:** Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
- **Fine-Tuning Techniques:** QLoRA-style training: LoRA adapters on a 4-bit quantized base model, with gradient checkpointing
- **Training Resources:** Low resource usage; the 4-bit setup was fine-tuned in a single-GPU Google Colab session (notebook linked below)

## Model Description

This is an aspect-based sentiment analysis (ABSA) model fine-tuned from Llama-2-7b-chat on a resampled, English-translated version of the SemEval-2016 reviews dataset. Given a review sentence, it generates the aspect mentioned in the sentence and the sentiment expressed toward that aspect.

## Fine-Tuning Techniques

### LoRA Attention

- LoRA attention dimension: 64
- Alpha parameter for LoRA scaling: 16
- Dropout probability for LoRA layers: 0.1
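
These values map directly onto a `peft` `LoraConfig`; a minimal sketch is below. The `bias` and `task_type` settings are assumptions based on common Llama-2 QLoRA recipes, not taken from the training notebook.

```python
from peft import LoraConfig

# Hedged reconstruction of the LoRA setup described above.
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",       # assumption: no bias parameters trained
    task_type="CAUSAL_LM",  # assumption: causal LM objective
)
```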

### bitsandbytes (4-bit precision)

- 4-bit precision base model loading: Enabled
- Compute dtype for 4-bit base models: float16
- Quantization type: nf4 (NormalFloat4)
- Nested quantization for 4-bit base models: Disabled
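
These options correspond to a `transformers` `BitsAndBytesConfig`. A minimal sketch follows; how the config is passed to `from_pretrained` is assumed rather than copied from the notebook.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hedged reconstruction of the 4-bit loading setup described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # activate 4-bit base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype for 4-bit weights
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_use_double_quant=False,       # nested quantization disabled
)

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map={"": 0},  # load the entire model on GPU 0 (see the SFT section)
)
```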

### TrainingArguments

- Output directory: "./results"
- Number of training epochs: 2
- fp16/bf16 mixed-precision training: Disabled
- Batch size per GPU for training: 4
- Batch size per GPU for evaluation: 4
- Gradient accumulation steps: 1
- Gradient checkpointing: Enabled
- Maximum gradient norm (gradient clipping): 0.3
- Initial learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged_adamw_32bit
- Learning rate scheduler: cosine
- Maximum training steps: -1 (disabled, so num_train_epochs determines training length)
- Warmup ratio: 0.03
- Group sequences of similar length into batches (group_by_length): True
- Checkpoint save interval (save_steps): 0 (saving disabled)
- Logging interval (logging_steps): 100
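
For reference, here is how these settings assemble into a `transformers.TrainingArguments` object; values are taken from the list above, and everything not listed is left at its default.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the training arguments described above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,          # disabled; num_train_epochs controls duration
    warmup_ratio=0.03,
    group_by_length=True,  # batch sequences of similar length together
    save_steps=0,          # as in the card: step-based checkpoint saving disabled
    logging_steps=100,
)
```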

### SFT (Supervised Fine-Tuning)

- Maximum sequence length: Not specified
- Packing multiple short examples into the same input sequence: Disabled
- Device placement: entire model loaded on GPU 0
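
These knobs match `trl`'s `SFTTrainer`. A hedged sketch follows, assuming the dataset exposes a "text" column (the exact field name is not stated in this card) and the argument names of the `trl` versions used in common QLoRA recipes:

```python
from trl import SFTTrainer

# Hedged reconstruction of the supervised fine-tuning step; see the
# fine-tuning notebook linked below for the authoritative version.
trainer = SFTTrainer(
    model=model,                # 4-bit base model from the sketch above
    train_dataset=dataset,      # assumption: the resampled SemEval-2016 train split
    peft_config=peft_config,    # LoRA settings from the sketch above
    dataset_text_field="text",  # assumption: name of the text column
    max_seq_length=None,        # not specified in this card
    tokenizer=tokenizer,
    args=training_args,
    packing=False,              # do not pack multiple short examples together
)
trainer.train()
```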

## Evaluation

The model's performance and example usage can be inspected in the provided [Google Colab notebook](https://colab.research.google.com/drive/1ArLpQFfXJiHcAT3VuYZndDqvskM6SkOM?usp=sharing).

## Model Usage

To use the model, run the following snippet:

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)


def process_user_prompt(input_sentence):
    """Run generation on a review sentence and parse the result."""
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    return process_output(sequences[0]['generated_text'])


def process_output(output):
    """Split the generated text into prompt, review, aspect, and sentiment."""
    result_dict = {}

    # Extract user_prompt
    user_prompt_start = output.find("### Human:")
    user_prompt_end = output.find("aspect: ") + len("aspect: ")
    result_dict['user_prompt'] = output[user_prompt_start:user_prompt_end].strip()

    # Extract cleared_generated_output (everything up to the closing parenthesis)
    cleared_output_end = output.find(")")
    result_dict['cleared_generated_output'] = output[:cleared_output_end + 1].strip()

    # Extract review
    human_start = output.find("Human:") + len("Human:")
    assistant_start = output.find("### Assistant:")
    result_dict['review'] = output[human_start:assistant_start].strip()

    # Extract aspect and sentiment
    aspect_start = output.find("aspect: ") + len("aspect: ")
    sentiment_start = output.find("sentiment: ")
    result_dict['aspect'] = output[aspect_start:sentiment_start].strip()

    sentiment_end = output[sentiment_start:].find(")") + sentiment_start
    result_dict['sentiment'] = output[sentiment_start + len("sentiment:"):sentiment_end].strip()

    return result_dict


output = process_user_prompt(
    'the first thing that attracts attention is the warm reception and the smiling receptionists.'
)
print(output)
```
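
The parser assumes the model completes the prompt in the form `### Human: <review> ### Assistant: aspect: <aspect> sentiment: <sentiment>)`, so `process_user_prompt` returns a dictionary with `user_prompt`, `cleared_generated_output`, `review`, `aspect`, and `sentiment` keys.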

## Fine-Tuning Details

Details of the fine-tuning process are available in the [fine-tuning Colab notebook](https://colab.research.google.com/drive/1PQfBsDyM8TSSBchL6PPA4o6rOyLFLUnu?usp=sharing).

**Note:** Ensure that the required dependencies (`transformers` and `torch` for inference) are installed and that sufficient GPU memory is available before running the model.