---
datasets:
- Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
language:
- en
pipeline_tag: text-generation
tags:
- absa
- qlora
---
# llama-2-7b-absa-semeval-2016
## Model Details
- **Model Name:** Alpaca69B/llama-2-7b-absa-semeval-2016
- **Base Model:** NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuned On:** Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
- **Fine-Tuning Techniques:** QLoRA (LoRA adapters on a 4-bit quantized base model) with gradient checkpointing
- **Training Resources:** Low; the 4-bit quantized setup allows fine-tuning on a single GPU
## Model Description
This model performs aspect-based sentiment analysis (ABSA): given a review sentence, it extracts an aspect term and the sentiment expressed towards it. It was fine-tuned from the Llama-2-7b-chat model on a resampled, English-translated version of the SemEval-2016 ABSA review dataset.
## Fine-Tuning Techniques
### LoRA Attention
- LoRA attention dimension: 64
- Alpha parameter for LoRA scaling: 16
- Dropout probability for LoRA layers: 0.1
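For reference, these hyperparameters correspond to a `peft` `LoraConfig` along the following lines (a minimal sketch; `bias` and `task_type` are assumed typical values for Llama-2 causal-LM fine-tuning, since the card only lists the dimension, alpha, and dropout):

```python
from peft import LoraConfig

# Sketch of the LoRA configuration described above.
# NOTE: bias and task_type are assumptions; the card does not list them.
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```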
### bitsandbytes (4-bit precision)
- 4-bit precision base model loading: Enabled
- Compute dtype for 4-bit base models: "float16"
- Quantization type: "nf4"
- Nested quantization for 4-bit base models: Disabled
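These settings map directly onto a `transformers` `BitsAndBytesConfig` (a minimal sketch of the quantization setup described above):

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch of the 4-bit quantization settings described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit precision base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype for 4-bit base models
    bnb_4bit_quant_type="nf4",             # quantization type
    bnb_4bit_use_double_quant=False,       # nested quantization disabled
)
```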
### TrainingArguments
- Output directory: "./results"
- Number of training epochs: 2
- fp16/bf16 training: Disabled
- Batch size per GPU for training: 4
- Batch size per GPU for evaluation: 4
- Gradient accumulation steps: 1
- Gradient checkpointing: Enabled
- Maximum gradient norm (gradient clipping): 0.3
- Initial learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged_adamw_32bit
- Learning rate scheduler: cosine
- Maximum training steps: -1 (no step limit; training runs for the full num_train_epochs)
- Ratio of steps for linear warmup: 0.03
- Group sequences into batches with the same length: True
- Save checkpoint every X update steps: 0 (disabled)
- Log every X update steps: 100
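Put together, these values correspond to a `transformers` `TrainingArguments` object roughly as follows (a minimal sketch mirroring the list above):

```python
from transformers import TrainingArguments

# Sketch of the training arguments listed above.
training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,    # -1: no step limit, train for num_train_epochs
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,    # 0 disables step checkpoints (accepted by older transformers releases)
    logging_steps=100,
)
```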
### SFT (Supervised Fine-Tuning)
- Maximum sequence length: Not specified
- Packing multiple short examples in the same input sequence: False
- Load the entire model on GPU 0
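A minimal sketch of how these pieces fit together with `trl`'s `SFTTrainer` (keyword arguments match older `trl` releases; the train split and the `"text"` column name are assumptions, and `bnb_config`, `peft_config`, and `training_arguments` come from the sketches above):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

base_model_name = "NousResearch/Llama-2-7b-chat-hf"

# NOTE: the split and text column are assumptions about the dataset layout.
dataset = load_dataset(
    "Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled",
    split="train",
)

# Load the entire 4-bit quantized base model on GPU 0.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map={"": 0},
)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 has no pad token by default

trainer = SFTTrainer(
    model=base_model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=None,        # maximum sequence length: not specified
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,              # no packing of multiple short examples
)
trainer.train()
```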
## Evaluation
The model's performance and usage can be observed in the provided [Google Colab notebook](https://colab.research.google.com/drive/1ArLpQFfXJiHcAT3VuYZndDqvskM6SkOM?usp=sharing).
## Model Usage
To use the model, adapt the following snippet. It loads the model into a `text-generation` pipeline and parses the aspect and sentiment out of the generated completion:
```python
from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

def process_user_prompt(input_sentence):
    # Prompt the model in the "### Human: ... ### Assistant: aspect: " format it was trained on
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    result_dict = process_output(sequences[0]['generated_text'])
    return result_dict

def process_output(output):
    result_dict = {}
    # Extract user_prompt
    user_prompt_start = output.find("### Human:")
    user_prompt_end = output.find("aspect: ") + len("aspect: ")
    result_dict['user_prompt'] = output[user_prompt_start:user_prompt_end].strip()
    # Extract cleared_generated_output (everything up to the first closing parenthesis)
    cleared_output_end = output.find(")")
    result_dict['cleared_generated_output'] = output[:cleared_output_end + 1].strip()
    # Extract review
    human_start = output.find("Human:") + len("Human:")
    assistant_start = output.find("### Assistant:")
    result_dict['review'] = output[human_start:assistant_start].strip()
    # Extract aspect and sentiment
    aspect_start = output.find("aspect: ") + len("aspect: ")
    sentiment_start = output.find("sentiment: ")
    aspect_text = output[aspect_start:sentiment_start].strip()
    result_dict['aspect'] = aspect_text
    sentiment_end = output[sentiment_start:].find(")") + sentiment_start
    sentiment_text = output[sentiment_start + len("sentiment:"):sentiment_end].strip()
    result_dict['sentiment'] = sentiment_text
    return result_dict

output = process_user_prompt('the first thing that attracts attention is the warm reception and the smiling receptionists.')
print(output)
```
## Fine-Tuning Details
Details of the fine-tuning process are available in the [fine-tuning Colab notebook](https://colab.research.google.com/drive/1PQfBsDyM8TSSBchL6PPA4o6rOyLFLUnu?usp=sharing).
**Note:** Ensure that the necessary dependencies (e.g. `transformers`, `torch`, and `accelerate` for `device_map="auto"`) are installed before running the model.