|
---
datasets:
  - sethanimesh/banking_knowledge
base_model:
  - meta-llama/Llama-3.1-8B
---
|
|
|
## Model Description |
|
|
|
### Overview |
|
|
|
This model is a fine-tuned version of **Llama 3.1 8B**, tailored for question answering in the banking domain. It was trained with the **Unsloth** library on the `sethanimesh/banking_knowledge` dataset, formatted in the Alpaca prompt style, and is designed to generate answers together with short explanations in response to user queries.
|
|
|
### Architecture |
|
|
|
- **Base Model**: `unsloth/Meta-Llama-3.1-8B` |
|
- **Model Size**: 8 Billion parameters |
|
- **Architecture Type**: Decoder-only Transformer language model

- **Modifications**: Fine-tuned on a custom banking question-answering dataset using Unsloth, with 4-bit quantization for memory-efficient training.
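
The adapter configuration is not recorded on this card. Since 4-bit training with Unsloth normally means QLoRA-style parameter-efficient adapters, the sketch below shows a typical setup; the `r`, `lora_alpha`, and `target_modules` values are illustrative assumptions, not the recorded settings.

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model, then attach LoRA adapters.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=512,  # training sequence length (see Hyperparameters)
    load_in_4bit=True,
)

# Adapter settings below are illustrative assumptions, not recorded values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                  # LoRA rank (assumed)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,                         # assumed
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,                     # training seed listed on this card
)
```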
|
|
|
### Hyperparameters |
|
|
|
- **Maximum Sequence Length**: 512 tokens |
|
- **Batch Size**: 4 (per device) |
|
- **Gradient Accumulation Steps**: 4 |
|
- **Learning Rate**: 2e-4 |
|
- **Optimizer**: `adamw_8bit` |
|
- **Weight Decay**: 0.01 |
|
- **Learning Rate Scheduler**: Linear |
|
- **Number of Epochs**: 1 |
|
- **Warmup Steps**: 5 |
|
- **Max Training Steps**: 60 (with `max_steps` set, training stops after 60 optimizer steps even if the epoch is not complete)
|
- **Seed**: 3407 |
|
- **Mixed Precision Training**:

  - **FP16**: Enabled if BF16 is not supported

  - **BF16**: Enabled if supported by the hardware
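
For reference, here is a minimal sketch of how these settings map onto `trl`'s `SFTTrainer`, continuing from the adapter sketch above. `dataset` stands for the Alpaca-formatted training set with a `text` column; its preparation is not shown and is an assumption.

```python
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,                        # PEFT model from the sketch above
    tokenizer=tokenizer,
    train_dataset=dataset,              # Alpaca-formatted dataset (assumed)
    dataset_text_field="text",
    max_seq_length=512,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        num_train_epochs=1,
        max_steps=60,                   # takes precedence over the epoch count
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
trainer.train()
```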
|
|
|
## Intended Use |
|
|
|
### Primary Use Cases |
|
|
|
- **Question Answering**: The model is intended to answer banking-related user queries and provide accompanying explanations, following the style of its training data.
|
- **Educational Tools**: Can be used in applications that require answering questions with additional explanations. |
|
|
|
### Users |
|
|
|
- **Developers**: Integrating the model into applications requiring question-answering capabilities. |
|
- **Researchers**: Studying fine-tuning techniques on large language models. |
|
|
|
### Out-of-Scope Uses |
|
|
|
- **Out-of-Domain Queries**: The model may not perform well on queries outside the banking domain covered by the training data.
|
- **Sensitive Content**: Should not be used for generating content that includes disallowed or harmful information. |
|
## Evaluation Metrics

No quantitative evaluation has been performed: metrics such as Exact Match and F1 were not measured, and no standard benchmark datasets were used. Performance on out-of-domain data is therefore unknown.
|
|
|
## Limitations |
|
|
|
### Known Issues |
|
|
|
- **Generalization**: May not generalize well to questions outside the training data domain. |
|
- **Biases**: No bias analysis has been performed; biases inherited from the training data are therefore unknown.
|
|
|
- **Uncertainty Estimates**: The model does not provide confidence scores or uncertainty estimates for its predictions.
|
## Ethical Considerations |
|
|
|
### Potential Risks |
|
|
|
- **Misinformation**: The model might generate incorrect or misleading answers if the input is ambiguous or out-of-scope. |
|
- **Bias**: Without a bias analysis, there is a risk of the model exhibiting unintended biases present in the training data. |
|
|
|
### Mitigation Strategies |
|
|
|
- **User Review**: Outputs should be reviewed by a human for critical applications. |
|
- **Further Evaluation**: Bias and fairness assessments are recommended before deployment.
|
|
|
## Training and Evaluation Environment |
|
|
|
- **Hardware Used**: A single NVIDIA Tesla T4 GPU

- **Software and Libraries**:

  - **Python**: 3.8

  - **Transformers**: 4.43 or later (required for Llama 3.1 support)

  - **Unsloth**: version as pinned in the training code

  - **TRL**: provides the `SFTTrainer` used for supervised fine-tuning

  - **Pandas**: used for data handling

- **Training Time**: approximately 4 hours
|
|
|
## Usage Instructions |
|
|
|
### Installation |
|
|
|
1. **Clone the Repository** (if applicable)
|
2. **Install Dependencies**: |
|
```bash |
|
pip install unsloth transformers trl pandas torch |
|
``` |
|
|
|
### Loading the Model |
|
|
|
```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # context window for inference (training used 512)
dtype = None           # None lets Unsloth auto-detect (bfloat16 if supported)
load_in_4bit = True    # load quantized weights to reduce memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    device_map="auto",
)
```
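
Before generating, switch the model into inference mode with `FastLanguageModel.for_inference(model)`; this enables Unsloth's optimized inference path (the inference example below includes this call).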
|
|
|
### Input Format |
|
|
|
- **Expected Input**: A user query formatted as per the Alpaca prompt template. |
|
- **Example**: |
|
```
Below is an instruction that describes a task, paired with an appropriate response.

### Instruction:
User Query: How do I block my credit card?

### Input:
None

### Response:
Answer:
```
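
A small helper keeps prompts consistent. The template string below mirrors the example above; if your training run used a different template, adjust it accordingly. `build_prompt` is a hypothetical helper, not part of any library.

```python
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an appropriate response.

### Instruction:
User Query: {query}

### Input:
{context}

### Response:
Answer:"""

def build_prompt(query: str, context: str = "None") -> str:
    """Fill the Alpaca-style template with a user query and optional context."""
    return ALPACA_TEMPLATE.format(query=query, context=context)

prompt = build_prompt("How do I block my credit card?")
```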
|
|
|
### Output Format |
|
|
|
- The model generates the answer and explanation following the prompt. |
|
- **Example Output** (illustrative):

```
Answer: Contact your bank immediately via its customer-care line or mobile app and ask for the card to be blocked.

Explanation: Blocking the card prevents further transactions while a replacement is issued.
```
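
If the answer and explanation are needed as separate fields, here is a minimal parsing sketch, assuming the model follows the labelled format above (`parse_response` is a hypothetical helper):

```python
def parse_response(full_text: str) -> dict:
    """Extract answer and explanation fields from a decoded generation.

    Assumes the Alpaca prompt shown above, where the completion follows the
    final "### Response:" marker and uses "Answer:" / "Explanation:" labels.
    """
    completion = full_text.rsplit("### Response:", 1)[-1]
    answer, _, explanation = completion.partition("Explanation:")
    return {
        "answer": answer.replace("Answer:", "", 1).strip(),
        "explanation": explanation.strip(),
    }
```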
|
|
|
### Inference Example |
|
|
|
```python
from unsloth import FastLanguageModel

# Switch to Unsloth's faster inference mode before generating.
FastLanguageModel.for_inference(model)

input_text = '''Below is an instruction that describes a task, paired with an appropriate response.

### Instruction:
User Query: How do I block my credit card?

### Input:
None

### Response:
Answer:'''

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
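
For interactive use you may prefer to stream tokens as they are generated; a short sketch using `transformers.TextStreamer`, reusing `inputs` from the example above:

```python
from transformers import TextStreamer

# Stream tokens to stdout as they are generated, skipping the prompt text.
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=50)
```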
|
|
|
## Contact Information |
|
|
|
- **Support Email**: [email protected] |
|
- **GitHub Repository**: To be updated |
|
- **Feedback**: Users are encouraged to report issues or provide feedback. |
|
|
|
## Acknowledgments |
|
|
|
- **Base Model**: This model is built upon `unsloth/Meta-Llama-3.1-8B`. |
|
- **Libraries Used**: Thanks to the developers of Unsloth, Transformers, TRL, and other libraries that made this work possible. |
|
|
|
## Changelog |
|
|
|
- **Version 1.0**: Initial release with fine-tuning on custom question-answering dataset. |
|
|
|
|