---
datasets:
- sethanimesh/banking_knowledge
base_model:
- meta-llama/Llama-3.1-8B
---

## Model Description

### Overview

This model is a fine-tuned version of **Llama 3.1**, tailored for question-answering tasks. Using the **unsloth** library, the model was trained on a custom dataset formatted in the Alpaca prompt style. It is designed to generate accurate answers, along with explanations, in response to user queries.

### Architecture

- **Base Model**: `unsloth/Meta-Llama-3.1-8B`
- **Model Size**: 8 billion parameters
- **Architecture Type**: Transformer-based language model
- **Modifications**: Fine-tuned on a custom dataset using unsloth with 4-bit quantization for efficient training.

### Hyperparameters

- **Maximum Sequence Length**: 512 tokens
- **Batch Size**: 4 (per device)
- **Gradient Accumulation Steps**: 4
- **Learning Rate**: 2e-4
- **Optimizer**: `adamw_8bit`
- **Weight Decay**: 0.01
- **Learning Rate Scheduler**: Linear
- **Number of Epochs**: 1
- **Warmup Steps**: 5
- **Max Training Steps**: 60
- **Seed**: 3407
- **Mixed Precision Training**:
  - **FP16**: Enabled if BF16 is not supported
  - **BF16**: Enabled if supported by the hardware

## Intended Use

### Primary Use Cases

- **Question Answering**: The model answers user queries and provides explanations grounded in the dataset's domain.
- **Educational Tools**: Can power applications that answer questions with accompanying explanations.

### Users

- **Developers**: Integrating the model into applications that need question-answering capabilities.
- **Researchers**: Studying fine-tuning techniques for large language models.

### Out-of-Scope Uses

- **Undefined Domains**: The model may not perform well on queries outside the scope of the training data.
- **Sensitive Content**: Should not be used to generate disallowed or harmful content.
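The hyperparameters above correspond to a `transformers.TrainingArguments` configuration along these lines. This is a sketch, not the exact training script: the dataset preparation, prompt formatting, and `SFTTrainer` wiring are assumed and omitted.

```python
import torch
from transformers import TrainingArguments

# Sketch of the training arguments implied by the hyperparameters listed above.
# Requires a CUDA GPU (fp16/bf16 mixed precision); intended for use with trl's SFTTrainer.
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    warmup_steps=5,
    max_steps=60,          # max_steps takes precedence over num_train_epochs
    seed=3407,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    output_dir="outputs",  # illustrative output path
)
```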
## Limitations

### Known Issues

- **Generalization**: May not generalize well to questions outside the training data's domain.
- **Biases**: Potential biases inherited from the training data are unknown, as no bias analysis has been performed.

## Ethical Considerations

### Potential Risks

- **Misinformation**: The model may generate incorrect or misleading answers when the input is ambiguous or out of scope.
- **Bias**: Without a bias analysis, the model may exhibit unintended biases present in the training data.

### Mitigation Strategies

- **User Review**: Outputs should be reviewed by a human for critical applications.
- **Further Evaluation**: Conduct bias and fairness assessments before deployment.

## Training and Evaluation Environment

- **Hardware**: Trained on a single NVIDIA Tesla T4 GPU
- **Software and Libraries**:
  - **Python**: 3.8
  - **Transformers**: 4.8
  - **Unsloth**: Version used as per the code snippet
  - **TRL (Transformer Reinforcement Learning)**: Used for `SFTTrainer`
  - **Pandas**: For data handling
- **Training Time**: Approximately 4 hours

## Usage Instructions

### Installation

1. **Clone the Repository**: [If applicable]
2. **Install Dependencies**:

   ```bash
   pip install unsloth transformers trl pandas torch
   ```

### Loading the Model

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None          # auto-detect (float16 on T4, bfloat16 on newer GPUs)
load_in_4bit = True   # 4-bit quantization to reduce memory usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    device_map="auto",
)
```

### Input Format

- **Expected Input**: A user query formatted with the Alpaca-style prompt template used during training.
- **Example**:

```
Below is an instruction that describes a task, paired with an appropriate response.

## Instruction:
User Query: How do block credit card?
### Input: None

### Response:
Answer:
```

### Output Format

- The model generates an answer and an explanation following the prompt.
- **Example Output** (illustrative):

```
Answer: Paris
Explanation: Paris is the capital city of France.
```

### Inference Example

```python
# Assumes `model` and `tokenizer` were loaded as shown above.
input_text = '''Below is an instruction that describes a task, paired with an appropriate response.

## Instruction:
User Query: How do block credit card?

### Input: None

### Response:
Answer:'''

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Contact Information

- **Support Email**: sethanimesh@hotmail.com
- **GitHub Repository**: To be updated
- **Feedback**: Users are encouraged to report issues or provide feedback.

## Acknowledgments

- **Base Model**: This model is built upon `unsloth/Meta-Llama-3.1-8B`.
- **Libraries Used**: Thanks to the developers of Unsloth, Transformers, TRL, and the other libraries that made this work possible.

## Changelog

- **Version 1.0**: Initial release, fine-tuned on a custom question-answering dataset.
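The Alpaca-style prompt used in the inference example above can be assembled with a small helper. This is a sketch: the `build_prompt` function is illustrative and not part of the released code.

```python
def build_prompt(query: str) -> str:
    """Assemble the Alpaca-style prompt used during fine-tuning (illustrative helper)."""
    return (
        "Below is an instruction that describes a task, "
        "paired with an appropriate response.\n\n"
        "## Instruction:\n"
        f"User Query: {query}\n\n"
        "### Input: None\n\n"
        "### Response:\nAnswer:"
    )

# The query string is then tokenized and passed to model.generate() as shown above.
prompt = build_prompt("How do block credit card?")
print(prompt)
```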