---
datasets:
- sethanimesh/banking_knowledge
base_model:
- meta-llama/Llama-3.1-8B
---

## Model Description

### Overview

This model is a fine-tuned version of **Llama 3.1**, tailored for question-answering tasks. Using the **unsloth** library, the model was trained on a custom dataset formatted in the Alpaca prompt style. It is designed to generate accurate answers, along with explanations, in response to user queries.

### Architecture

- **Base Model**: `unsloth/Meta-Llama-3.1-8B`
- **Model Size**: 8 billion parameters
- **Architecture Type**: Transformer-based language model
- **Modifications**: Fine-tuned on a custom dataset using unsloth with 4-bit quantization for efficient training.

### Hyperparameters

- **Maximum Sequence Length**: 512 tokens
- **Batch Size**: 4 (per device)
- **Gradient Accumulation Steps**: 4
- **Learning Rate**: 2e-4
- **Optimizer**: `adamw_8bit`
- **Weight Decay**: 0.01
- **Learning Rate Scheduler**: Linear
- **Number of Epochs**: 1
- **Warmup Steps**: 5
- **Max Training Steps**: 60
- **Seed**: 3407
- **Mixed Precision Training**:
  - **FP16**: Enabled if BF16 is not supported
  - **BF16**: Enabled if supported by the hardware

## Intended Use

### Primary Use Cases

- **Question Answering**: The model answers user queries and provides explanations grounded in the dataset's domain.
- **Educational Tools**: Can power applications that answer questions with accompanying explanations.

### Users

- **Developers**: Integrating the model into applications that need question-answering capabilities.
- **Researchers**: Studying fine-tuning techniques for large language models.

### Out-of-Scope Uses

- **Undefined Domains**: The model may not perform well on queries outside the scope of the training data.
- **Sensitive Content**: Should not be used to generate disallowed or harmful content.
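The hyperparameters above correspond to a `transformers.TrainingArguments` configuration along these lines. This is a sketch, not the exact training script: the dataset preparation, prompt formatting, and `SFTTrainer` wiring are assumed and omitted.

```python
import torch
from transformers import TrainingArguments

# Sketch of the training arguments implied by the hyperparameters listed above.
# Requires a CUDA GPU (fp16/bf16 mixed precision); intended for use with trl's SFTTrainer.
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    warmup_steps=5,
    max_steps=60,          # max_steps takes precedence over num_train_epochs
    seed=3407,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    output_dir="outputs",  # illustrative output path
)
```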
## Limitations

### Known Issues

- **Generalization**: May not generalize well to questions outside the training data's domain.
- **Biases**: Potential biases inherited from the training data are unknown, as no bias analysis has been performed.

## Ethical Considerations

### Potential Risks

- **Misinformation**: The model may generate incorrect or misleading answers when the input is ambiguous or out of scope.
- **Bias**: Without a bias analysis, the model may exhibit unintended biases present in the training data.

### Mitigation Strategies

- **User Review**: Outputs should be reviewed by a human for critical applications.
- **Further Evaluation**: Conduct bias and fairness assessments before deployment.

## Training and Evaluation Environment

- **Hardware**: Trained on a single NVIDIA Tesla T4 GPU
- **Software and Libraries**:
  - **Python**: 3.8
  - **Transformers**: 4.8
  - **Unsloth**: Version used as per the code snippet
  - **TRL (Transformer Reinforcement Learning)**: Used for `SFTTrainer`
  - **Pandas**: For data handling
- **Training Time**: Approximately 4 hours

## Usage Instructions

### Installation

1. **Clone the Repository**: [If applicable]
2. **Install Dependencies**:

   ```bash
   pip install unsloth transformers trl pandas torch
   ```

### Loading the Model

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None          # auto-detect (float16 on T4, bfloat16 on newer GPUs)
load_in_4bit = True   # 4-bit quantization to reduce memory usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    device_map="auto",
)
```

### Input Format

- **Expected Input**: A user query formatted with the Alpaca-style prompt template used during training.
- **Example**:

```
Below is an instruction that describes a task, paired with an appropriate response.

## Instruction:
User Query: How do block credit card?
### Input: None

### Response:
Answer:
```

### Output Format

- The model generates an answer and an explanation following the prompt.
- **Example Output** (illustrative):

```
Answer: Paris
Explanation: Paris is the capital city of France.
```

### Inference Example

```python
# Assumes `model` and `tokenizer` were loaded as shown above.
input_text = '''Below is an instruction that describes a task, paired with an appropriate response.

## Instruction:
User Query: How do block credit card?

### Input: None

### Response:
Answer:'''

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Contact Information

- **Support Email**: sethanimesh@hotmail.com
- **GitHub Repository**: To be updated
- **Feedback**: Users are encouraged to report issues or provide feedback.

## Acknowledgments

- **Base Model**: This model is built upon `unsloth/Meta-Llama-3.1-8B`.
- **Libraries Used**: Thanks to the developers of Unsloth, Transformers, TRL, and the other libraries that made this work possible.

## Changelog

- **Version 1.0**: Initial release, fine-tuned on a custom question-answering dataset.
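The Alpaca-style prompt used in the inference example above can be assembled with a small helper. This is a sketch: the `build_prompt` function is illustrative and not part of the released code.

```python
def build_prompt(query: str) -> str:
    """Assemble the Alpaca-style prompt used during fine-tuning (illustrative helper)."""
    return (
        "Below is an instruction that describes a task, "
        "paired with an appropriate response.\n\n"
        "## Instruction:\n"
        f"User Query: {query}\n\n"
        "### Input: None\n\n"
        "### Response:\nAnswer:"
    )

# The query string is then tokenized and passed to model.generate() as shown above.
prompt = build_prompt("How do block credit card?")
print(prompt)
```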