Llama-3.2-3B Finetuned Model

1. Introduction

This model is a finetuned version of the Llama-3.2-3B large language model, trained specifically to provide detailed and accurate responses to university course-related queries. It offers information on course details, fee structures, durations, and campus options, along with links to the corresponding course pages. Finetuning on a tailored dataset ensured domain-specific accuracy.


GGUF Model:

This is a GGUF build of the model, intended for running offline with Ollama. A Modelfile is also provided so the model can be hosted and run locally with Ollama.
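As a minimal sketch, a Modelfile for Ollama might look like the following; the GGUF filename below is an assumption and should be replaced with the actual file shipped in this repository:

```
FROM ./llama-3.2-3b-finetune.gguf
PARAMETER temperature 0.7
```

The model can then be registered and run locally with `ollama create course-qa -f Modelfile` followed by `ollama run course-qa` (the model name `course-qa` is illustrative).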

2. Dataset Used for Finetuning

The Llama-3.2-3B model was finetuned on a private dataset obtained through web scraping. Data was collected from the University of Westminster website and included:

  • Course titles
  • Campus details
  • Duration options (full-time, part-time, distance learning)
  • Fee structures (for UK and international students)
  • Course descriptions
  • Direct links to course pages

This dataset was carefully cleaned and formatted to enhance the model's ability to provide precise responses to user queries.
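Since the dataset is private, the sketch below only illustrates what a cleaned record and its conversion into a chat-style training pair could look like; all field names and values are hypothetical, not taken from the actual dataset.

```python
# Hypothetical example of a cleaned course record; the real dataset is
# private, so every field name and value here is illustrative only.
course = {
    "title": "AI, Data and Communication MA",
    "campus": "Harrow",
    "duration": {"full-time": "1 year", "part-time": "2 years"},
    "fees": {"uk": "...", "international": "..."},
    "url": "https://www.westminster.ac.uk/...",
}

def to_training_pair(course):
    """Flatten a course record into a user/assistant chat pair."""
    question = f"Tell me about the {course['title']} course."
    answer = (
        f"The {course['title']} is taught at the {course['campus']} campus. "
        f"Duration: {course['duration']['full-time']} full-time. "
        f"More details: {course['url']}"
    )
    return [{"role": "user", "content": question},
            {"role": "assistant", "content": answer}]
```

Pairs in this shape can be fed directly to `tokenizer.apply_chat_template` during training.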


3. How to Use This Model

To use the Llama-3.2-3B finetuned model, follow the steps below:

from transformers import TextStreamer

# Load your finetuned model and tokenizer first, e.g. with transformers'
# AutoModelForCausalLM / AutoTokenizer or with Unsloth.

def chatml(question, model, tokenizer):
    messages = [{"role": "user", "content": question}]

    # Build the prompt with the model's chat template and move it to the GPU.
    inputs = tokenizer.apply_chat_template(messages,
                                           tokenize=True,
                                           add_generation_prompt=True,
                                           return_tensors="pt").to("cuda")

    print(tokenizer.decode(inputs[0]))

    # Stream generated tokens to stdout as they are produced.
    text_streamer = TextStreamer(tokenizer, skip_special_tokens=True,
                                 skip_prompt=True)
    return model.generate(input_ids=inputs,
                          streamer=text_streamer,
                          max_new_tokens=512)


# Use the following example to test the model:
question = "Does the University of Westminster offer a course on AI, Data and Communication MA?"
x = chatml(question, model, tokenizer)

With this setup you can query the finetuned model and receive detailed, relevant responses.


Uploaded model

  • Developed by: roger33303
  • License: apache-2.0
  • Finetuned from model: unsloth/Llama-3.2-3B-Instruct

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model details

  • Format: GGUF
  • Model size: 3.21B params
  • Architecture: llama
  • Precision: 16-bit

