Qwen2-1.5B-Instruct fine-tuned for 2 epochs on my own synthetic data for the summarization task.

More information about the project is on my GitHub: https://github.com/thepowerfuldeez/qwen2_1_5b_summarize

Usage

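import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
# 4-bit loading needs the bitsandbytes package and a CUDA GPU;
# flash_attention_2 needs the flash-attn package and a half-precision dtype.
model = AutoModelForCausalLM.from_pretrained(
    "thepowerfuldeez/Qwen2-1.5B-Summarize",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

text = "<YOUR_TEXT>"  # replace with the text to summarize
messages = [
    {"role": "system", "content": "You are helpful AI assistant."},
    {"role": "user", "content": f"Summarize following text: \n{text}"},
]
# add_generation_prompt=True appends the assistant turn header so the model starts its reply
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# Keep only the newly generated tokens (drop the prompt)
new_tokens = model.generate(input_ids, max_new_tokens=1024)[0][input_ids.shape[1]:]
summary = tokenizer.decode(new_tokens, skip_special_tokens=True)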

Dataset

Train split is here

Metrics

BERTScore

| Model name | Dataset size | Result |
|---|---|---|
| Qwen2-1.5B-Instruct | - | 0.07 |
| Qwen2-1.5B-Summarize | 8000 | 0.14 |
| Qwen2-1.5B-Summarize | 20500 | In progress |

I computed BERTScore with the official implementation, using the microsoft/deberta-xlarge-mnli model. I sampled 32 inputs from the test set (longer texts to summarize) and generated a summary for each. The reference summaries were generated by a stronger model, Qwen2-72B-Instruct, and served as the targets for the metric.
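As a rough illustration of the evaluation described above, here is a minimal sketch that averages BERTScore F1 over generated summaries and their references. It assumes the `bert-score` package (`pip install bert-score`); the helper name `mean_bertscore_f1` and its `score_fn` parameter are hypothetical and not taken from the project code. Averaging F1 is also an assumption, since the exact settings behind the numbers in the table (for example, which of precision, recall, or F1 was reported, or whether baseline rescaling was enabled) are not stated here.

```python
def mean_bertscore_f1(candidates, references, score_fn=None):
    """Average BERTScore F1 between generated and reference summaries.

    Hypothetical helper for illustration; not the project's evaluation script.
    `score_fn` can be swapped for a stub so the function runs without
    downloading the scoring model.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if score_fn is None:
        # Imported lazily because the first call downloads a large model.
        from bert_score import score as score_fn
    # bert_score.score returns (precision, recall, F1) tensors, one value per pair.
    _, _, f1 = score_fn(
        candidates,
        references,
        model_type="microsoft/deberta-xlarge-mnli",
    )
    return float(f1.mean())
```

Called as `mean_bertscore_f1(generated_summaries, reference_summaries)`, where both arguments are equal-length lists of strings, it would return a single averaged score for the sampled inputs.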

Built with Axolotl
