# Model Card for Stefan Zweig Language Model
This model is a fine-tuned version of ibm-granite/granite-3.1-2b-instruct. It has been trained using TRL.
## Model Details
This model is designed to emulate Stefan Zweig's distinctive writing and conversational style in a chat format. It was fine-tuned following the methodology described in the DeepSeek-V3 technical report, using a two-stage training process: Supervised Fine-Tuning (SFT) followed by Group Relative Policy Optimization (GRPO).
## Quick start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("Chan-Y/Stefan-Zweig-Granite", device_map=device)
tokenizer = AutoTokenizer.from_pretrained("Chan-Y/Stefan-Zweig-Granite")

input_text = "As an experienced and famous writer Stefan Zweig, what's your opinion on artificial intelligence?"
inputs = tokenizer(input_text, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=512,
        num_return_sequences=1,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Decode the generated text and print only the continuation after the prompt
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text.split(input_text)[-1])
```
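Granite instruct models ship with a chat template, so the prompt can also be built with `tokenizer.apply_chat_template`. The snippet below is a minimal sketch that assumes the fine-tuned tokenizer retains the base model's chat template; the sampling parameters mirror the example above and can be adjusted.

```python
# Sketch: prompt via the tokenizer's chat template (assumes the fine-tuned
# tokenizer retains the base model's template).
messages = [
    {"role": "user", "content": "As Stefan Zweig, what is your opinion on artificial intelligence?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```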
## Training procedure
- Dataset: Custom synthetic dataset generated using argilla/synthetic-data-generator with Qwen2.5:14b
- Data Format: Structured conversations with specific role markers and custom tokens
- Data Processing: Implementation of special tokens for style consistency
- Training Type: Two-stage training pipeline (see the sketch after this list)
  - Supervised Fine-Tuning (SFT)
  - Group Relative Policy Optimization (GRPO)
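The exact training scripts are not included in this repository; the sketch below shows how the two stages map onto TRL's `SFTTrainer` and `GRPOTrainer` (available as of TRL 0.14). The example conversation, the style reward, the output paths, and the hyperparameters are illustrative placeholders, not the actual training setup.

```python
# Minimal sketch of the two-stage pipeline with TRL.
# All data, rewards, and hyperparameters below are placeholders.
from datasets import Dataset
from trl import SFTConfig, SFTTrainer, GRPOConfig, GRPOTrainer

base_model = "ibm-granite/granite-3.1-2b-instruct"

# Stage 1: Supervised Fine-Tuning on chat-formatted conversations.
sft_dataset = Dataset.from_list([
    {
        "messages": [
            {"role": "user", "content": "What draws you to write biographies?"},
            {"role": "assistant", "content": "It is the quiet drama of a single life..."},
        ]
    },
])
sft_trainer = SFTTrainer(
    model=base_model,
    args=SFTConfig(output_dir="zweig-sft", max_seq_length=1024),
    train_dataset=sft_dataset,
)
sft_trainer.train()

# Stage 2: GRPO on prompts, scored by a style reward.
def style_reward(completions, **kwargs):
    # Placeholder reward: favors longer, more elaborate answers.
    # A real setup would score stylistic similarity to Zweig's prose.
    return [min(len(c), 2000) / 2000 for c in completions]

grpo_dataset = Dataset.from_list([
    {"prompt": "As Stefan Zweig, reflect on the nature of exile."},
])
grpo_trainer = GRPOTrainer(
    model="zweig-sft",  # continue from the SFT checkpoint
    reward_funcs=style_reward,
    args=GRPOConfig(output_dir="zweig-grpo", num_generations=4),
    train_dataset=grpo_dataset,
)
grpo_trainer.train()
```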
## Framework versions
- TRL: 0.14.0.dev0
- Transformers: 4.48.1
- Pytorch: 2.5.1+cu124
- Datasets: 3.2.0
- Tokenizers: 0.21.0