Finetuned Qwen2.5-Coder-0.5B model on 100,000 rows of a custom dataset containing git diffs and their corresponding commit messages.

Each row of the dataset was formatted as shown below to match the prompt format required for finetuning the Qwen2.5-Coder model:

```
### Instruction:
Generate a concise and meaningful commit message based on the provided git diff.

### Git Diff:
{a given git diff from the dataset row}

### Commit Message:
Adding the squeezing in the cost function<|im_end|>
```
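For reference, a minimal sketch of how one dataset row could be mapped to this training prompt is shown below; the column names `diff` and `message` are assumptions for illustration and may not match the actual dataset schema.

```python
# Minimal sketch: build the finetuning prompt string for one dataset row.
# NOTE: the keys "diff" and "message" are assumed names for illustration only.
def build_training_example(row: dict) -> str:
    return (
        "### Instruction:\n"
        "Generate a concise and meaningful commit message based on the provided git diff.\n\n"
        "### Git Diff:\n"
        f"{row['diff']}\n\n"
        "### Commit Message:\n"
        f"{row['message']}<|im_end|>"
    )
```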

Code for running inference with the GGUF model using llama-cpp-python is given below.

```python
from llama_cpp import Llama

# Configuration
gguf_model_path = "qwen0.5-finetuned.gguf"  # Path to your GGUF file

# Define the commit message prompt (Minimal format, avoids assistant behavior)
commit_prompt = """Generate a meaningful commit message explaining all the changes in the provided Git diff.

### Git Diff:
{}

### Commit Message:"""  # Removed {} after "Commit Message:" to prevent pre-filled text.

# Git diff example for commit message generation
git_diff_example = """
diff --git a/index.html b/index.html
index 89abcde..f123456 100644
--- a/index.html
+++ b/index.html
@@ -5,16 +5,6 @@ <body>
     <h1>Welcome to My Page</h1>

-    <table border="1">
-        <tr>
-            <th>Name</th>
-            <th>Age</th>
-        </tr>
-        <tr>
-            <td>John Doe</td>
-            <td>30</td>
-        </tr>
-    </table>

+    <p>This is a newly added paragraph replacing the table.</p>
 </body>
</html>
"""

# Load the GGUF model with increased context size (32768)
modelGGUF = Llama(
    model_path=gguf_model_path,
    rope_freq_scale=0.5,  # Linear RoPE scaling with factor 2.0 (llama.cpp takes freq scale = 1 / factor)
    chat_format=None,  # No chat template; the prompt is passed as plain text
    n_ctx=32768,  # Set the context size explicitly
)

# Prepare the raw input prompt
input_prompt = commit_prompt.format(git_diff_example)

# Generate commit message
output = modelGGUF(
    input_prompt,
    max_tokens=64,
    temperature=0.6,  # Balanced randomness
    top_p=0.8,        # Nucleus sampling threshold
    top_k=50,         # Limit sampling to the 50 most likely tokens
    stop=["<|im_end|>"],  # Stop at the end-of-turn token the model was trained to emit
)

# Extract and print the generated commit message
commit_message = output["choices"][0]["text"].strip()

print("\nGenerated Commit Message:\n{}".format(commit_message))
```
Model format: GGUF
Model size: 494M params
Architecture: qwen2
Base model: Qwen/Qwen2.5-0.5B (seniruk/qwen2.5coder-0.5B_commit_msg is a quantized finetune of this base)
Dataset used to train: the custom git-diff / commit-message dataset described above
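If you want to run the inference script above against the hosted weights, here is a hedged sketch of downloading the GGUF file from the Hub with `huggingface_hub`; the filename is taken from the script above and may differ from the actual file name in this repository.

```python
from huggingface_hub import hf_hub_download

# Sketch: fetch the GGUF file from the Hub and reuse the returned local path
# as gguf_model_path in the inference script above.
# NOTE: the filename is assumed from the script above; check the repo's file list.
gguf_model_path = hf_hub_download(
    repo_id="seniruk/qwen2.5coder-0.5B_commit_msg",
    filename="qwen0.5-finetuned.gguf",
)
```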