---
datasets:
- fbnhnsl/FIM_Solidity_Dataset
language:
- en
metrics:
- bleu
- meteor
base_model:
- deepseek-ai/deepseek-coder-1.3b-base
pipeline_tag: text-generation
tags:
- code
license: mit
---

This is a fine-tuned version of deepseek-coder-1.3b-base for automatic completion of Solidity code. The model was fine-tuned with Quantized Low-Rank Adaptation (QLoRA), a Parameter-Efficient Fine-Tuning (PEFT) method, on a Fill-in-the-Middle (FIM) transformed dataset consisting of Solidity constructs (functions, modifiers, mappings, etc.). The model has a maximum sequence length of 256 tokens.

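To illustrate what a FIM transformation does, here is a minimal sketch that cuts a code snippet into prefix, middle and suffix and arranges them around sentinel tokens. The sentinel tokens are the ones that appear in the usage example in this card. The helper `fim_transform`, the uniformly random cut points and the choice to return the middle as a separate target are assumptions made for illustration, not necessarily how the dataset was built.

```python
import random

# Sentinel tokens as used in the usage example of this model card.
FIM_BEGIN, FIM_HOLE, FIM_END = "<|fim_begin|>", "<|fim_hole|>", "<|fim_end|>"

def fim_transform(code, rng):
    """Split `code` at two random cut points and return (prompt, middle).

    Hypothetical helper: the cut points are drawn uniformly over
    character positions, which is an assumption for illustration.
    """
    start, end = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    # The model sees prefix and suffix and has to produce the middle.
    prompt = f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"
    return prompt, middle

snippet = (
    "function add(uint256 a, uint256 b) internal pure returns (uint256) {\n"
    "    return a + b;\n"
    "}"
)
prompt, middle = fim_transform(snippet, random.Random(0))
print(prompt + middle)
```

Putting prefix, middle and suffix back together in that order reproduces the original snippet, so the transformation loses no information.
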
General fine-tuning information:

- Epochs: 2
- Optimizer: paged AdamW 8-bit
- Batch size: 8
- LoRA target modules: ["q_proj", "o_proj", "k_proj", "v_proj"]
- Quantization type: 4-bit NormalFloat (NF4)
- QLoRA compute type: bfloat16 (brain float 16-bit)
- Total time: 1 hour 23 minutes
- Accelerator: NVIDIA L4 Tensor Core GPU

Some of the hyperparameters were determined by hyperparameter optimization with Ray Tune. The values from the best trial were:

- Learning rate: 0.00016
- Weight decay: 0.0534
- Warmup steps: 100
- Gradient accumulation steps: 2
- LoRA rank: 64
- LoRA alpha: 64
- LoRA dropout: 0.0934665

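Two derived quantities follow from the values listed above and can be worked out with simple arithmetic. The sketch below assumes training on a single GPU and the standard LoRA scaling factor of alpha divided by rank; the variable names are illustrative only.

```python
# Values copied from the hyperparameter lists above.
batch_size = 8
gradient_accumulation_steps = 2
lora_rank = 64
lora_alpha = 64

# Assumption: a single GPU, so one optimizer step sees
# batch_size * gradient_accumulation_steps sequences.
effective_batch_size = batch_size * gradient_accumulation_steps

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

print(effective_batch_size)  # 16
print(lora_scaling)          # 1.0
```

With rank and alpha both set to 64, the low-rank update is added to the frozen weights without extra up- or down-scaling.
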
The fine-tuning results are:

- Training loss: ~0.7
- Validation loss: ~0.75

The model was evaluated on the test split and compared with the base model. The metrics used were perplexity, BLEU and METEOR. The perplexity results are:

- Perplexity of the base model: 12.08
- Perplexity of the fine-tuned model: 2.19

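For readers unfamiliar with the metric, perplexity is commonly defined as the exponential of the mean negative log-likelihood per token. The sketch below assumes that definition (with losses in nats); the helper `perplexity` and the sample loss values are hypothetical and not taken from the actual evaluation.

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token losses, for illustration only.
print(round(perplexity([0.5, 1.0, 0.75]), 2))  # 2.12

# Mean per-token loss implied by the perplexities reported above.
print(round(math.log(12.08), 2))  # 2.49 (base model)
print(round(math.log(2.19), 2))   # 0.78 (fine-tuned model)
```

Under this definition, the fine-tuned model's perplexity of 2.19 corresponds to a mean loss of about 0.78 per token, which is in the same range as the validation loss of ~0.75 listed above.
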
The following code shows an example of how to use the model:

```python
# Load the fine-tuned model
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

pretrained_checkpoint = 'deepseek-ai/deepseek-coder-1.3b-base'
finetuned_checkpoint = 'path/to/model'

# Load the tokenizer that was saved with the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(finetuned_checkpoint)

# Load the base model and resize its embeddings to match the tokenizer
old_model = AutoModelForCausalLM.from_pretrained(pretrained_checkpoint)
old_model.resize_token_embeddings(len(tokenizer))

# Attach the fine-tuned adapter weights to the base model
finetuned_model = PeftModel.from_pretrained(old_model, finetuned_checkpoint).to(device)

# ----------------------------------------------------------------------------
# General automatic code completion
code_example = '''<|secure_function|>\tfunction add('''

model_inputs = tokenizer(code_example, return_tensors="pt").to(device)

input_ids = model_inputs["input_ids"]
attention_mask = model_inputs["attention_mask"]

# Pass input_ids by keyword; some peft versions reject positional arguments
generated_ids = finetuned_model.generate(input_ids=input_ids,
                                         attention_mask=attention_mask,
                                         do_sample=True,
                                         max_length=256,
                                         num_beams=4,
                                         temperature=0.3,
                                         pad_token_id=tokenizer.eos_token_id)

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])

# Example output (sampling is enabled, so results may vary):
# function add(uint256 a, uint256 b) internal pure returns (uint256) {
#     return a + b;
# }

# ----------------------------------------------------------------------------
# Fill-in-the-middle
def generate_fim(prefix, suffix, model, tokenizer, max_length=256):
    # Wrap prefix and suffix in the FIM sentinel tokens
    input_text = f"<|fim_begin|>{prefix}<|fim_hole|>{suffix}<|fim_end|>"
    inputs = tokenizer.encode(input_text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        input_ids=inputs,
        max_length=max_length,
        num_beams=8,
        temperature=0.3,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens (the filled-in middle)
    middle = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
    return prefix + middle + suffix

prefix = '''pragma solidity ^0.8.0;\n\n'''

suffix = '''\n\ncontract FOO is Context, IERC20, Ownable {'''

print(generate_fim(prefix, suffix, finetuned_model, tokenizer))

# Example output (sampling is enabled, so results may vary):
# pragma solidity ^0.8.0;
#
# import "@openzeppelin/contracts/utils/Context.sol" as Context;
# import "@openzeppelin/contracts/interfaces/IERC20.sol" as IERC20;
# import "@openzeppelin/contracts/access/Ownable.sol" as Ownable;
#
# contract FOO is Context, IERC20, Ownable {
```

If you use this model, you can cite it as follows:

```bibtex
@misc{hensel2025fim_model,
    title = {Finetuned deepseek-coder-1.3b-base model for automatic code completion of Solidity code},
    author = {Fabian Hensel},
    year = {2025}
}
```