I fine-tuned a RobertaForSequenceClassification model, initialized from CodeBERT [https://huggingface.co/microsoft/codebert-base], to classify whether a piece of code is vulnerable. I selected balanced samples from the MSR dataset [https://github.com/ZeoVan/MSR_20_Code_vulnerability_CSV_Dataset] for training, validation, and testing. The "func_before" field is used as the input code for classification. All the data is in the file "msr.csv".
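
For readers who want to build a similar balanced subset, here is a minimal sketch of one common approach: downsampling every class to the size of the smallest one. It is not necessarily the procedure used for "msr.csv". The label column name "vul", the helper names `load_rows` and `balanced_sample`, and the fixed seed are assumptions made for illustration; only "func_before" and "msr.csv" come from the description above.

```python
import csv
import random
from collections import defaultdict


def load_rows(path="msr.csv"):
    """Read the CSV into a list of dicts (one dict per function)."""
    # Source-code fields can be very long, so raise the default field size limit.
    csv.field_size_limit(10**9)
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def balanced_sample(rows, label_key="vul", seed=0):
    """Downsample each class to the size of the smallest class.

    `label_key="vul"` is an assumed column name, not confirmed by the model card.
    """
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible

    sample = []
    for label in sorted(by_label):
        sample.extend(rng.sample(by_label[label], n))
    rng.shuffle(sample)  # avoid having all rows of one class grouped together
    return sample
```

Calling `balanced_sample(load_rows("msr.csv"))` would then return an equal number of rows per label, from which the "func_before" text and the label can be taken for training, validation, and test splits.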

Test Results: accuracy 0.7022935779816514, F1 0.6482384823848238, precision 0.7920529801324503, recall 0.5486238532110091
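
As a quick sanity check on these numbers, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The sketch below does only that; the helper name `f1_score` is illustrative and is not taken from the training code.

```python
# Reported test metrics from this model card.
precision = 0.7920529801324503
recall = 0.5486238532110091


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.6482
```

The recomputed value agrees with the reported F1. The gap between precision and recall suggests the model flags vulnerable code conservatively: most of its positive predictions are correct, but it misses a little under half of the vulnerable samples.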

Model size: 125M params (Safetensors, tensor type F32)