---
license: mit
tags:
- Text Classification
- Transformers
- PyTorch
- JAX
- MSR
- English
- roberta
- Inference Endpoints
metrics:
- accuracy
pipeline_tag: text-classification
---
|
|
|
I fine-tuned a `RobertaForSequenceClassification` model, initialized from [CodeBERT](https://huggingface.co/microsoft/codebert-base), to judge whether a piece of code is vulnerable or not. I selected balanced samples from the [MSR dataset](https://github.com/ZeoVan/MSR_20_Code_vulnerability_CSV_Dataset) for training, validation, and testing. The `func_before` field is used as the input code for classification. All the data is in the file `msr.csv`.
|
|
|
Test results: accuracy 0.7023, F1 0.6482, precision 0.7921, recall 0.5486.
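
For reference, the sketch below shows how these metrics can be computed from test-set predictions with scikit-learn; `y_true` and `y_pred` are hypothetical stand-ins for the real labels and model outputs, assuming binary labels with 1 = vulnerable.

```python
# Sketch of the metric computation over hypothetical labels/predictions.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0]  # hypothetical ground-truth labels (1 = vulnerable)
y_pred = [1, 0, 0, 1, 0, 1]  # hypothetical model predictions

print("accuracy :", accuracy_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
```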