Commit ba9573a (verified) by nielsr (HF Staff) · 1 parent: 48d7755

Improve model card: Add pipeline tag, language, paper, project, code, and usage

This PR improves the model card for `vulnerability-severity-classification-chinese-macbert-base` by:

* Adding `pipeline_tag: text-classification` and `language: zh` to the metadata for improved discoverability and accurate filtering on the Hugging Face Hub.
* Including more descriptive `tags` such as `text-classification`, `classification`, `nlp`, `chinese`, and `vulnerability`.
* Updating the main title to `VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification` to align with the associated research paper.
* Adding a clear description of the model based on the paper's abstract.
* Providing direct links to the paper ([VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification](https://huggingface.co/papers/2507.03607)), the project page (`https://vulnerability.circl.lu`), and the associated GitHub repository (`https://github.com/vulnerability-lookup/ML-Gateway`).
* Adding a practical Python sample usage snippet using the `transformers` library, including an example with Chinese text.
* Removing the auto-generated comment as the model card has now been manually improved.

Together, these changes give users the context (paper, project, and code links) and a runnable example they need to find and use the model.
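
As a rough way to check the metadata changes listed above, the sketch below parses a README's front matter and reports whether `pipeline_tag`, `language`, and the new tags are present. It is a deliberately simplified, standard-library-only parser that assumes flat `key: value` pairs and `- item` lists; it is not a full YAML implementation. The names `parse_front_matter` and `SAMPLE_README` are hypothetical and chosen for illustration, and the sample text is an excerpt of the metadata added in this PR.

```python
def parse_front_matter(readme_text):
    """Parse a flat, YAML-like front matter block delimited by '---' lines.

    Simplified on purpose: only top-level `key: value` pairs and `- item`
    lists are handled; indented (nested) lines are skipped.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    metadata = {}
    current_list = None  # list currently being filled, if any
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        if line.startswith("- "):
            if current_list is not None:
                current_list.append(line[2:].strip())
            continue
        if not line or line[0].isspace() or ":" not in line:
            continue  # blank or nested line: ignored by this sketch
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            metadata[key.strip()] = value
            current_list = None
        else:
            # A key with no inline value starts a list
            current_list = []
            metadata[key.strip()] = current_list
    return metadata


# Excerpt of the front matter introduced by this PR (illustrative sample)
SAMPLE_README = """---
base_model: hfl/chinese-macbert-base
datasets:
- CIRCL/Vulnerability-CNVD
library_name: transformers
license: apache-2.0
metrics:
- accuracy
tags:
- generated_from_trainer
- text-classification
- classification
- nlp
- chinese
- vulnerability
pipeline_tag: text-classification
language: zh
---

# Model card body
"""

metadata = parse_front_matter(SAMPLE_README)
print(metadata["pipeline_tag"], metadata["language"])
print(metadata["tags"])
```

For real model cards, a proper YAML parser would be a safer choice than this hand-rolled one, since front matter can contain nested structures such as `model-index`.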

Files changed (1)
  1. README.md +37 -10
README.md CHANGED
@@ -1,31 +1,58 @@
  ---
+ base_model: hfl/chinese-macbert-base
+ datasets:
+ - CIRCL/Vulnerability-CNVD
  library_name: transformers
  license: apache-2.0
- base_model: hfl/chinese-macbert-base
- tags:
- - generated_from_trainer
  metrics:
  - accuracy
+ tags:
+ - generated_from_trainer
+ - text-classification
+ - classification
+ - nlp
+ - chinese
+ - vulnerability
+ pipeline_tag: text-classification
+ language: zh
  model-index:
  - name: vulnerability-severity-classification-chinese-macbert-base
    results: []
- datasets:
- - CIRCL/Vulnerability-CNVD
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
+ # VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification

- # vulnerability-severity-classification-chinese-macbert-base
+ This model, named **VLAI**, is a fine-tuned version of [hfl/chinese-macbert-base](https://huggingface.co/hfl/chinese-macbert-base) on the dataset [CIRCL/Vulnerability-CNVD](https://huggingface.co/datasets/CIRCL/Vulnerability-CNVD).

- This model is a fine-tuned version of [hfl/chinese-macbert-base](https://huggingface.co/hfl/chinese-macbert-base) on the dataset [CIRCL/Vulnerability-CNVD](https://huggingface.co/datasets/CIRCL/Vulnerability-CNVD).
+ The model was presented in the paper [VLAI: A RoBERTa-Based Model for Automated Vulnerability Severity Classification](https://huggingface.co/papers/2507.03607).

- You can read [this page](https://www.vulnerability-lookup.org/user-manual/ai/) for more information.
+ **Abstract:** VLAI is a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.
+
+ For more information, visit the [Vulnerability-Lookup project page](https://vulnerability.circl.lu) or the [ML-Gateway GitHub repository](https://github.com/vulnerability-lookup/ML-Gateway), which demonstrates its usage in a FastAPI server.

  It achieves the following results on the evaluation set:
  - Loss: 0.5994
  - Accuracy: 0.7900

+ ## How to use
+
+ You can use this model directly with the Hugging Face `transformers` library for text classification:
+
+ ```python
+ from transformers import pipeline
+
+ classifier = pipeline(
+     "text-classification",
+     model="CIRCL/vulnerability-severity-classification-chinese-macbert-base"
+ )
+
+ # Example usage for a Chinese vulnerability description
+ description_chinese = "TOTOLINK A3600R是中国吉翁电子(TOTOLINK)公司的一款6天线1200M无线路由器。TOTOLINK A3600R存在缓冲区溢出漏洞,该漏洞源于/cgi-bin/cstecgi.cgi文件的UploadCustomModule函数中的File参数未能正确验证输入数据的长度大小,攻击者可利用该漏洞在系统上执行任意代码或者导致拒绝服务。"
+ result_chinese = classifier(description_chinese)
+ print(result_chinese)
+ # Expected output example: [{'label': '高', 'score': 0.9802}]
+ ```
+
  ## Training procedure

  ### Training hyperparameters
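
The usage snippet added in this diff classifies one description at a time. A minimal sketch of how its output could feed a triage step is shown below, assuming the classifier is called with a list of strings and returns one `{'label': ..., 'score': ...}` dict per input, which is the default output shape of a `transformers` text-classification pipeline. The `triage` helper, the `min_score` threshold of 0.8, and the `fake_classifier` stand-in (with its made-up labels and scores) are all hypothetical and exist only so the example runs offline; they are not part of the model card or the ML-Gateway code.

```python
def triage(descriptions, classifier, min_score=0.8):
    """Split predictions into confident ones and ones needing manual review.

    `classifier` is any callable that takes a list of strings and returns
    one {'label': ..., 'score': ...} dict per input.
    """
    predictions = classifier(descriptions)
    accepted, needs_review = [], []
    for description, prediction in zip(descriptions, predictions):
        record = {
            "description": description,
            "label": prediction["label"],
            "score": prediction["score"],
        }
        if prediction["score"] >= min_score:
            accepted.append(record)
        else:
            needs_review.append(record)
    return accepted, needs_review


def fake_classifier(texts):
    # Offline stand-in for the real pipeline; labels and scores are invented.
    return [
        {"label": "高", "score": 0.98} if "溢出" in text
        else {"label": "中", "score": 0.55}
        for text in texts
    ]


accepted, needs_review = triage(
    ["存在缓冲区溢出漏洞", "存在信息泄露漏洞"],
    fake_classifier,
)
print("accepted:", accepted)
print("needs review:", needs_review)
```

To try this with the real model, the `classifier` object from the README snippet could be passed in place of `fake_classifier`; the threshold would need tuning against held-out data rather than being taken from this sketch.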