ibm-research/CTI-BERT

Generated from Trainer


Youngja Park committed on Jan 17

Commit a2cd454 · verified · 1 Parent(s): b12679f

Update README.md

Files changed (1)

README.md +43 -3


@@ -1,3 +1,43 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+metrics:
+- accuracy
+- bertscore
+base_model:
+- google-bert/bert-base-uncased
+pipeline_tag: text-classification
+---
+CTI-BERT is a pre-trained BERT model for the cybersecurity domain, especially for cyber-threat intelligence extraction and understanding.
+The model was trained on a security text corpus containing about 1.2 billion words.
+The corpus includes security news articles, vulnerability descriptions, books, academic publications, Wikipedia pages, and other sources.
+The model has shown improved performance on various cybersecurity text classification tasks.
+However, it is not intended to be used as the main model for general-domain documents.
+For more details, please refer to [this paper](https://aclanthology.org/2023.emnlp-industry.12.pdf).
+#### Model description
+The model has a vocabulary of 50,000 tokens and a sequence length of 256.
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 128
+- eval_batch_size: 128
+- seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 2048
+- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 10000
+- training_steps: 200000
+#### Framework versions
+- Transformers 4.18.0
+- PyTorch 1.12.1+cu102
+- Datasets 2.4.0
+- Tokenizers 0.12.1
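
The hyperparameters added in this README can be sanity-checked with a little arithmetic. The sketch below is not taken from the commit. It assumes a single training device (the README does not state the device count) and assumes that `lr_scheduler_type: linear` means a linear warmup to the peak learning rate followed by a linear decay to zero, which is how the Transformers linear schedule behaves. The function names `effective_batch_size` and `linear_warmup_lr` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the README's hyperparameter list.
LEARNING_RATE = 5e-4        # learning_rate
TRAIN_BATCH_SIZE = 128      # train_batch_size
GRAD_ACCUM_STEPS = 16       # gradient_accumulation_steps
WARMUP_STEPS = 10_000       # lr_scheduler_warmup_steps
TRAINING_STEPS = 200_000    # training_steps


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update.

    num_devices=1 is an assumption; the README does not give the device count.
    """
    return per_device * accum_steps * num_devices


def linear_warmup_lr(step: int) -> float:
    """Learning rate at a given optimizer step (assumed schedule shape).

    Rises linearly from 0 to LEARNING_RATE over WARMUP_STEPS, then
    decays linearly to 0 at TRAINING_STEPS.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 5_000, 10_000, 105_000, 200_000):
        print(step, linear_warmup_lr(step))
```

Under the single-device assumption, 128 × 16 reproduces the listed `total_train_batch_size` of 2048, and the learning rate peaks at 0.0005 exactly when warmup ends at step 10,000.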