# Patent Classification Model

## Model Description
multilabel_patent_classifier is a fine-tuned XLM-RoBERTa-large model trained on patent class information for 1855–1883, made available here.
It has been trained to recognize 146 patent classes defined by the British Patent Office. These are made available here.
We take the original xlm-roberta-large weights and fine-tune them on our custom dataset for 10 epochs with a learning rate of 2e-05 and a batch size of 64.
## Usage

This model can be used with the HuggingFace Transformers pipelines API for text classification:
```python
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

model_name = "matthewleechen/multilabel_patent_classifier"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

pipe = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,
    return_all_scores=True,
)
```
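For example, the pipeline can be applied to a single patent title and the predicted classes kept with the 0.5 threshold described under Training Procedure. The title below is purely an illustrative input, and the snippet assumes the model config is set up for multi-label classification so that the pipeline applies a sigmoid per class; if the returned scores instead sum to 1 (a softmax), apply the sigmoid to the raw logits yourself.

```python
# Hypothetical patent title, used only for illustration.
title = "Improvements in apparatus for spinning cotton"

# Passing a list keeps the output shape consistent: one list of
# {"label", "score"} dicts per input, covering all 146 classes.
scores = pipe([title])[0]

# Multi-label decision rule: keep every class whose score clears 0.5.
predicted = [s["label"] for s in scores if s["score"] >= 0.5]
print(predicted)
```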
## Training Data

Our training data consists of patent titles labelled with binary (0/1) tags for each patent class. The labels were assigned by the British Patent Office between 1855 and 1883, and the patent titles were extracted from the front pages of our specification texts using a patent title NER model.
## Training Procedure

We follow the standard multi-label classification setup with the HuggingFace Trainer API, but replace the default BCEWithLogitsLoss with a focal loss (α=1, γ=2) to address class imbalance; a minimal sketch of this substitution is given below. Both during evaluation and at inference, we apply a sigmoid to each logit and use a 0.5 threshold to determine the positive labels for each class.
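The loss substitution could look roughly like the following sketch, which subclasses Trainer and overrides compute_loss. The class name FocalLossTrainer and the mean reduction are our own assumptions for illustration, not the released training code.

```python
import torch
import torch.nn.functional as F
from transformers import Trainer

class FocalLossTrainer(Trainer):  # hypothetical name, for illustration only
    """Trainer that swaps the default BCEWithLogitsLoss for a focal loss."""

    def __init__(self, *args, alpha=1.0, gamma=2.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.alpha = alpha  # α = 1 in the card
        self.gamma = gamma  # γ = 2 in the card

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits

        # Element-wise binary cross-entropy over all classes.
        bce = F.binary_cross_entropy_with_logits(
            logits, labels.float(), reduction="none"
        )
        # Focal modulation: down-weight well-classified (easy) examples.
        p_t = torch.exp(-bce)  # probability the model assigns to the true label
        loss = (self.alpha * (1.0 - p_t) ** self.gamma * bce).mean()

        return (loss, outputs) if return_outputs else loss
```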
## Evaluation

We compute precision, recall, and F1 for each class (with a 0.5 sigmoid threshold), as well as exact match (the predicted classes are identical to the ground truth) and any match (the predicted classes overlap with the ground truth) percentages.
These scores are aggregated over the test set in the table below; a sketch of how they can be computed follows the table.
| Metric Type | Precision | Recall | F1 | Exact Match | Any Match |
|---|---|---|---|---|---|
| Micro Average | 83.4% | 60.3% | 70.0% | 52.9% | 90.8% |
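For reference, here is a rough sketch of how these aggregate metrics can be computed from raw logits and binary label matrices. The function name aggregate_metrics and the use of scikit-learn are our own choices for illustration, not the evaluation code used for the card.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

def aggregate_metrics(logits, labels, threshold=0.5):
    """Micro-averaged P/R/F1 plus exact-match and any-match rates.

    logits: (n_examples, n_classes) raw model outputs
    labels: (n_examples, n_classes) binary ground-truth matrix
    """
    probs = 1.0 / (1.0 + np.exp(-logits))      # sigmoid per class
    preds = (probs >= threshold).astype(int)   # 0.5 decision threshold

    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="micro", zero_division=0
    )
    exact_match = (preds == labels).all(axis=1).mean()      # identical label sets
    any_match = ((preds * labels).sum(axis=1) > 0).mean()   # at least one shared class
    return {
        "precision_micro": precision,
        "recall_micro": recall,
        "f1_micro": f1,
        "exact_match": exact_match,
        "any_match": any_match,
    }
```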
## References

```bibtex
@misc{hanlon2016,
  title  = {{British Patent Technology Classification Database: 1855–1882}},
  author = {Hanlon, Walker},
  year   = {2016},
  url    = {http://www.econ.ucla.edu/whanlon/}
}

@misc{lin2018focallossdenseobject,
  title         = {Focal Loss for Dense Object Detection},
  author        = {Tsung-Yi Lin and Priya Goyal and Ross Girshick and Kaiming He and Piotr Dollár},
  year          = {2018},
  eprint        = {1708.02002},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/1708.02002}
}
```
## Citation

If you use our model in your research, please cite our accompanying paper as follows:

```bibtex
@article{bct2025,
  title   = {300 Years of British Patents},
  author  = {Enrico Berkes and Matthew Lee Chen and Matteo Tranchero},
  journal = {arXiv preprint arXiv:2401.12345},
  year    = {2025},
  url     = {https://arxiv.org/abs/2401.12345}
}
```