---
license: mit
---
# Model
miniALBERT is a recursive transformer model that uses cross-layer parameter sharing, embedding factorisation, and bottleneck adapters to achieve high parameter efficiency.
Because miniALBERT is a compact model, it is trained with layer-to-layer distillation, using BioBERT-v1.1 as the teacher. The current checkpoint was trained for 100K steps on the PubMed Abstracts dataset.
In terms of architecture, this model uses an embedding dimension of 128, a hidden size of 768, an MLP expansion rate of 4, and a reduction factor of 16 for the bottleneck adapters. Overall, the model uses 6 recursions and has 11 million unique parameters.
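To make these numbers concrete, the following is a rough arithmetic sketch of what the listed hyperparameters imply. It is illustrative only: the vocabulary size is an assumption (the cased BERT/BioBERT WordPiece vocabulary, not stated in this README), biases are ignored, and the reduction factor is read in the usual way as hidden size divided by bottleneck width.

```python
# Illustrative arithmetic only, not taken from the miniALBERT code base.
VOCAB_SIZE = 28_996        # ASSUMPTION: cased BERT/BioBERT WordPiece vocabulary size
EMBEDDING_DIM = 128        # from the model card
HIDDEN_SIZE = 768          # from the model card
MLP_EXPANSION = 4          # from the model card
ADAPTER_REDUCTION = 16     # from the model card


def embedding_params(vocab_size, embedding_dim, hidden_size):
    """Token-embedding weight counts with and without factorisation (no biases)."""
    # Factorised: a small embedding table plus a projection up to the hidden size.
    factorised = vocab_size * embedding_dim + embedding_dim * hidden_size
    # Unfactorised: one table directly at the hidden size.
    unfactorised = vocab_size * hidden_size
    return factorised, unfactorised


factorised, unfactorised = embedding_params(VOCAB_SIZE, EMBEDDING_DIM, HIDDEN_SIZE)
mlp_dim = HIDDEN_SIZE * MLP_EXPANSION            # width of the feed-forward layer
adapter_dim = HIDDEN_SIZE // ADAPTER_REDUCTION   # assumed bottleneck width of each adapter

print(f"factorised embedding weights:   {factorised:,}")
print(f"unfactorised embedding weights: {unfactorised:,}")
print(f"MLP intermediate size:          {mlp_dim}")
print(f"adapter bottleneck size:        {adapter_dim}")
```

Under these assumptions, factorisation shrinks the token-embedding table by roughly a factor of six, which is one reason the unique parameter count stays small.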
# Usage
Because miniALBERT uses a custom architecture, it cannot currently be loaded with transformers.AutoModel. To load the model, first clone the miniALBERT GitHub project:
```bash
git clone https://github.com/nlpie-research/MiniALBERT.git
```
Then use `sys.path.append` to add the miniALBERT files to your project and import the miniALBERT modeling file:
```Python
import sys
sys.path.append("PATH_TO_CLONED_PROJECT/MiniALBERT/")
from minialbert_modeling import MiniAlbertForSequenceClassification, MiniAlbertForTokenClassification
```
Finally, load the model as you would any other model in the transformers library:
```Python
# For NER (token classification)
model = MiniAlbertForTokenClassification.from_pretrained("nlpie/bio-miniALBERT-128")
# For sequence classification
model = MiniAlbertForSequenceClassification.from_pretrained("nlpie/bio-miniALBERT-128")
```
In addition, for efficient fine-tuning with the pre-trained bottleneck adapters, call:
```Python
model.trainAdaptersOnly()
```
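To check what adapter-only training changes, it can help to count trainable parameters before and after the call. The helper below is a minimal sketch, not part of the miniALBERT repository: `count_parameters` is a hypothetical name, and it assumes the model behaves like a `torch.nn.Module` (exposing `parameters()`, with `numel()` and `requires_grad` on each parameter) and that `trainAdaptersOnly()` freezes the non-adapter weights by setting `requires_grad` to `False`.

```python
def count_parameters(model):
    """Return (trainable, total) parameter counts for a torch.nn.Module-like object.

    Hypothetical helper for illustration; it only relies on model.parameters(),
    p.numel() and p.requires_grad.
    """
    params = list(model.parameters())
    total = sum(p.numel() for p in params)
    trainable = sum(p.numel() for p in params if p.requires_grad)
    return trainable, total


# Example usage (assumes `model` was loaded as shown earlier in this README):
# before = count_parameters(model)
# model.trainAdaptersOnly()
# after = count_parameters(model)
# print(f"trainable before: {before[0]:,} of {before[1]:,}")
# print(f"trainable after:  {after[0]:,} of {after[1]:,}")
```

If the assumption about freezing holds, the trainable count after the call should be a small fraction of the total.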
# Citation
If you use the model, please cite our paper:
```
@inproceedings{nouriborji2023minialbert,
title={MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers},
author={Nouriborji, Mohammadmahdi and Rohanian, Omid and Kouchaki, Samaneh and Clifton, David A},
booktitle={Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics},
pages={1161--1173},
year={2023}
}
```