This is a tiny Longformer model designed for the Russian language. It was initialized from cointegrated/rubert-tiny2 weights and modified to support a context length of up to 16384 tokens. We fine-tuned it on a dataset of Russian books, news, wiki, and Habr; it still understands English, thanks to the source model. For details, see our post on Habr.
Model attributes:
- 12 attention heads
- 3 hidden layers
- context length of 16384 tokens
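Inputs longer than the context length have to be truncated or split before they reach the model. The sketch below shows one simple way to split a flat list of token ids into consecutive windows. The helper name `split_into_windows` and the `num_special` parameter are hypothetical, and reserving two positions per window assumes the tokenizer adds two special tokens (BERT-style start and end markers) to each chunk.

```python
MAX_CONTEXT = 16384  # context length stated in the model attributes


def split_into_windows(token_ids, max_len=MAX_CONTEXT, num_special=2):
    """Split a flat list of token ids into consecutive chunks that fit the context.

    num_special reserves room for special tokens added to every chunk
    (assumed to be 2 here; adjust if the tokenizer behaves differently).
    """
    body = max_len - num_special
    if body <= 0:
        raise ValueError("max_len must be larger than num_special")
    return [token_ids[i:i + body] for i in range(0, len(token_ids), body)]


# Toy input: 40000 dummy ids standing in for a tokenized long document.
ids = list(range(40000))
chunks = split_into_windows(ids)
print([len(c) for c in chunks])  # [16382, 16382, 7236]
```

Each chunk can then be embedded separately. How the per-chunk embeddings are combined (for example by averaging) is a design choice that depends on the downstream task.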
The model can be used as-is to produce text embeddings or it can be further fine-tuned for a specific downstream task.
Text embeddings can be produced as follows:
# pip install transformers sentencepiece
import torch
from transformers import LongformerModel, LongformerTokenizerFast

model = LongformerModel.from_pretrained('kazzand/ru-longformer-tiny-16384')
tokenizer = LongformerTokenizerFast.from_pretrained('kazzand/ru-longformer-tiny-16384')

def get_cls_embedding(text, model, tokenizer, device='cuda'):
    model.to(device)
    batch = tokenizer(text, return_tensors='pt')

    # Set global attention for the CLS token
    global_attention_mask = [
        [1 if token_id == tokenizer.cls_token_id else 0 for token_id in input_ids]
        for input_ids in batch["input_ids"]
    ]

    # Add the global attention mask to the batch
    batch["global_attention_mask"] = torch.tensor(global_attention_mask)

    with torch.no_grad():
        output = model(**batch.to(device))
    return output.last_hidden_state[:, 0, :]
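A common way to use such embeddings is to compare two texts by cosine similarity. The sketch below is a minimal, dependency-free illustration. The helper `cosine_similarity` is hypothetical, and the toy 4-dimensional vectors stand in for real embeddings and are not model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (illustrative values only, not produced by the model).
emb_a = [0.1, 0.3, -0.2, 0.5]
emb_b = [0.1, 0.3, -0.2, 0.5]      # same direction as emb_a
emb_c = [-0.1, -0.3, 0.2, -0.5]    # opposite direction to emb_a

print(cosine_similarity(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c))
```

With the function above, a real embedding could be converted to a plain list with `get_cls_embedding(text, model, tokenizer)[0].tolist()` before being passed to this helper.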
P.S. Thanks to AbstractDL for moral and technical support.