Uzbek Dependency Parser
This model predicts Universal Dependencies dependency relations for Uzbek text.
Model details
The model was fine-tuned on a Universal Dependencies treebank containing approximately 600 annotated sentences. It is based on the XLM-RoBERTa base model and adapted for token classification.
Usage
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Arofat/uzbek-dependency-parser")
model = AutoModelForTokenClassification.from_pretrained("Arofat/uzbek-dependency-parser")
# Prepare text
text = "Men O'zbekistonda yashayman."
tokens = text.split()
# Get predictions
inputs = tokenizer(tokens, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
outputs = model(**inputs)
# Process outputs
predictions = torch.argmax(outputs.logits, dim=2)
id2label = model.config.id2label
# Get dependency relations
dep_tags = []
word_ids = inputs.word_ids(batch_index=0)
prev_word_id = None
for idx, word_id in enumerate(word_ids):
if word_id is None or word_id == prev_word_id:
continue
dep_tags.append(id2label[predictions[0, idx].item()])
prev_word_id = word_id
# Print results
for token, tag in zip(tokens, dep_tags):
print(f"{token}: {tag}")
Limitations
This model was trained on a relatively small dataset and may not generalize well to all domains of Uzbek text. Note that this model only predicts dependency relations (labels) and not the dependency tree structure (heads). For a complete dependency parse, additional processing is needed.
- Downloads last month
- 2
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
HF Inference deployability: The model has no library tag.