metadata

license: apache-2.0
datasets:
  - risqaliyevds/uzbek_ner
language:
  - uz
metrics:
  - precision
  - f1
  - recall
  - accuracy
base_model:
  - FacebookAI/xlm-roberta-base
pipeline_tag: token-classification

NER Model for Uzbek Language (XLM-RoBERTa-based)

This is a Named Entity Recognition (NER) model for the Uzbek language, fine-tuned from FacebookAI/xlm-roberta-base. It tags tokens with entity categories such as location, person, and organization; the full list of categories is given under Model Details.

Model Details

Model Type: XLM-RoBERTa (Transformer-based)
Task: Named Entity Recognition (NER)
Training Data: Custom dataset of Uzbek text with labeled named entities (listed in the metadata as risqaliyevds/uzbek_ner).
Categories:
- B-LOC (Location)
- B-PERSON (Person)
- B-ORG (Organization)
- B-PRODUCT (Product)
- B-DATE (Date)
- B-TIME (Time)
- B-LANGUAGE (Language)
- B-GPE (Geopolitical entity)
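
The B- prefix on the tags above suggests a BIO-style tagging scheme, where B- marks the first token of an entity. As a minimal sketch, the helper below turns a raw tag into a readable category name. The names TAG_NAMES and readable_label are hypothetical, and the handling of I- and "O" tags is an assumption based on the usual BIO convention, not something this card states.

```python
# Hypothetical lookup table built from the category list in this card.
TAG_NAMES = {
    "LOC": "Location",
    "PERSON": "Person",
    "ORG": "Organization",
    "PRODUCT": "Product",
    "DATE": "Date",
    "TIME": "Time",
    "LANGUAGE": "Language",
    "GPE": "Geopolitical entity",
}


def readable_label(tag: str) -> str:
    """Strip a BIO prefix (B- or I-) and return a readable category name."""
    if tag == "O":  # assumption: "O" marks tokens outside any entity
        return "Outside"
    prefix, sep, name = tag.partition("-")
    if sep and prefix in ("B", "I"):
        # Unknown categories fall back to the bare tag name.
        return TAG_NAMES.get(name, name)
    return TAG_NAMES.get(tag, tag)


print(readable_label("B-LOC"))
```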

Metrics

Validation Accuracy: 0.9793
Validation Loss: 0.1141
Precision: 0.97
Recall: 0.97
F1-Score: 0.97
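
For readers checking how the three scores relate: F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values reported here. The function name f1_score is hypothetical, and this card does not say which averaging (micro, macro, or weighted) was used, so treat this as a consistency check rather than a reproduction of the evaluation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as reported in this card.
print(round(f1_score(0.97, 0.97), 2))
```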

Usage

You can load this model with the Hugging Face Transformers library to run NER on your own Uzbek text.

from transformers import pipeline

# Load the NER model
ner_pipeline = pipeline('ner', model='jamshidahmadov/roberta-ner-uz', tokenizer='jamshidahmadov/roberta-ner-uz')

# Example usage
text = "Shvetsiya bosh vaziri Stefan Lyoven Stokholmdagi Spendrups kompaniyasiga tashrif buyurdi."
entities = ner_pipeline(text)

for entity in entities:
    print(entity)
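
The pipeline returns a list of dictionaries, one per predicted token. As a rough sketch of post-processing, the helper below drops low-confidence predictions and groups the remaining words by label. The name group_by_label and the 0.5 threshold are hypothetical choices, and the sample list is hand-written in the pipeline's output shape; it is not actual output from this model.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.5):
    """Collect predicted words per label, skipping low-confidence predictions."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] < min_score:
            continue
        # Aggregated pipeline output uses "entity_group"; raw output uses "entity".
        label = ent.get("entity_group") or ent.get("entity")
        grouped[label].append(ent["word"])
    return dict(grouped)


# Hand-written sample for illustration only, not real model output.
sample = [
    {"entity": "B-GPE", "score": 0.99, "word": "Shvetsiya"},
    {"entity": "B-PERSON", "score": 0.98, "word": "Stefan"},
    {"entity": "B-ORG", "score": 0.30, "word": "Spendrups"},
]

print(group_by_label(sample))
```

With the real model, pass the list returned by ner_pipeline(text) instead of sample. Passing aggregation_strategy="simple" to pipeline(...) asks Transformers to merge sub-word pieces into whole entities, in which case each result carries its label under entity_group.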