metadata
license: apache-2.0
datasets:
  - risqaliyevds/uzbek_ner
language:
  - uz
metrics:
  - precision
  - f1
  - recall
  - accuracy
base_model:
  - FacebookAI/xlm-roberta-base
pipeline_tag: token-classification

NER Model for Uzbek Language (XLM-RoBERTa-based)

This is a Named Entity Recognition (NER) model for the Uzbek language, fine-tuned from XLM-RoBERTa base. It labels tokens with entity types such as location, person, organization, product, and date.

Model Details

  • Model Type: XLM-RoBERTa (Transformer-based)
  • Task: Named Entity Recognition (NER)
  • Training Data: Uzbek text labeled with named entities (the model metadata lists the risqaliyevds/uzbek_ner dataset).
  • Categories:
    • B-LOC (Location)
    • B-PERSON (Person)
    • B-ORG (Organization)
    • B-PRODUCT (Product)
    • B-DATE (Date)
    • B-TIME (Time)
    • B-LANGUAGE (Language)
    • B-GPE (Geopolitical entity)

Metrics

  • Validation accuracy: 0.9793
  • Validation loss: 0.1141
  • Precision: 0.97
  • Recall: 0.97
  • F1-Score: 0.97
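
For readers who want to see how precision, recall, and F1 relate to one another, here is a minimal sketch of the standard definitions. The function name `precision_recall_f1` and the counts passed to it are hypothetical and chosen only for illustration; they are not this model's evaluation data. The card also does not say whether the scores above are token-level or entity-level, so treat this as a general reminder rather than a reproduction of the evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f = precision_recall_f1(tp=6, fp=2, fn=4)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Because F1 is a harmonic mean, it equals precision and recall whenever those two are equal, which is consistent with the three matching 0.97 values reported above.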

Usage

You can use this model with the Hugging Face Transformers library to run NER on your own Uzbek text.

from transformers import pipeline

# Load the NER model
ner_pipeline = pipeline('ner', model='jamshidahmadov/roberta-ner-uz', tokenizer='jamshidahmadov/roberta-ner-uz')

# Example sentence
# (English: "Swedish Prime Minister Stefan Lyoven visited the Spendrups company in Stockholm.")
text = "Shvetsiya bosh vaziri Stefan Lyoven Stokholmdagi Spendrups kompaniyasiga tashrif buyurdi."
entities = ner_pipeline(text)

# Each prediction is a dict with keys such as 'entity', 'score', 'word', 'start', and 'end'
for entity in entities:
    print(entity)
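
The pipeline returns one prediction per subword token, so a single name can arrive split into several pieces. The sketch below shows one possible way to merge neighbouring predictions of the same type into whole spans using their character offsets. It rests on several assumptions: the helper `merge_tokens`, its `max_gap` parameter, and the sample predictions are all hypothetical and hand-written (they are not real model output), and the `start`/`end` offsets are assumed to be populated, which requires a fast tokenizer. Treating adjacent same-type words as one entity is a heuristic, not something the model card specifies.

```python
def merge_tokens(text, tokens, max_gap=1):
    """Merge adjacent token predictions of the same entity type into spans.

    `tokens` is a list of dicts with 'entity', 'start', and 'end' keys,
    in the shape produced by a token-classification pipeline.
    Tokens are merged when they share a type and are separated by at most
    `max_gap` characters (0 = only subword pieces, 1 = also across one space).
    """
    spans = []
    for tok in tokens:
        # Strip a leading "B-" or "I-" prefix to get the bare type, e.g. "LOC".
        label = tok["entity"].split("-", 1)[-1]
        last = spans[-1] if spans else None
        if last and last["label"] == label and tok["start"] - last["end"] <= max_gap:
            last["end"] = tok["end"]  # extend the current span
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written sample predictions (hypothetical, not actual model output).
sample_text = "Stefan Lyoven Stokholmdagi"
sample_tokens = [
    {"entity": "B-PERSON", "start": 0, "end": 3},
    {"entity": "B-PERSON", "start": 3, "end": 6},
    {"entity": "B-PERSON", "start": 7, "end": 13},
    {"entity": "B-LOC", "start": 14, "end": 22},
]

for span in merge_tokens(sample_text, sample_tokens):
    print(span["label"], span["text"])
```

With real output, you would pass the original `text` and the `entities` list from the pipeline call above. Setting `max_gap=0` keeps separate words apart and only rejoins subword pieces.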