---
license: apache-2.0
datasets:
- risqaliyevds/uzbek_ner
language:
- uz
metrics:
- precision
- f1
- recall
- accuracy
base_model:
- FacebookAI/xlm-roberta-base
pipeline_tag: token-classification
---
# NER Model for Uzbek Language (XLM-RoBERTa-based)
This is a Named Entity Recognition (NER) model for the Uzbek language based on the XLM-RoBERTa architecture. It is fine-tuned to label entities in categories such as location, person, organization, and product.
## Model Details
- **Model Type**: XLM-RoBERTa (Transformer-based)
- **Task**: Named Entity Recognition (NER)
- **Training Data**: The [risqaliyevds/uzbek_ner](https://huggingface.co/datasets/risqaliyevds/uzbek_ner) dataset of Uzbek sentences annotated with named entities.
- **Categories** (the full label map can be read from the checkpoint config, as shown in the sketch after this list):
  - `B-LOC` (Location)
  - `B-PERSON` (Person)
  - `B-ORG` (Organization)
  - `B-PRODUCT` (Product)
  - `B-DATE` (Date)
  - `B-TIME` (Time)
  - `B-LANGUAGE` (Language)
  - `B-GPE` (Geopolitical entity)
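
If you need the exact label inventory shipped with the checkpoint (including the `O` tag and any `I-` continuation tags), you can read it from the model config. This is a minimal sketch; the printed mapping comes from the checkpoint itself and may differ slightly from the list above.

```python
from transformers import AutoConfig

# Download the config and print the id-to-label mapping used by the classifier head
config = AutoConfig.from_pretrained("jamshidahmadov/roberta-ner-uz")
print(config.id2label)
```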
## Metrics
- **Validation accuracy**: 0.9793
- **Validation loss**: 0.1141
- **Precision**: 0.97
- **Recall**: 0.97
- **F1-score**: 0.97
## Usage
You can use this model with the Hugging Face Transformers library to run NER on your own Uzbek-language text.
```python
from transformers import pipeline

# Load the NER model and tokenizer from the Hub
ner_pipeline = pipeline(
    "ner",
    model="jamshidahmadov/roberta-ner-uz",
    tokenizer="jamshidahmadov/roberta-ner-uz",
)

# Example usage
text = "Shvetsiya bosh vaziri Stefan Lyoven Stokholmdagi Spendrups kompaniyasiga tashrif buyurdi."
entities = ner_pipeline(text)
for entity in entities:
    print(entity)
```
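
The pipeline above returns one prediction per subword token. If you prefer word-level entity spans, the pipeline can merge the pieces for you via `aggregation_strategy`; a minimal sketch, assuming the checkpoint's labels follow the `B-`/`I-` scheme the aggregator expects:

```python
from transformers import pipeline

# Merge subword tokens into whole entity spans with character offsets
ner_pipeline = pipeline(
    "ner",
    model="jamshidahmadov/roberta-ner-uz",
    aggregation_strategy="simple",
)
print(ner_pipeline("Shvetsiya bosh vaziri Stefan Lyoven Stokholmdagi Spendrups kompaniyasiga tashrif buyurdi."))
```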