---
license: mit
datasets:
- agentlans/tatoeba-english-translations
base_model:
- microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
- multilingual
- quality-assessment
---
# DeBERTa V3 Base for Multilingual Quality Assessment
This is a fine-tuned version of the multilingual DeBERTa model (mdeberta) for assessing text quality across languages.
## Model Details
- **Architecture:** mdeberta-v3-base-quality
- **Task:** Regression (Quality Assessment)
- **Training Data:** [agentlans/tatoeba-english-translations](https://huggingface.co/datasets/agentlans/tatoeba-english-translations/) dataset containing 39 100 English translations
- **Input:** Text in any language supported by mDeBERTa
- **Output:** Estimated quality score for the text
  - Higher values indicate better text
## Performance
Root mean squared error (RMSE) on a 20% held-out validation set: 0.5036
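
For reference, RMSE is the square root of the mean squared difference between predicted and reference scores, so lower is better. The sketch below shows the calculation only. The names `rmse`, `predicted`, and `reference` are hypothetical, and this is not the evaluation script used for the figure above.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Root mean squared error between two equal-length score sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each per-item error, average, then take the square root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```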
## Training Data
The model was trained on [agentlans/tatoeba-english-translations](https://huggingface.co/datasets/agentlans/tatoeba-english-translations).
## Usage
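
A minimal sketch of scoring text with the `transformers` library. It rests on assumptions not stated in this card: the repository id `agentlans/mdeberta-v3-base-quality` is inferred from the dataset owner and the model name above, and the checkpoint is assumed to load with `AutoModelForSequenceClassification` and expose a single regression output. The helper names (`load_scorer`, `rank_by_quality`, `demo`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: this repository id is inferred, not confirmed by the card.
MODEL_NAME = "agentlans/mdeberta-v3-base-quality"

Scorer = Callable[[Sequence[str]], List[float]]


def load_scorer(model_name: str = MODEL_NAME) -> Scorer:
    """Load the model and return a function mapping texts to quality scores."""
    # Imported here so the heavy dependencies are only needed when scoring.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    def score(texts: Sequence[str]) -> List[float]:
        inputs = tokenizer(
            list(texts), return_tensors="pt", padding=True, truncation=True
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumption: one regression output per text, shape (batch, 1).
        return logits.squeeze(-1).tolist()

    return score


def rank_by_quality(texts: Sequence[str], scorer: Scorer) -> List[Tuple[str, float]]:
    """Pair each text with its score and sort from highest to lowest."""
    scores = scorer(texts)
    return sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)


def demo() -> None:
    """Download the model (on first call) and print ranked example sentences."""
    scorer = load_scorer()
    examples = ["The weather is lovely today.", "weather good is today very"]
    for text, value in rank_by_quality(examples, scorer):
        print(f"{value:.3f}  {text}")
```

Call `demo()` to run the example. Higher scores indicate better text, as described under Model Details.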
## Limitations
- Performance may vary for texts significantly different from the training data
- Output is based on statistical patterns and may not always align with human judgment
- Quality is assessed purely on textual features, not considering factors like subject familiarity or cultural context
## Ethical Considerations
- Should not be used as the sole determinant of text suitability for specific audiences
- Results may reflect biases present in the training data sources
- Care should be taken when using this model in educational or publishing contexts