---
license: mit
datasets:
- agentlans/tatoeba-english-translations
base_model:
- microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
- multilingual
- quality-assessment
---

# DeBERTa V3 Base for Multilingual Quality Assessment

This is a fine-tuned version of the multilingual DeBERTa model (mDeBERTa) for assessing text quality across languages.

## Model Details

- **Architecture:** mdeberta-v3-base-quality
- **Task:** Regression (quality assessment)
- **Training Data:** [agentlans/tatoeba-english-translations](https://huggingface.co/datasets/agentlans/tatoeba-english-translations/), a dataset of 39,100 English translations
- **Input:** Text in any language supported by mDeBERTa
- **Output:** Estimated quality score; higher values indicate better text

## Performance

Root mean squared error (RMSE) on a 20% held-out validation set: 0.5036

## Training Data

The model was trained on [agentlans/tatoeba-english-translations](https://huggingface.co/datasets/agentlans/tatoeba-english-translations).

## Limitations

- Performance may vary for texts that differ significantly from the training data
- Output is based on statistical patterns and may not always align with human judgment
- Quality is assessed purely from textual features, without considering factors such as subject familiarity or cultural context

## Ethical Considerations

- Should not be used as the sole determinant of a text's suitability for specific audiences
- Results may reflect biases present in the training data sources
- Care should be taken when using this model in educational or publishing contexts

## Usage
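A minimal sketch of loading the model for regression scoring with the Transformers library. The repository id below is an assumption based on this card's model name; substitute the actual id if it differs.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed repository id; replace with the actual Hugging Face model id.
model_name = "agentlans/mdeberta-v3-base-quality"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def quality_score(text: str) -> float:
    """Return the estimated quality score for a text (higher = better)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Single regression output: squeeze the 1-dimensional logit to a float.
    return logits.squeeze().item()

print(quality_score("The quick brown fox jumps over the lazy dog."))
print(quality_score("fox brown quick the dog lazy over jump"))
```

Scores are relative rather than absolute, so they are most useful for ranking or comparing candidate texts against each other.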