## Model Overview

This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) designed for text quality assessment. It achieves the following results on the evaluation set:

- Loss: 0.3165
- MSE: 0.3165

The model was trained on the Text Quality Meta-Analysis Dataset.

In this context, "quality" refers to legible English sentences that are not spam and contain useful information. It does not necessarily indicate grammatical or factual correctness.
### Quality Score Derivation

The composite quality score was derived through the following steps (a rough code sketch follows the list):
1. Principal Component Analysis (PCA) was performed on the normalized "fineweb" and "nvidia" scores.
2. The first principal component was extracted as an initial measure of quality.
3. This quality measure was then adjusted for sentence length using robust linear regression (the `rlm` function from the MASS package).
4. The adjusted quality scores were scaled to z-scores to produce the final quality metric.
5. The scores were then quantile normalized to a normal distribution.
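
The derivation itself was done in R (step 3 uses the `rlm` function from the MASS package); as a rough illustration only, the sketch below approximates the same pipeline in Python with scikit-learn, statsmodels, and SciPy. The column names (`fineweb`, `nvidia`, `length`) and the helper name `derive_quality` are assumptions made for this example, not part of the released code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm, rankdata
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def derive_quality(df: pd.DataFrame) -> np.ndarray:
    """Approximate the composite quality-score derivation described above."""
    # Steps 1-2: PCA on the normalized "fineweb" and "nvidia" scores;
    # keep the first principal component as the initial quality measure.
    normalized = StandardScaler().fit_transform(df[["fineweb", "nvidia"]])
    pc1 = PCA(n_components=1).fit_transform(normalized).ravel()
    # The sign of a principal component is arbitrary; flip pc1 if higher
    # values should correspond to higher quality.

    # Step 3: adjust for sentence length with robust linear regression
    # (statsmodels' RLM stands in for MASS::rlm); keep the residuals.
    X = sm.add_constant(df["length"].to_numpy(dtype=float))
    adjusted = pc1 - sm.RLM(pc1, X).fit().predict(X)

    # Step 4: rescale the length-adjusted scores to z-scores.
    z = (adjusted - adjusted.mean()) / adjusted.std()

    # Step 5: quantile-normalize to a standard normal distribution.
    ranks = rankdata(z) / (len(z) + 1)
    return norm.ppf(ranks)
```

Because the final scores are quantile-normalized to a normal distribution, a score near 0 corresponds to roughly median quality, and positive scores indicate above-median quality.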
## Model Description
The model is based on the DeBERTa-v3-xsmall architecture and has been fine-tuned for sequence classification tasks, specifically for assessing the quality of text inputs.
### Usage Example
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer