|
--- |
|
library_name: transformers |
|
license: mit |
|
base_model: agentlans/deberta-v3-xsmall-zyda-2 |
|
tags: |
|
- generated_from_trainer |
|
model-index: |
|
- name: deberta-v3-xsmall-zyda-2-transformed-quality-new |
|
results: [] |
|
--- |
|
|
|
# DeBERTa-v3-xsmall-Zyda-2-quality |
|
|
|
## Model Overview |
|
|
|
This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) designed for text quality assessment. It achieves the following results on the evaluation set: |
|
|
|
- Loss: 0.3165 |
|
- MSE: 0.3165 |
|
|
|
## Dataset Information |
|
|
|
The model was trained on the [Text Quality Meta-Analysis Dataset](https://huggingface.co/datasets/agentlans/text-quality-v2), a collection of sentences annotated with quality scores aggregated from multiple source corpora and scoring models.
|
|
|
In this context, "quality" refers to legible English sentences that are not spam and contain useful information. It does not necessarily indicate grammatical or factual correctness. |
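
To inspect the training data directly, the dataset can be loaded with the `datasets` library. This is a minimal sketch: the split and column names are assumptions and should be checked against the dataset card.

```python
from datasets import load_dataset

# Load the Text Quality Meta-Analysis Dataset from the Hugging Face Hub
ds = load_dataset("agentlans/text-quality-v2")
print(ds)              # available splits and columns
print(ds["train"][0])  # inspect one record (assumes a "train" split)
```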
|
|
|
## Model Description |
|
|
|
The model is based on the DeBERTa-v3-xsmall architecture with a single-output sequence classification head, fine-tuned as a regression model that assigns a continuous quality score to a text input.
|
|
|
## Intended Uses & Limitations |
|
|
|
This model is intended for scoring the quality of English text. It can be used for applications such as content moderation, spam detection, or filtering low-quality text from larger corpora. As noted in the dataset section, a high score does not imply grammatical or factual correctness.
|
|
|
### Usage Example |
|
|
|
```python |
|
import torch |
|
from transformers import AutoModelForSequenceClassification, AutoTokenizer |
|
|
|
# Load model and tokenizer |
|
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
|
|
|
model_name = "agentlans/deberta-v3-xsmall-zyda-2-quality" |
|
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1).to(device) |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
|
# Score a single text; higher scores indicate higher quality
|
def predict_score(text): |
|
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True).to(device) |
|
with torch.no_grad(): |
|
logits = model(**inputs).logits |
|
return logits.item() |
|
|
|
# Example usage |
|
input_text = "This product is excellent and works perfectly!" |
|
predicted_score = predict_score(input_text) |
|
print(f"Predicted score: {predicted_score}") |
|
``` |
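
For bulk use cases such as content moderation or corpus filtering, a batched variant is usually more practical. The sketch below reuses `model`, `tokenizer`, and `device` from the example above; the `0.0` threshold is illustrative, not a calibrated cutoff.

```python
def predict_scores(texts, batch_size=32):
    """Score a list of texts in batches; higher scores indicate higher quality."""
    scores = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i : i + batch_size]
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True).to(device)
        with torch.no_grad():
            logits = model(**inputs).logits  # shape: (len(batch), 1)
        scores.extend(logits.squeeze(-1).tolist())
    return scores

# Keep only texts scoring above an (illustrative) threshold
texts = [
    "Click here to claim your prize!",
    "Honey found in ancient Egyptian tombs is still edible.",
]
kept = [t for t, s in zip(texts, predict_scores(texts)) if s > 0.0]
print(kept)
```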
|
|
|
### Sample Predictions |
|
|
|
| Text | Quality Score | |
|
|------|---------------| |
|
| Discover the secret to eternal youth with our revolutionary skincare product! | -1.74 | |
|
| Act now! Limited time offer on miracle weight loss pills! | -1.50 | |
|
| Congratulations! You've won a $1,000 gift card! Click here to claim your prize! | -0.86 | |
|
| Get rich quick with our foolproof investment strategy - no experience needed! | -0.77 | |
|
| Your computer is infected! Click here for a free scan and fix your issues now! | -0.29 | |
|
| Unlock the secrets of the universe with our exclusive online astronomy course! | 0.14 | |
|
| Earn money from home by participating in online surveys - sign up today! | 0.23 | |
|
| The Eiffel Tower can be 15 cm taller during the summer due to thermal expansion. | 0.75 | |
|
| Did you know? The average person spends 6 years of their life dreaming. | 1.60 | |
|
| Did you know that honey never spoils? Archaeologists have found pots of honey in ancient Egyptian tombs that are over 3,000 years old and still edible. | 2.27 | |
|
|
|
## Training Procedure |
|
|
|
### Training Hyperparameters |
|
|
|
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
|
- Learning rate: 5e-05 |
|
- Train batch size: 64 |
|
- Eval batch size: 8 |
|
- Seed: 42 |
|
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08 |
|
- Learning rate scheduler: Linear |
|
- Number of epochs: 3.0 |
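
For reference, a minimal sketch of the equivalent Hugging Face `Trainer` configuration is shown below. It assumes the standard `Trainer` API; `train_ds` and `eval_ds` are placeholders for the preprocessed dataset splits, and `model` is the sequence classification model loaded as in the usage example.

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v3-xsmall-zyda-2-quality",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",  # linear decay schedule
    num_train_epochs=3.0,
    eval_strategy="epoch",
)

# AdamW with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default.
# With num_labels=1, Trainer applies an MSE (regression) loss.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,  # placeholder: preprocessed training split
    eval_dataset=eval_ds,    # placeholder: preprocessed evaluation split
)
trainer.train()
```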
|
|
|
### Training Results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | MSE | |
|
|:-------------:|:-----:|:-----:|:---------------:|:------:| |
|
| 0.3506 | 1.0 | 12649 | 0.3512 | 0.3512 | |
|
| 0.2800 | 2.0 | 25298 | 0.3187 | 0.3187 | |
|
| 0.2398 | 3.0 | 37947 | 0.3165 | 0.3165 | |
|
|
|
### Framework Versions |
|
|
|
- Transformers: 4.46.3 |
|
- PyTorch: 2.5.1+cu124 |
|
- Datasets: 3.1.0 |
|
- Tokenizers: 0.20.3 |
|
|