DistilBERT Incoherence Classifier (Multilingual)
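This is a fine-tuned DistilBERT-multilingual model that classifies text by coherence. It labels input as coherent or as one of six types of incoherence: grammatical errors, random bytes, random tokens, random words, run-on text, and word soup.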
Model Details
- Model: DistilBERT (distilbert-base-multilingual-cased)
- Task: Text Classification (Coherence Detection)
- Fine-tuning: The model was fine-tuned on a synthetically generated dataset that features various types of incoherence.
Training Metrics
Epoch | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|
1 | 0.343600 | 0.303963 | 0.880312 | 0.882746 | 0.880312 | 0.879637 |
2 | 0.245200 | 0.286482 | 0.900850 | 0.901156 | 0.900850 | 0.899612 |
3 | 0.149700 | 0.313061 | 0.906161 | 0.906049 | 0.906161 | 0.905103 |
Evaluation Metrics
The following metrics were measured on the test set:
Metric | Value |
---|---|
Loss | 0.316272 |
Accuracy | 0.903329 |
Precision | 0.903704 |
Recall | 0.903329 |
F1-Score | 0.902359 |
Classification Report:
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
coherent | 0.86 | 0.93 | 0.90 | 2051 |
grammatical_errors | 0.88 | 0.76 | 0.81 | 599 |
random_bytes | 1.00 | 1.00 | 1.00 | 599 |
random_tokens | 1.00 | 1.00 | 1.00 | 600 |
random_words | 0.95 | 0.93 | 0.94 | 600 |
run_on | 0.85 | 0.79 | 0.82 | 600 |
word_soup | 0.89 | 0.83 | 0.86 | 599 |
accuracy | | | 0.90 | 5648 |
macro avg | 0.92 | 0.89 | 0.90 | 5648 |
weighted avg | 0.90 | 0.90 | 0.90 | 5648 |
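The macro and weighted averages in the report can be reproduced from the per-class rows. The sketch below is only an illustration of how those two averages differ: it copies the rounded per-class values from the report, so results agree with the reported averages to two decimal places only. The names `REPORT`, `macro_average`, and `weighted_average` are made up for this example and are not part of the model's code.

```python
# Per-class rows copied from the classification report:
# (label, precision, recall, f1, support)
REPORT = [
    ("coherent", 0.86, 0.93, 0.90, 2051),
    ("grammatical_errors", 0.88, 0.76, 0.81, 599),
    ("random_bytes", 1.00, 1.00, 1.00, 599),
    ("random_tokens", 1.00, 1.00, 1.00, 600),
    ("random_words", 0.95, 0.93, 0.94, 600),
    ("run_on", 0.85, 0.79, 0.82, 600),
    ("word_soup", 0.89, 0.83, 0.86, 599),
]


def macro_average(rows, column):
    """Unweighted mean of one metric column: every class counts equally."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[4] for row in rows)
    return sum(row[column] * row[4] for row in rows) / total_support


for name, column in (("precision", 1), ("recall", 2), ("f1-score", 3)):
    macro = macro_average(REPORT, column)
    weighted = weighted_average(REPORT, column)
    print(f"{name}: macro={macro:.2f}, weighted={weighted:.2f}")
```

Because the coherent class has roughly three times the support of any other class, it pulls the weighted averages toward its own scores, while the macro averages treat all seven classes equally.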
Confusion Matrix
The confusion matrix (image not reproduced in this text) shows the performance of the model on each class.
Usage
This model can be used for text classification tasks, specifically for detecting and categorizing different types of text incoherence. You can use the inference_example function provided in the notebook to test your own text.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "SuccubusBot/distilbert-multilingual-incoherence-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

while True:
    text = input("Enter text (or type 'exit' to quit): ")
    if text.lower() == "exit":
        break
    # top_k=None asks the pipeline for a score for every label,
    # not only the top prediction
    results = classifier(text, top_k=None)
    # Print each label with its confidence score
    for result in results:
        print(f"Label: {result['label']}, Confidence: {result['score']:.4f}")
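If you only need a yes/no decision rather than the full list of scores, the per-label output can be reduced with a small helper. The following is a minimal sketch, not part of the model or the notebook: `summarize_prediction`, its `threshold` parameter, and the example scores are all hypothetical, and it assumes the model's labels use the same names as the classification report (for example `coherent`). Check the labels your pipeline actually returns before relying on it.

```python
def summarize_prediction(scores, coherent_label="coherent", threshold=0.5):
    """Reduce per-label scores for one text to a single decision.

    scores: list of {"label": str, "score": float} dicts, the shape the
    text-classification pipeline returns when all labels are requested.
    """
    best = max(scores, key=lambda item: item["score"])
    # Assumption: the label name for coherent text is "coherent".
    coherent_score = next(
        (item["score"] for item in scores if item["label"] == coherent_label),
        0.0,
    )
    return {
        "top_label": best["label"],
        "top_score": best["score"],
        "is_incoherent": coherent_score < threshold,
    }


# Made-up scores for illustration only; these are not real model output.
example_scores = [
    {"label": "run_on", "score": 0.71},
    {"label": "coherent", "score": 0.22},
    {"label": "grammatical_errors", "score": 0.07},
]
summary = summarize_prediction(example_scores)
print(summary["top_label"], summary["is_incoherent"])
```

The 0.5 threshold is an arbitrary starting point; a stricter or looser value may suit your data better.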
Limitations
The model was trained on a synthetically generated dataset, so its performance on real-world text may differ from the metrics reported here. Treat real-world results with care; more data may need to be collected before the model can be evaluated reliably in a real-world setting.
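One way to act on this is to hand-label a small sample of your own text and measure per-class recall on it. The sketch below shows the idea under stated assumptions: `per_class_recall`, `labeled`, and `fake_predict` are hypothetical names, the three examples are invented, and `fake_predict` is a stand-in that you would replace with a function that calls the classifier and returns its top label.

```python
from collections import Counter


def per_class_recall(examples, predict):
    """Compute recall per true label.

    examples: iterable of (text, true_label) pairs.
    predict: callable mapping a text to a predicted label.
    """
    totals, hits = Counter(), Counter()
    for text, true_label in examples:
        totals[true_label] += 1
        if predict(text) == true_label:
            hits[true_label] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Invented examples for illustration; use your own hand-labeled text.
labeled = [
    ("The meeting starts at noon.", "coherent"),
    ("Blue quickly the of under.", "word_soup"),
    ("We left early to avoid traffic.", "coherent"),
]


def fake_predict(text):
    # Stand-in predictor that always answers "coherent";
    # swap in a call to the real classifier here.
    return "coherent"


recall = per_class_recall(labeled, fake_predict)
print(recall)
```

A large gap between these numbers and the test-set recall in the classification report would suggest that the synthetic training data does not match your text.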