---
language: ru
tags:
- spam-detection
- text-classification
- russian
license: mit
datasets:
- RUSpam/spam_dataset_v4
metrics:
- F1
model-index:
- name: spam_deberta_v4
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: RUSpam/russian_spam_dataset
      type: RUSpam/russian_spam_dataset
    metrics:
    - name: F1
      type: F1
      value: 0.9897
---
# RUSpam/spam_deberta_v4
## Description
This is a spam detection model based on the DeBERTa architecture, fine-tuned on Russian-language spam data. It classifies a text as spam or not spam.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_path = "RUSpam/spam_deberta_v4"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)


def predict(text):
    # Tokenize the input, truncating to at most 256 tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    # Inference only, so skip gradient tracking
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    # Class index 1 is treated as spam
    predicted_class = torch.argmax(logits, dim=1).item()
    return "Spam" if predicted_class == 1 else "Not spam"


text = "Your text to check goes here"
result = predict(text)
print(f"Result: {result}")
```
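
The `predict` function above returns only the argmax label, which hides how confident the model is. The following is a minimal sketch of turning the two logits into a spam probability with a softmax and applying a decision threshold. It assumes, as the snippet above does, that class index 1 means spam; the helper names `spam_probability` and `classify` and the default threshold of 0.5 are illustrative choices, not part of the model's API.

```python
import math


def spam_probability(logits):
    """Return P(spam) from [not_spam_logit, spam_logit] via a stable softmax."""
    # Subtract the max logit before exponentiating to avoid overflow
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    # Index 1 is assumed to be the spam class
    return exps[1] / sum(exps)


def classify(logits, threshold=0.5):
    """Return (label, probability); the threshold is a hypothetical default."""
    p = spam_probability(logits)
    label = "Spam" if p >= threshold else "Not spam"
    return label, p
```

Inside `predict`, the logits for a single input could be passed as `outputs.logits[0].tolist()`. Raising the threshold trades missed spam for fewer false positives, which may suit settings where blocking a legitimate message is costly.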
## Citation
```
@MISC{RUSpam/spam_deberta_v4,
    author = {Denis Petrov, Kirill Fedko (Neurospacex), Sergey Yalovegin},
    title = {Russian Spam Classification Model},
    url = {https://huggingface.co/RUSpam/spam_deberta_v4/},
    year = 2024
}
```