---
base_model:
- allenai/longformer-base-4096
datasets:
- wangkevin02/LMSYS-USP
language:
- en
license: mit
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
---
# AI Detect Model
## Model Description
> Explore the source code and additional resources in our **GitHub repository**: https://github.com/wangkevin02/USP
The **AI Detect Model** is a binary classification model designed to determine whether a given text is AI-generated (label=1) or written by a human (label=0). This model plays a crucial role in providing AI detection rewards, helping to prevent reward hacking during Reinforcement Learning with Cycle Consistency (RLCC). For more details, please refer to [our paper](https://arxiv.org/pdf/2502.18968).
This model is built upon the [Longformer](https://huggingface.co/allenai/longformer-base-4096) architecture and trained using our proprietary [LMSYS-USP](https://huggingface.co/datasets/wangkevin02/LMSYS-USP) dataset. Specifically, in a dialogue context, texts generated by the assistant are labeled as AI-generated (label=1), while user-generated texts are assigned the opposite label (label=0).
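For illustration, here is a minimal sketch of that labeling convention. The dialogue structure below is an assumed example, not the exact LMSYS-USP schema:

```python
# Hypothetical dialogue: assistant turns get label=1 (AI-generated),
# user turns get label=0 (human-written).
dialogue = [
    {"role": "user", "content": "I am thinking about going away for vacation"},
    {"role": "assistant", "content": "How can I help you today?"},
]

examples = [
    {"text": turn["content"], "label": 1 if turn["role"] == "assistant" else 0}
    for turn in dialogue
]
print(examples)
# [{'text': 'I am thinking about going away for vacation', 'label': 0},
#  {'text': 'How can I help you today?', 'label': 1}]
```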
> *Note*: Our model is subject to the following constraints:
>
> 1. **Maximum Context Length**: Supports up to **4,096 tokens**. Exceeding this may degrade performance; keep inputs within this limit for best results (a length-check sketch follows this note).
> 2. **Language Limitation**: Optimized for English. Non-English performance may vary due to limited training data.
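If you are unsure whether an input fits the context window, a quick length check with the model's tokenizer can help. This is a minimal sketch, assuming the base Longformer tokenizer; the `exceeds_context` helper is illustrative and not part of the released code:

```python
from transformers import LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")

def exceeds_context(text, max_length=4096):
    # Tokenize without truncation to obtain the true token count.
    n_tokens = len(tokenizer(text, truncation=False)["input_ids"])
    return n_tokens > max_length

print(exceeds_context("How can I help you today?"))  # False
```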
## Quick Start
You can use our AI detection model as shown below:
```python
from transformers import LongformerTokenizer, LongformerForSequenceClassification
import torch
import torch.nn.functional as F


class AIDetector:
    def __init__(self, model_name="allenai/longformer-base-4096", max_length=4096):
        """
        Initialize the AIDetector with a pretrained Longformer model and tokenizer.

        Args:
            model_name (str): The name or path of the pretrained Longformer model.
            max_length (int): The maximum sequence length for tokenization.
        """
        self.tokenizer = LongformerTokenizer.from_pretrained(model_name)
        self.model = LongformerForSequenceClassification.from_pretrained(model_name)
        self.model.eval()
        self.max_length = max_length
        self.tokenizer.padding_side = "right"

    @torch.no_grad()
    def get_probability(self, texts):
        """Return per-class probabilities ([human, AI]) for a batch of texts."""
        inputs = self.tokenizer(texts, padding=True, truncation=True,
                                max_length=self.max_length, return_tensors='pt')
        outputs = self.model(**inputs)
        probabilities = F.softmax(outputs.logits, dim=1)
        return probabilities


# Example usage
if __name__ == "__main__":
    classifier = AIDetector(model_name="/path/to/ai_detector")
    target_text = [
        "I am thinking about going away for vacation",
        "How can I help you today?"
    ]
    result = classifier.get_probability(target_text)
    print(result)
    # >>> Expected Output:
    # >>> tensor([[0.9954, 0.0046],
    # >>>         [0.0265, 0.9735]])
```
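Since column 0 holds the probability of the human class (label=0) and column 1 the AI class (label=1), a thin wrapper can map probabilities to labels. This is a sketch; the 0.5 threshold is an illustrative default, not a value prescribed by the paper:

```python
def classify(classifier, texts, threshold=0.5):
    # Each row of probs is [P(human), P(AI)]; flag a text as AI when P(AI) > threshold.
    probs = classifier.get_probability(texts)
    return [
        {"text": text, "p_ai": p_ai.item(), "is_ai": p_ai.item() > threshold}
        for text, p_ai in zip(texts, probs[:, 1])
    ]
```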
## Citation
If you find this model useful, please cite:
```bibtex
@misc{wang2025knowbettermodelinghumanlike,
title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles},
author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
year={2025},
eprint={2502.18968},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.18968},
}
```