_<sup>β </sup> Zhejiang University,
Presenting a comprehensive benchmark dataset designed to assess the proficiency of deepfake detectors amidst real-world scenarios.
</div>

## Table of Contents

- [Introduction](#-introduction)
- [Activities](#-activities)
- [Dataset Description](#-dataset-description)
- [Try Detection](#computer--try-detection)
- [How to Get the Data](#-how-to-get-the-data)
- [Citation](#-citation)
<!-- - [Contributing](#-contributing) -->

## Introduction
Recent advances in large language models have enabled them to reach a level of text generation comparable to that of humans.
To this end, we build **a wild testbed for deepfake text detection**, by gathering human-written texts and deepfake texts generated by a range of LLMs.

This repository contains the data to test the deepfake detection methods described in our paper, [Deepfake Text Detection in the Wild](https://arxiv.org/abs/2305.13242).
Welcome to test your detection methods on our testbed!

## Activities

- **May 25, 2023**: Initial dataset release, including texts from 10 domains and 27 LLMs, contributing to 6 testbeds with increasing detection difficulty.
- **June 19, 2023**: Two 'wilder' testbeds added! We go one step wilder by constructing an additional test set with texts from unseen domains generated by an unseen model, to test detection ability in more practical scenarios.
  We consider four new datasets, CNN/DailyMail, DialogSum, PubMedQA and IMDb, to test the detection of deepfake news, deepfake dialogues, deepfake scientific answers and deepfake movie reviews.
  We sample 200 instances from each dataset and use a newly developed LLM, i.e., GPT-4, with specially designed prompts to create deepfake texts, establishing an "Unseen Domains & Unseen Model" scenario.
  Previous work demonstrates that detection methods are vulnerable to being deceived by paraphrased target texts.
  Therefore, we also paraphrase each sentence individually for both human-written and machine-generated texts, forming an even more challenging testbed.
  We adopt gpt-3.5-turbo as the zero-shot paraphraser and consider all paraphrased texts as machine-generated.
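As a rough illustration, the sentence-level paraphrasing step above might look like the following sketch. The sentence splitter, prompt wording, and function names here are our own assumptions, not the authors' exact setup:

```python
import re


def split_sentences(text: str) -> list[str]:
    # Naive split on sentence-ending punctuation; a real setup would
    # use a proper sentence tokenizer (e.g. NLTK's punkt).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def paraphrase_text(text: str, client) -> str:
    """Paraphrase each sentence individually and rejoin the results.

    `client` is assumed to be an OpenAI client (e.g. openai.OpenAI(),
    which requires an API key); the prompt is a hypothetical zero-shot
    instruction, not the one used in the paper.
    """
    out = []
    for sent in split_sentences(text):
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": f"Paraphrase this sentence: {sent}"}],
        )
        out.append(resp.choices[0].message.content.strip())
    return " ".join(out)
```

Under this setup, every output of `paraphrase_text` would be labeled machine-generated, matching the convention described above.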

## Dataset Description

## :computer: Try Detection
|
91 |
|
92 |
### Model Access
|
93 |
Our Longformer detector, which has been trained on the entire dataset, is now accessible through [Huggingface](https://huggingface.co/nealcly/detection-longformer). Additionally, you can try detection directly using our [online demo](https://huggingface.co/spaces/yaful/DeepfakeTextDetect).
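For example, the hosted checkpoint can be queried with the generic `transformers` text-classification pipeline. This is a minimal sketch under the assumption that the checkpoint is a standard sequence-classification model; which label corresponds to machine-generated text is an assumption, so check the model card before relying on it:

```python
# Minimal sketch: run the hosted detector on one passage.
# Assumption: the checkpoint works with the generic text-classification
# pipeline, and the label-to-class mapping is documented on the model card.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="nealcly/detection-longformer",
)

text = "Recent advances in large language models have enabled them to ..."
result = detector(text, truncation=True)
print(result)  # a list like [{'label': ..., 'score': ...}]
```

Truncation matters here because Longformer has a fixed maximum input length; longer passages are cut to fit rather than raising an error.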