Update README.md
README.md
CHANGED
@@ -1,3 +1,35 @@
 ---
-license:
+license: afl-3.0
+language:
+- ru
+library_name: transformers
+pipeline_tag: text2text-generation
+tags:
+- humor
+- T5
+- jokes-generation
 ---
+
+
+## Task
+This model was created for joke generation in Russian.
+Generating jokes from scratch is too difficult a task. To make it easier, jokes were split into setup and punch pairs.
+Each setup can produce an infinite number of punches, so an inspiration was also introduced,
+which is the main idea (or main word) of the punch for a given setup. In the real world, jokes come in different qualities (bad, good, funny, ...).
+Therefore, so that the model can distinguish them from each other, a mark was introduced. It ranges from 0 (not a joke) to 5 (golden joke).
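
The four pieces described in this section (setup, punch, inspiration, mark) can be pictured as one annotated record. The sketch below is only an illustration: the class name `JokeSample`, its field names, and the English example joke are assumptions made for readability and are not taken from the model's actual dataset schema. The only detail grounded in the README is the 0 to 5 range of the mark.

```python
from dataclasses import dataclass

# From the README: 0 means "not a joke", 5 means "golden joke".
MIN_MARK, MAX_MARK = 0, 5


@dataclass(frozen=True)
class JokeSample:
    """One annotated joke. Field names are hypothetical, not the author's schema."""

    setup: str        # first part of the joke
    punch: str        # punchline that completes the setup
    inspiration: str  # main idea (or main word) of the punch
    mark: int         # quality score in [MIN_MARK, MAX_MARK]

    def __post_init__(self) -> None:
        # Reject marks outside the range stated in the README.
        if not MIN_MARK <= self.mark <= MAX_MARK:
            raise ValueError(
                f"mark must be in [{MIN_MARK}, {MAX_MARK}], got {self.mark}"
            )


# Made-up example record, in English purely for illustration.
sample = JokeSample(
    setup="Why did the programmer quit his job?",
    punch="Because he didn't get arrays.",
    inspiration="arrays",
    mark=3,
)
```

Keeping the mark validation in the record itself means a malformed score fails at construction time rather than silently ending up in training data.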
+
+
+## Info
+The model was trained with Flax on a large dataset of jokes and anecdotes, on several tasks:
+1. Span masking (dataset size: 850K)
+2. Conditional generation: generate an inspiration from a given setup (dataset size: 230K)
+3. Conditional generation: generate a punch from a given setup and inspiration (dataset size: 240K)
+4. Conditional generation: generate a mark from a given setup and punch (dataset size: 200K)
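
The four tasks listed above can be sketched as functions that turn one joke into (input, target) text pairs. This is a rough illustration under stated assumptions, not the author's preprocessing code: the README does not give the prompt format, so the `setup:`, `inspiration:` and `punch:` prefixes are hypothetical, and the span-masking helper assumes the usual T5 span-corruption convention with `<extra_id_N>` sentinel tokens, which the README does not confirm.

```python
from typing import Dict, List, Sequence, Tuple


def mask_spans(words: Sequence[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Task 1 sketch: replace each (start, end) word span with a sentinel.

    Assumes T5-style span corruption; spans are half-open and non-overlapping.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(words[cursor:start])  # keep the text before the span
        inputs.append(sentinel)             # hide the span in the input
        targets.append(sentinel)            # reveal it in the target
        targets.extend(words[start:end])
        cursor = end
    inputs.extend(words[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel, as in T5
    return " ".join(inputs), " ".join(targets)


def build_pairs(setup: str, inspiration: str, punch: str, mark: int) -> Dict[str, Tuple[str, str]]:
    """Tasks 2 to 4 sketch: one (input, target) pair per conditional task.

    The prompt prefixes are hypothetical placeholders.
    """
    return {
        "inspiration": (f"setup: {setup}", inspiration),
        "punch": (f"setup: {setup} inspiration: {inspiration}", punch),
        "mark": (f"setup: {setup} punch: {punch}", str(mark)),
    }


masked_input, masked_target = mask_spans("a b c d e".split(), [(1, 2), (3, 5)])
pairs = build_pairs("S", "I", "P", 4)
```

Casting every task as text-to-text, including the numeric mark, lets a single T5 model handle all four tasks with the same sequence-to-sequence objective.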
+
+
+## Ethical considerations and risks
+The model is fine-tuned on a large corpus of humorous text scraped from websites and Telegram channels with anecdotes, one-liners, and jokes.
+The text was not filtered for explicit content or assessed for existing biases.
+As a result, the model may generate similarly inappropriate content or replicate biases
+present in the underlying data.
+Please don't take its output seriously.