lchaloupsky committed
Commit: c80e40e
1 Parent(s): 78fb005
Update README.md
README.md CHANGED
@@ -1,11 +1,13 @@
 ---
 language: cs
+widget:
+- text: Praha je krásné město
 license: mit
 datasets:
 - oscar
 ---
 
-# Czech
+# Czech GPT-2 small model trained on the OSCAR dataset
 This model was trained as a part of the [master thesis](https://dspace.cuni.cz/handle/20.500.11956/176356?locale-attribute=en) on the Czech part of the [OSCAR](https://huggingface.co/datasets/oscar) dataset.
 
 ## Introduction
@@ -126,7 +128,7 @@ The training data used for this model come from the Czech part of the OSCAR data
 > Because large-scale language models like GPT-2 do not distinguish fact from fiction, we don’t support use-cases that require the generated text to be true. Additionally, language models like GPT-2 reflect the biases inherent to the systems they were trained on, so we do not recommend that they be deployed into systems that interact with humans unless the deployers first carry out a study of biases relevant to the intended use-case. We found no statistically significant difference in gender, race, and religious bias probes between 774M and 1.5B, implying all versions of GPT-2 should be approached with similar levels of caution around use cases that are sensitive to biases around human attributes.
 
 ## Author
-Czech
+Czech-GPT2-OSCAR was trained and evaluated by [Lukáš Chaloupský](https://cz.linkedin.com/in/luk%C3%A1%C5%A1-chaloupsk%C3%BD-0016b8226?original_referer=https%3A%2F%2Fwww.google.com%2F) thanks to the computing power of the GPU (NVIDIA A100 SXM4 40GB) cluster of [IT4I](https://www.it4i.cz/) (VSB - Technical University of Ostrava).
 
 ## Citation
 ```
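For reference, the `widget` entry added to the YAML front matter sets the default prompt shown in the hosted inference widget. The sketch below reproduces that generation locally with the `transformers` text-generation pipeline; the repo id `lchaloupsky/czech-gpt2-oscar` is an assumption inferred from the model name in the Author section, not something stated in the diff, so substitute the actual Hugging Face repo id if it differs.

```python
# Minimal sketch of running the widget prompt locally with transformers.
# NOTE: the repo id below is an assumption inferred from the model name
# "Czech-GPT2-OSCAR" in the Author section; adjust it to the real repo.
from transformers import pipeline

generator = pipeline("text-generation", model="lchaloupsky/czech-gpt2-oscar")

# Same prompt as the new `widget` entry in the front matter
# ("Praha je krásné město" = "Prague is a beautiful city").
outputs = generator("Praha je krásné město", max_length=50, num_return_sequences=1)
print(outputs[0]["generated_text"])
```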