<div style="text-align: center; max-width: 650px; margin: 0 auto;">
<div>
<h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
Porttagger-DANTE
</h1>
<p class="slogan">A Brazilian Portuguese part-of-speech tagger following the <a
href="https://universaldependencies.org/">Universal Dependencies</a> model
</p>
</div>
<p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: justify;">
 Porttagger is a state-of-the-art part-of-speech tagger for Brazilian Portuguese that automatically assigns
morphosyntactic classes to the words of each sentence, following the international Universal Dependencies model.
You may submit a single sentence or a plain-text file containing several sentences to be tagged.
You may also choose which trained model to use. The options include a model trained on news texts (using the
<a href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus), on stock
market tweets (from the <a
href="https://www.kaggle.com/datasets/fernandojvdasilva/stock-tweets-ptbr-emotions">DANTE</a> corpus), on
academic texts from the oil &amp; gas
domain (from the <a
href="https://github.com/UniversalDependencies/UD_Portuguese-PetroGold/blob/master/README.md">PetroGold</a>
corpus), and on all three corpora combined. This initiative is
part of the <a href="https://sites.google.com/icmc.usp.br/poetisa/">POeTiSA</a> project, whose site provides
further information.
For more details about Porttagger, see this <a href="https://sol.sbc.org.br/index.php/stil/article/view/25438/25259">paper</a>.
</p>
</div>