<div style="text-align: center; max-width: 650px; margin: 0 auto;">
<div>
<h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
Porttagger
</h1>
<p class="slogan">A Brazilian Portuguese part-of-speech tagger following Universal Dependencies</p>
</div>
<p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: left;">
Porttagger (Porttinari Part-Of-Speech tagger) was trained on the <a
href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus, a
collection of news articles from the Folha de São Paulo newspaper website. The model is a fine-tuned
version of <a href="https://huggingface.co/neuralmind/bert-base-portuguese-cased">BERTimbau</a> that takes a
sequence of tokens and outputs a part-of-speech tag for each one. Because the model expects tokens rather
than raw text, the input text is first tokenized with <a href="https://spacy.io/models/pt">spaCy's</a>
Portuguese tokenizer.
</p>
</div>
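The two-stage flow described above (tokenize the raw text, then tag the token sequence) can be sketched as follows. This is only an illustration of the data flow, not the Space's actual code: `simple_tokenize` is a regex stand-in for spaCy's tokenizer, `toy_tag` is a lookup-table stand-in for the fine-tuned BERTimbau model, and every name in the block is hypothetical.

```python
import re
from typing import Callable, List, Tuple


def simple_tokenize(text: str) -> List[str]:
    """Stand-in for spaCy's tokenizer (assumption): split words and punctuation."""
    return re.findall(r"\w+|[^\w\s]", text)


# Toy lexicon with Universal Dependencies UPOS tags, for illustration only.
TOY_LEXICON = {
    "o": "DET",
    "governo": "NOUN",
    "anunciou": "VERB",
    "novas": "ADJ",
    "medidas": "NOUN",
    ".": "PUNCT",
}


def toy_tag(tokens: List[str]) -> List[str]:
    """Stand-in for the fine-tuned model (assumption): one tag per input token."""
    return [TOY_LEXICON.get(token.lower(), "X") for token in tokens]


def tag_text(
    text: str,
    tokenize: Callable[[str], List[str]] = simple_tokenize,
    tag: Callable[[List[str]], List[str]] = toy_tag,
) -> List[Tuple[str, str]]:
    """Tokenize raw text, tag the token sequence, and pair tokens with tags."""
    tokens = tokenize(text)
    tags = tag(tokens)
    if len(tags) != len(tokens):
        raise ValueError("the tagger must return exactly one tag per token")
    return list(zip(tokens, tags))


if __name__ == "__main__":
    for token, upos in tag_text("O governo anunciou novas medidas."):
        print(f"{token}\t{upos}")
```

Keeping the tokenizer and the tagger as separate callables mirrors the description: the tagger only ever sees a list of tokens, so either stand-in could be swapped for the real component without changing `tag_text`.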