culturay_el_32000

About

🇬🇷 A Greek tokenizer, trained on the Greek (el) subset of the CulturaY dataset.

Description

This is a character-level Modern Greek (el) tokenizer, trained on the corresponding subset of CulturaY. Its vocabulary size is 32,000, a multiple of 128. When the tokenizer is paired with a model, this tends to keep the embedding and output layers at GPU-friendly matrix dimensions without extra padding.
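The "multiple of 128" property can be checked with a few lines of Python. The sketch below is illustrative only: `pad_to_multiple` is a hypothetical helper, not part of this repository or of the `tokenizers` library. It shows the padding a model author would otherwise apply to a vocabulary size that is not already aligned. The benefit of alignment is a hardware-dependent heuristic, so treat it as a rule of thumb.

```python
def pad_to_multiple(vocab_size: int, multiple: int = 128) -> int:
    """Round vocab_size up to the nearest multiple of `multiple`.

    Hypothetical helper for illustration; not part of any library.
    """
    if vocab_size <= 0 or multiple <= 0:
        raise ValueError("vocab_size and multiple must be positive")
    # Ceiling division, then scale back up.
    return -(-vocab_size // multiple) * multiple


VOCAB_SIZE = 32_000  # vocabulary size stated in this model card

# 32,000 is already aligned, so no padding rows are needed in the embedding.
is_aligned = VOCAB_SIZE % 128 == 0
padded = pad_to_multiple(VOCAB_SIZE)

# For comparison, a size that is not aligned (GPT-2's 50,257) gets rounded up.
padded_gpt2 = pad_to_multiple(50_257)
```

With a vocabulary of 32,000 the padded size equals the original size, so the embedding table needs no unused rows.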

Usage

import tokenizers

tokenizer = tokenizers.Tokenizer.from_pretrained("gvlassis/culturay_el_32000")
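Once loaded, the tokenizer can be used to encode and decode text. The following is a minimal sketch, assuming the repository ships a `tokenizer.json` that `Tokenizer.from_pretrained` can fetch and that network access is available. `load_tokenizer` and `roundtrip` are hypothetical helper names introduced here for illustration.

```python
def load_tokenizer(repo_id: str = "gvlassis/culturay_el_32000"):
    """Download the tokenizer from the Hugging Face Hub (needs network access)."""
    # Imported lazily so this module can be imported without the library installed.
    from tokenizers import Tokenizer

    return Tokenizer.from_pretrained(repo_id)


def roundtrip(tokenizer, text: str):
    """Encode `text`, then decode the resulting ids.

    Returns a tuple (tokens, ids, decoded_text).
    """
    encoding = tokenizer.encode(text)         # Encoding object
    decoded = tokenizer.decode(encoding.ids)  # ids back to a string
    return encoding.tokens, encoding.ids, decoded
```

For example, `roundtrip(load_tokenizer(), "Καλημέρα κόσμε")` returns the token strings, their integer ids, and the decoded text. Whether the decoded text matches the input exactly depends on the tokenizer's normalizer and decoder settings.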