---
license: mit
language:
- af
- az
- be
- bg
- bn
- ca
- cs
- cy
- da
- de
- el
- en
- eo
- es
- et
- eu
- fa
- fi
- fr
- fy
- ga
- gl
- gu
- he
- hi
- hu
- hy
- id
- is
- it
- ka
- kk
- ky
- la
- lt
- lv
- mg
- mk
- ml
- mt
- nl
- pa
- pl
- pt
- ro
- ru
- sk
- sq
- sv
- ta
- te
- th
- tr
- uk
- yi
- yo
datasets:
- benjamin/compoundpiece
---

Compound normalization model from [CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models](https://arxiv.org/abs/2305.14214).

## Usage

```
from transformers import pipeline

pipe = pipeline("text2text-generation", "benjamin/compoundpiece")
pipe("Hauswirtschaftslehre", max_length=32)
# [{'generated_text': 'Haus-Wirtschaft-Lehre'}]
```

# Citation

```
@article{minixhofer2023compoundpiece,
  title={CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models},
  author={Minixhofer, Benjamin and Pfeiffer, Jonas and Vuli{\'c}, Ivan},
  journal={arXiv preprint arXiv:2305.14214},
  year={2023}
}
```

# License

MIT
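
# Splitting the output

The example output in the Usage section separates constituents with hyphens (`'Haus-Wirtschaft-Lehre'`). If a list of constituents is more convenient than a single string, a small helper along these lines may be enough. This is a minimal sketch, not part of the model or of `transformers`: the names `split_constituents` and `constituents_from_pipeline` are hypothetical, and it assumes that `-` is the only delimiter and that no constituent contains a hyphen itself.

```python
from typing import Dict, List


def split_constituents(generated_text: str, delimiter: str = "-") -> List[str]:
    """Split a delimited model output into its constituents.

    Assumption: the delimiter is "-", as in the example output in the
    Usage section. Empty parts (e.g. from a trailing delimiter) are dropped.
    """
    return [part for part in generated_text.split(delimiter) if part]


def constituents_from_pipeline(outputs: List[Dict[str, str]]) -> List[List[str]]:
    """Apply split_constituents to each dict returned by the pipeline."""
    return [split_constituents(o["generated_text"]) for o in outputs]


# Uses the output shown in the Usage section, so no model download is needed.
example_output = [{"generated_text": "Haus-Wirtschaft-Lehre"}]
print(constituents_from_pipeline(example_output))
# [['Haus', 'Wirtschaft', 'Lehre']]
```

With a loaded pipeline, the same helper could be applied as `constituents_from_pipeline(pipe("Hauswirtschaftslehre", max_length=32))`. Words that legitimately contain a hyphen would be split as well under this assumption, so the helper should be adapted if that matters for your data.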