---
license: cc-by-sa-4.0
language:
- cs
pipeline_tag: text-generation
base_model: BUT-FIT/CSTinyLlama-1.2B
widget:
- text: '<|AUTHOR|> Šmilovský, Alois Vojtěch'
  example_title: Alois Vojtech Smilovsky
- text: '<|AUTHOR|> Vrchlický, Jaroslav'
  example_title: Jaroslav Vrchlicky
- text: '<|AUTHOR|> Adámek, Bohumil'
  example_title: Bohumil Adamek
---

### Czech Poetry TinyLlama

TinyLlama (base model `BUT-FIT/CSTinyLlama-1.2B`) fine-tuned on Czech poetry from the GitHub project by the Institute of Czech Literature, Czech Academy of Sciences.

https://github.com/versotym/corpusCzechVerse

## Usage

Use it like any other causal language model. Start the prompt with the `<|AUTHOR|>` tag followed by the poet's name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("jinymusim/TinyLlama-Czech-Poet")
model = AutoModelForCausalLM.from_pretrained("jinymusim/TinyLlama-Czech-Poet")

# Prompt: the author tag followed by the poet's name
poet_start = '<|AUTHOR|> Adámek, Bohumil'
poet_start = poet_start.strip()

tokenized_poet_start = tokenizer.encode(poet_start, return_tensors='pt')

# Generate a continuation of the prompt by top-k sampling
out = model.generate(tokenized_poet_start,
                     max_length=256,
                     do_sample=True,
                     top_k=50,
                     early_stopping=True,
                     pad_token_id=tokenizer.pad_token_id,
                     eos_token_id=tokenizer.eos_token_id)

# Decode the generated poem
decoded_cont = tokenizer.decode(out[0], skip_special_tokens=True)

print(decoded_cont)
```

## Structure of outputs

Outputs are structured in the following way:

```
<|AUTHOR|> AUTHOR
<|TITLE|> TITLE
<|YEAR|> YEAR
<|STROPHE_START|>
<|METER|> METER <|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_END|>
<|STROPHE_START|>
<|METER|> METER <|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_START|>
```
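
The tagged layout above can be split back into fields with a few regular expressions. The following is only a minimal sketch, not part of the model's own tooling: `parse_poem` is a hypothetical helper, and it assumes that the tags are still present in the decoded string (if the tokenizer treats them as special tokens, decoding with `skip_special_tokens=False` would be needed), that the rhyme scheme ends at the end of its line, and that each verse sits on its own line. The meter and rhyme values in the sample (`J`, `ABAB`) and the verse text are made-up placeholders.

```python
import re

def parse_poem(text):
    """Parse tagged output into header fields and a list of strophes (hypothetical helper)."""
    # Everything before the first strophe tag is treated as the poem header.
    head, *chunks = text.split("<|STROPHE_START|>")
    poem = {"author": None, "title": None, "year": None, "strophes": []}
    for field in ("AUTHOR", "TITLE", "YEAR"):
        # A field value runs until the next tag or the end of the header.
        m = re.search(r"<\|" + field + r"\|>\s*(.*?)\s*(?=<\||$)", head, flags=re.S)
        if m:
            poem[field.lower()] = m.group(1)
    for chunk in chunks:
        # The end tag may be missing, e.g. when generation stops at max_length.
        chunk = chunk.replace("<|STROPHE_END|>", "").strip()
        if not chunk:
            continue
        # Assumption: the rhyme scheme occupies the rest of its line.
        m = re.match(r"<\|METER\|>\s*(.*?)\s*<\|RHYME\|>[ \t]*([^\n]*)\n?(.*)",
                     chunk, flags=re.S)
        if m:
            meter, rhyme, body = m.group(1), m.group(2).strip(), m.group(3)
        else:
            meter, rhyme, body = None, None, chunk
        verses = [ln.strip() for ln in body.splitlines() if ln.strip()]
        poem["strophes"].append({"meter": meter, "rhyme": rhyme, "verses": verses})
    return poem

# Placeholder text in the documented layout; not real model output.
sample = (
    "<|AUTHOR|> Adámek, Bohumil\n<|TITLE|> TITLE\n<|YEAR|> 1900\n"
    "<|STROPHE_START|>\n<|METER|> J <|RHYME|> ABAB\n"
    "first verse\nsecond verse\n<|STROPHE_END|>\n"
)
poem = parse_poem(sample)
print(poem["author"], poem["year"], len(poem["strophes"]))
```

Splitting on `<|STROPHE_START|>` rather than matching start/end pairs keeps the sketch tolerant of a truncated final strophe.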