---
license: apache-2.0
datasets:
- simecek/wikipedie_20230601
language:
- cs
---

This is a [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) model fine-tuned with 4-bit QLoRA on Czech Wikipedia data. The model is primarily designed for further fine-tuning on Czech-specific NLP tasks, including summarization and question answering; this adaptation improves performance on tasks that require an understanding of the Czech language and context.
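
As a minimal sketch of such further fine-tuning, the model can be wrapped with LoRA adapters via the `peft` library. The adapter hyperparameters and target modules below are illustrative assumptions, not the settings used to train this model:

```python
# A minimal LoRA fine-tuning sketch; hyperparameter values here are
# illustrative assumptions, not taken from this model's training run.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = AutoModelForCausalLM.from_pretrained(
    "simecek/cswikimistral_0.1", device_map="auto", load_in_4bit=True
)
base = prepare_model_for_kbit_training(base)  # prepare the 4-bit model for training

lora_config = LoraConfig(
    r=16,                    # adapter rank (assumed)
    lora_alpha=32,           # scaling factor (assumed)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# From here, train with transformers.Trainer or a custom loop on a Czech task dataset.
```

Only the adapter weights are updated, which keeps memory requirements modest on top of the 4-bit quantized base.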

For the exact QLoRA parameters, see the Axolotl [YAML config file](cswiki-mistral7.yml).

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

**Example usage:**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "simecek/cswikimistral_0.1"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and the model quantized to 4 bits (requires the bitsandbytes package).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_4bit=True)

def generate_text(prompt, max_new_tokens=50):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    attention_mask = inputs["attention_mask"]
    input_ids = inputs["input_ids"]

    output = model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_new_tokens=max_new_tokens,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
    )

    return tokenizer.decode(output[0], skip_special_tokens=True)

prompt = "Hlavní město České republiky je"  # "The capital of the Czech Republic is"
generated_text = generate_text(prompt, max_new_tokens=5)
print(generated_text)
```
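
With greedy decoding as above, the completion should typically continue with "Praha" (Prague), though the exact output depends on the checkpoint and generation settings.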