---
license: apache-2.0
language:
- el
pipeline_tag: text-generation
---

# Model Description
This is an instruction-tuned model based on gsar78/GreekLlama-1.1B-base.

The fine-tuning dataset consists of 52k instruction/response pairs, all in Greek; a purely illustrative sketch of one such pair is shown below.
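
The dataset itself is not published with this card, so the following is only a hypothetical sketch of what a single pair might look like; the field names and text are assumptions, not the actual schema.

```python
# Hypothetical instruction/response pair (illustrative only; the real
# dataset's field names and contents are not published with this card).
example_pair = {
    # "Explain what machine learning is."
    "instruction": "Εξήγησε τι είναι η μηχανική μάθηση.",
    # "Machine learning is a branch of artificial intelligence ..."
    "response": "Η μηχανική μάθηση είναι κλάδος της τεχνητής νοημοσύνης ...",
}
```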
Notice: This model is intended for experimental and research purposes only.

# Usage

To use the model, run the following in a Colab notebook configured with a GPU:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("gsar78/GreekLlama-1.1B-it")
model = AutoModelForCausalLM.from_pretrained("gsar78/GreekLlama-1.1B-it")

# Check if CUDA is available and move the model to the GPU if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# "What are the two basic things I should know about Artificial Intelligence:"
prompt = "Ποιά είναι τα δύο βασικά πράγματα που πρέπει να γνωρίζω για την Τεχνητή Νοημοσύνη:"

# Tokenize the input prompt and move it to the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generate the output
generation_params = {
    "max_new_tokens": 250,   # Maximum number of new tokens to generate
    "do_sample": True,       # Enable sampling to diversify outputs
    "temperature": 0.1,      # Sampling temperature
    "top_p": 0.9,            # Nucleus sampling
    "num_return_sequences": 1,
}

output = model.generate(**inputs, **generation_params)

# Decode the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("Generated Text:")
print(generated_text)
```
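
Alternatively, the same generation can be wrapped in the high-level `pipeline` API; this is a minimal sketch equivalent to the example above, using the same model ID and sampling parameters.

```python
import torch
from transformers import pipeline

# Build a text-generation pipeline; device=0 selects the first GPU, -1 the CPU.
generator = pipeline(
    "text-generation",
    model="gsar78/GreekLlama-1.1B-it",
    device=0 if torch.cuda.is_available() else -1,
)

result = generator(
    "Ποιά είναι τα δύο βασικά πράγματα που πρέπει να γνωρίζω για την Τεχνητή Νοημοσύνη:",
    max_new_tokens=250,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)

print(result[0]["generated_text"])
```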