gsar78 committed
Commit 57f812d
1 Parent(s): 379f827

Create README.md

Files changed (1):
  1. README.md +54 -0
README.md ADDED:

---
license: apache-2.0
language:
- el
pipeline_tag: text-generation
---
# Model Description

This is an instruction-tuned model based on the gsar78/GreekLlama-1.1B-base model.
The fine-tuning dataset consists of 52k instruction/response pairs, all in Greek.
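
For illustration, a single pair in such a dataset might look like the sketch below. This is a hypothetical example: the field names (`instruction`, `response`) and the sample text are assumptions, not taken from the actual training data.

```python
# Hypothetical shape of a single training pair (field names assumed,
# not taken from the actual dataset).
example_pair = {
    # "Write a short summary about the Acropolis."
    "instruction": "Γράψε μια σύντομη περίληψη για την Ακρόπολη.",
    # "The Acropolis is an ancient monument in Athens..."
    "response": "Η Ακρόπολη είναι ένα αρχαίο μνημείο στην Αθήνα...",
}
```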

Notice: This model is intended for experimental and research purposes only.

# Usage

To use the model, run the following in a Colab notebook configured with a GPU:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("gsar78/GreekLlama-1.1B-it")
model = AutoModelForCausalLM.from_pretrained("gsar78/GreekLlama-1.1B-it")

# Check if CUDA is available and move the model to GPU if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# "What are the two basic things I should know about Artificial Intelligence:"
prompt = "Ποιά είναι τα δύο βασικά πράγματα που πρέπει να γνωρίζω για την Τεχνητή Νοημοσύνη:"

# Tokenize the input prompt and move the tensors to the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generation settings
generation_params = {
    "max_new_tokens": 250,    # Upper bound on newly generated tokens; adjust as needed
    "do_sample": True,        # Enable sampling to diversify outputs
    "temperature": 0.1,       # Low temperature keeps outputs close to the most likely tokens
    "top_p": 0.9,             # Nucleus sampling
    "num_return_sequences": 1,
}

# Generate the output
output = model.generate(**inputs, **generation_params)

# Decode the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print("Generated Text:")
print(generated_text)
```
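
Alternatively, the same generation can be done through the higher-level `pipeline` API. The following is a minimal sketch assuming a recent `transformers` release with `accelerate` installed (required for `device_map="auto"`); it is not part of the original instructions:

```python
from transformers import pipeline
import torch

# Build a text-generation pipeline; device_map="auto" places the model
# on the GPU when one is available, otherwise on the CPU.
generator = pipeline(
    "text-generation",
    model="gsar78/GreekLlama-1.1B-it",
    torch_dtype=torch.float16,  # assumption: half precision suffices for this 1.1B model on a Colab GPU
    device_map="auto",
)

# Same prompt and sampling settings as the example above.
result = generator(
    "Ποιά είναι τα δύο βασικά πράγματα που πρέπει να γνωρίζω για την Τεχνητή Νοημοσύνη:",
    max_new_tokens=250,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)
print(result[0]["generated_text"])
```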