---
license: other
inference: false
---

**Sailor 1.8B AWQ**

- Model creator: Sea AI Lab
- Original model: Sailor 1.8B

Sailor is a suite of open language models tailored for South-East Asia (SEA), focusing on languages such as 🇮🇩Indonesian, 🇹🇭Thai, 🇻🇳Vietnamese, 🇲🇾Malay, and 🇱🇦Lao. Developed with careful data curation, Sailor models are designed to understand and generate text across the diverse linguistic landscapes of the SEA region. Built on Qwen 1.5, the Sailor family spans sizes from 0.5B to 7B to suit different requirements. The base models are further fine-tuned on open-source datasets to produce instruction-tuned models, namely Sailor-Chat. Benchmarking results demonstrate Sailor's proficiency in question answering, commonsense reasoning, and other tasks in SEA languages.

**Description**

This repo contains AWQ-format model files for Sailor 1.8B.

**Prompt Format**

```
prompt_template = "{prompt}"
```
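
This is a base (non-chat) model, so the template simply passes the prompt through unchanged. Applying it is a plain string substitution, sketched here with an example prompt:

```
# The base model uses a pass-through template: the raw prompt is the input.
prompt_template = "{prompt}"
prompt = prompt_template.format(prompt="Artificial intelligence is")
print(prompt)  # Artificial intelligence is
```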

**Quickstart**

The following code snippet shows how to load the tokenizer and model and how to generate text. Note that loading AWQ checkpoints through `transformers` requires the `autoawq` package to be installed.

- Using transformers

```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Matheusuz/Sailor-1.8B-AWQ"

# Model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    device_map="cuda:0"
)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prompt
prompt = "Artificial intelligence is"

# Convert prompt to tokens
tokens = tokenizer(
    prompt,
    return_tensors='pt'
).input_ids.cuda()

# Generation parameters
generation_params = {
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.95,
    "top_k": 40,
    "max_new_tokens": 512,
    "repetition_penalty": 1.1
}

# Generation
generation_output = model.generate(
    tokens,
    **generation_params
)

# Decode the output tokens and print them
token_output = generation_output[0]
text_output = tokenizer.decode(token_output, skip_special_tokens=True)
print(text_output)
```
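
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `generation_output[0]` reprints the prompt. To keep only the continuation, slice off the prompt length before decoding; a minimal sketch with made-up token IDs:

```
# Illustrative only: the token IDs below are hypothetical, not real vocab entries.
prompt_ids = [872, 3156, 11]            # tokens fed in as the prompt
output_ids = prompt_ids + [374, 1077]   # generate() echoes the prompt, then appends
new_ids = output_ids[len(prompt_ids):]  # drop the echoed prompt
print(new_ids)  # [374, 1077]
```

In the snippet above this corresponds to decoding `generation_output[0][tokens.shape[1]:]` instead of the full sequence.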

**License**

Sailor is distributed under the terms of the Qwen License.