Sailor 1.8B AWQ

  • Model creator: Sea AI Lab
  • Original model: Sailor 1.8B

Sailor is a suite of Open Language Models tailored for South-East Asia (SEA), focusing on languages such as 🇮🇩Indonesian, 🇹🇭Thai, 🇻🇳Vietnamese, 🇲🇾Malay, and 🇱🇦Lao. Developed with careful data curation, Sailor models are designed to understand and generate text across the diverse linguistic landscape of the SEA region. Built from Qwen 1.5, Sailor encompasses models of varying sizes, from 0.5B to 7B, to suit different requirements. We further fine-tune the base models on open-source datasets to obtain instruction-tuned models, named Sailor-Chat. Benchmarking results demonstrate Sailor's proficiency in question answering, commonsense reasoning, and other tasks in SEA languages.

Description

This repo contains AWQ-format model files for Sea AI Lab's Sailor 1.8B. AWQ (Activation-aware Weight Quantization) is an efficient low-bit (currently 4-bit) weight quantization method for fast GPU inference.

Prompt Format

prompt_template = "{prompt}"
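
Sailor 1.8B is a base (completion-style) model, so text is passed to it as-is, with no chat markup. A minimal illustration of the template above (variable names here are for exposition only):

prompt_template = "{prompt}"
prompt = prompt_template.format(prompt="Artificial intelligence is")
# prompt == "Artificial intelligence is"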

Quickstart

The following code snippet shows how to load the tokenizer and model, and how to generate text. Note that loading AWQ checkpoints with transformers requires the autoawq package (pip install autoawq).

  • Using transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Matheusuz/Sailor-1.8B-AWQ"

# Load the AWQ-quantized model onto the first GPU
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    device_map="cuda:0"
)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prompt (the template is simply "{prompt}", so the raw text is used directly)
prompt_template = "Artificial intelligence is"

# Convert prompt to tokens
tokens = tokenizer(
    prompt_template,
    return_tensors='pt'
).input_ids.cuda()

# Generation parameters
generation_params = {
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.95,
    "top_k": 40,
    "max_new_tokens": 512,
    "repetition_penalty": 1.1
}

# Generation
generation_output = model.generate(
    tokens,
    **generation_params
)

# Get the tokens from the output, decode them, print them
token_output = generation_output[0]
text_output = tokenizer.decode(token_output)
print(text_output)
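
  • Using AutoAWQ

Alternatively, you can load the checkpoint with the AutoAWQ library directly. The following is a minimal sketch, assuming the autoawq package is installed and a CUDA GPU is available; it mirrors the sampling settings of the transformers example above.

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_name = "Matheusuz/Sailor-1.8B-AWQ"

# Load the quantized weights; fuse_layers speeds up inference on supported GPUs
model = AutoAWQForCausalLM.from_quantized(model_name, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize the prompt and move it to the GPU
tokens = tokenizer("Artificial intelligence is", return_tensors="pt").input_ids.cuda()

# Same sampling settings as above
output = model.generate(
    tokens,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    max_new_tokens=512,
    repetition_penalty=1.1
)
print(tokenizer.decode(output[0]))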

License

Sailor is distributed under the terms of the Qwen License.
