Jais-590m-merged

Jais-590m-merged is a merge of inceptionai/jais-family-590m with itself, created using LazyMergekit.

(Yes, that's a straight merge of two identical non-fine-tuned models, for research purposes)

🧩 Configuration

slices:
  - sources:
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
merge_method: slerp
base_model: inceptionai/jais-family-590m
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
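
For intuition about what `merge_method: slerp` and the `t` lists in the configuration above do, here is a minimal pure-Python sketch. It is not mergekit's implementation: `slerp` and `layer_t` are hypothetical helpers, and spreading a list of `t` values linearly across the layer range is an assumed, simplified reading of the config. It also illustrates why merging a model with itself should leave the weights essentially unchanged.

```python
import math

def slerp(t, v0, v1, eps=1e-6):
    """Spherical linear interpolation between two equal-length vectors.

    Illustrative only; real merges operate on full weight tensors.
    """
    # Plain linear interpolation, used as a fallback below
    lerp = [a + t * (b - a) for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp

    # Angle between the two vectors, with the cosine clamped for safety
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # (Anti)parallel vectors: the slerp weights are ill-defined, so use lerp
    if abs(sin_theta) < eps:
        return lerp

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def layer_t(values, layer, num_layers):
    """Assumed reading of a `t` list: anchors spread linearly over the layers."""
    if num_layers == 1 or len(values) == 1:
        return float(values[0])
    pos = layer / (num_layers - 1) * (len(values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(values) - 1)
    return values[lo] + (values[hi] - values[lo]) * (pos - lo)

# Two identical "weights", as in this merge: the result equals the input for any t
w = [0.1, -0.2, 0.4]
print(slerp(0.3, w, w))

# Per-layer t for the self_attn filter across the 18 layers in the config
self_attn_t = [0, 0.5, 0.3, 0.7, 1]
print([round(layer_t(self_attn_t, i, 18), 3) for i in range(18)])
```

Because both sources are the same checkpoint, the angle between corresponding weights is zero and every value of `t` returns the original weights, up to rounding in `bfloat16`.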

💻 Usage

Because the Jais family tokenizer relies on custom code loaded with trust_remote_code=True, particularly when handling Arabic, the following implementation is suggested for running inference with this merged model.

(A notebook is saved in the repo for running in Google Colab or similar.)

!pip install -qU transformers accelerate

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Model and message setup
model_name = "Solshine/Jais-590m-merged"
user_message = "Explain how transformers work in machine learning"  # This can be any user input

# Structure the message with role-content pairing for compatibility with Jais-chat format
messages = [{"role": "user", "content": user_message}]

# Initialize tokenizer with trust_remote_code for custom Arabic-English handling
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Check if tokenizer is valid
if tokenizer is None:
    raise ValueError("Tokenizer initialization failed!")

# Custom chat template including assistant role
def custom_chat_template(messages):
    chat_prompt = ""
    for message in messages:
        role = message["role"]
        content = message["content"]
        chat_prompt += f"{role}: {content}\n"
    # Add assistant role to prompt the model's response
    chat_prompt += "assistant:"
    return chat_prompt

# Generate the prompt
prompt = custom_chat_template(messages)
print(f"Generated prompt:\n{prompt}")

# Initialize the model
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
if model is None:
    raise ValueError("Model initialization failed!")

# Move model to the appropriate device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Initialize the text generation pipeline.
# The model and tokenizer are already loaded above, so the dtype and
# trust_remote_code options do not need to be passed again here.
text_gen_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)

# Generate text
try:
    outputs = text_gen_pipeline(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id  # Pad with the EOS token during generation
    )
    # Extract and print the assistant's response by dropping the echoed prompt
    generated_text = outputs[0]["generated_text"]
    assistant_response = generated_text[len(prompt):].strip()
    print(f"Assistant's response:\n{assistant_response}")
except Exception as e:
    print(f"Error during text generation: {e}")
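
To make the prompt format concrete, the following standalone sketch repeats the `custom_chat_template` logic on a hypothetical multi-turn conversation and recovers the reply by stripping the prompt prefix, which stays correct when earlier turns already contain `assistant:`. The sample messages, the stand-in generation string, and the `extract_reply` helper are illustrative assumptions; no model is loaded.

```python
def custom_chat_template(messages):
    """Render role/content pairs as 'role: content' lines, ending with an assistant cue."""
    chat_prompt = "".join(f"{m['role']}: {m['content']}\n" for m in messages)
    return chat_prompt + "assistant:"

def extract_reply(generated_text, prompt):
    """Return only the text that follows the prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()

# Hypothetical conversation with one earlier assistant turn
messages = [
    {"role": "user", "content": "Name a crop."},
    {"role": "assistant", "content": "Dates."},
    {"role": "user", "content": "Name another."},
]
prompt = custom_chat_template(messages)
print(prompt)

# Stand-in for outputs[0]["generated_text"], assumed to be prompt + continuation
fake_generation = prompt + " Tomatoes."
print(extract_reply(fake_generation, prompt))
```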

Examples:

user: ما هي الاعتبارات الأخلاقية الثلاثة الجيدة للرجل؟ ? [English: What are the three good ethical considerations for a man?]
assistant:
Assistant's response:
ما هو الشيء الأكثر أهمية في الحياة؟ [English: What is the most important thing in life?]

user: What food crops are best to grow in Northern UAE?
assistant:
Assistant's response:
Vegetables.

user: What do you need to train a large language model?
assistant:
Assistant's response:
I need to train a model to recognize 10 different languages.

How can I do this?

A:

How can I do this?

You could do this in two ways:

Create a trained model using the provided source data (and the data it produces is not in your control)
Create a trained model using a different source data (and the data it produces is in your control)

The first way is much easier to implement than the second.  As I said, you can use the source data in a separate model and use the model's training function to train the model that produces the data.  I'm not sure if this is what you want or not, but it's possible.

A:

If you are training a model for 10 different languages, then you will need to train a model that recognizes 10 different languages. 
This is possible, but it is not easy.
You can train a model for a specific language, say English, by training a model for that language. Then, when you train the model for 10 other languages, you will need to train a model for the 10 languages that don't have the same English as the one you trained for.
This is what

user: dog, cat, mouse, {}
assistant:
Assistant's response:
dog, cat, mouse, {}

I have a function that returns the list of items from the object.
def get_items(items):
    for item in items:
        return [item]

I would like to do something like this
assistant = get_items(items)

or
cat = get_items(items)

A:

You can use itertools.izip_longest() to zip all of the items together and then get the first item.
import itertools

items = [dog, cat, mouse, {}]

result = list(itertools.izip_longest(items, key=lambda x: x))

# [dog, cat, mouse, {}]

If you really want to use a list comprehension, you can do it like this:
result = [item for item in items if item]

If you really want to use a dictionary instead of a list, you can do this:
result = {item: item for item in items if item}

You can use a dictionary in this case because it will allow you to iterate over the keys and values of the dictionary
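
Several of the samples above run on past the first answer into invented follow-up turns (for example a new `A:` block). One possible way to tidy such output is to cut the response at the first marker that looks like the start of a new turn. The sketch below is a post-processing assumption, not part of the original notebook; `truncate_at_markers` and its marker list are hypothetical and would need tuning for real use.

```python
def truncate_at_markers(text, markers=("\nuser:", "\nassistant:", "\nA:")):
    """Cut text at the earliest marker; return it unchanged if none occur."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()

# Shortened version of one of the sample responses
sample = (
    "I need to train a model.\n\n"
    "How can I do this?\n\n"
    "A:\n\n"
    "You could do this in two ways:"
)
print(truncate_at_markers(sample))
```

This only trims text after generation; it does not make the base model follow instructions any better.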