RombUltima-32B

FINGU-AI/RombUltima-32B is a merged model combining rombodawg/Rombos-LLM-V2.5-Qwen-32b and Sakalti/ultiima-32B. The merge preserves the individual strengths of both parent models while benefiting from an optimized fusion, yielding improved reasoning, multilingual comprehension, and multi-turn conversation capabilities.


Training & Fine-Tuning

RombUltima-32B is based on a linear merge of its parent models using equal weighting (0.5 each), resulting in a balanced fusion that leverages both structured knowledge from Rombos and enhanced generalization from Ultima.
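
Conceptually, a linear merge simply averages corresponding parameters of the two parents. The sketch below illustrates that arithmetic only and is an assumption for illustration; the actual merge was produced with dedicated merge tooling that also handles sharded checkpoints and tokenizer alignment.

import torch

def linear_merge(state_dict_a, state_dict_b, weight_a=0.5):
    # Hypothetical helper: average two state dicts tensor-by-tensor.
    # Assumes both parents share the same architecture and parameter names.
    merged = {}
    for name, tensor_a in state_dict_a.items():
        tensor_b = state_dict_b[name]
        merged[name] = (weight_a * tensor_a.float() + (1.0 - weight_a) * tensor_b.float()).to(torch.float16)
    return merged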

  • Tokenization Approach: Uses a union-based tokenizer merge to maximize vocabulary coverage across both parents (see the quick check after this list).
  • Precision: Merged and stored in float16 for efficient inference.
  • Long-Context Support: Handles context windows of up to 32K tokens (inherited from the Qwen-32B base), with stable generation up to roughly 8K tokens depending on hardware constraints.
  • Multilingual Strength: Strong performance in English, French, Chinese, and other widely spoken languages.
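
Because the tokenizer is union-merged, its vocabulary should cover both parents. A quick, optional sanity check (the parent model ID is taken from the Merging Details section; the comparison is purely illustrative):

from transformers import AutoTokenizer

merged_tok = AutoTokenizer.from_pretrained("FINGU-AI/RombUltima-32B")
parent_tok = AutoTokenizer.from_pretrained("Sakalti/ultiima-32B")

# The union tokenizer's vocabulary should be at least as large as the parent's
print(len(merged_tok), len(parent_tok))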

Performance & Benchmarks

OpenLLM Leaderboard

πŸ“Œ Coming Soon – Evaluation against leading LLM benchmarks.

MT-Bench

πŸ“Œ Coming Soon – Multi-turn conversational performance analysis.


Usage

You can run this model using the following code:

import torch
import transformers
from transformers import AutoTokenizer

# Build a chat-formatted prompt from a system and a user message
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained("FINGU-AI/RombUltima-32B")
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Create the text-generation pipeline (float16 weights, automatic device placement)
pipe = transformers.pipeline(
    "text-generation",
    model="FINGU-AI/RombUltima-32B",
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate a response with nucleus sampling
sequences = pipe(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_new_tokens=200,
)
print(sequences[0]["generated_text"])
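
Because multi-turn conversation is one of the model's advertised strengths, the same chat template can be reused across turns. A minimal sketch continuing the example above (the follow-up question is illustrative; pipe, tokenizer, messages, prompt, and sequences are the variables defined earlier):

# Append the assistant's reply (strip the echoed prompt) and ask a follow-up question
messages.append({"role": "assistant", "content": sequences[0]["generated_text"][len(prompt):]})
messages.append({"role": "user", "content": "Give a one-sentence example of how one is used."})

followup_prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
followup = pipe(followup_prompt, do_sample=True, temperature=0.7, top_p=0.9, max_new_tokens=200)
print(followup[0]["generated_text"])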

Merging Details

  • Parent Models:
    • 🟒 rombodawg/Rombos-LLM-V2.5-Qwen-32b (weight: 0.5)
    • 🟒 Sakalti/ultiima-32B (weight: 0.5)
  • Merge Method: Linear
  • Tokenizer Source: Union-based
  • Precision: Float16
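
These settings correspond to a mergekit-style linear merge. The snippet below writes a plausible reconstruction of such a config (an assumption based on the parameters listed above, not the exact file used to produce this model), which could then be passed to mergekit's mergekit-yaml command:

import yaml  # requires: pip install pyyaml mergekit

# Plausible mergekit config reconstructed from the details above (assumption, not the original file)
merge_config = {
    "merge_method": "linear",
    "dtype": "float16",
    "tokenizer_source": "union",
    "models": [
        {"model": "rombodawg/Rombos-LLM-V2.5-Qwen-32b", "parameters": {"weight": 0.5}},
        {"model": "Sakalti/ultiima-32B", "parameters": {"weight": 0.5}},
    ],
}

with open("rombultima-linear.yaml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)

# Example invocation: mergekit-yaml rombultima-linear.yaml ./RombUltima-32B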

Licensing & Intended Use

  • License: Subject to the original licenses of the merged parent models.
  • Intended Use: Research, content generation, multilingual applications, and general-purpose AI assistance.
  • Limitations: While the model performs well at structured reasoning and multilingual understanding, hallucinations and biases may still occur.

πŸ“Œ For feedback and contributions, visit: FINGU-AI on Hugging Face.
