Self-merging Meta-Llama-3-70B into a 120B-parameter model has been reported to improve performance.

To further improve the accuracy of karakuri-ai/karakuri-lm-8x7b-chat-v0.1, a high-quality Japanese LLM, we performed a self-extension merge, expanding `"num_hidden_layers": 32` to 56.

As for the slice interval used in the merge, the non-merged portion is set to 4 layers for this model (Ex-karakuri-8x12B-chat-v2) and to 8 layers for Ex-karakuri-8x12B-chat-v1.
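The expansion works by stacking overlapping windows of the original 32 layers. A minimal sketch of the depth arithmetic (the function name is illustrative, not part of mergekit):

```python
def merged_depth(total_layers: int, window: int, stride: int) -> int:
    """Depth of a passthrough self-merge built from overlapping
    windows of `window` layers, each advancing by `stride`."""
    n_slices = (total_layers - window) // stride + 1
    return n_slices * window

# The config below uses 4-layer windows with stride 2 over the original 32 layers.
print(merged_depth(32, window=4, stride=2))  # -> 60
# A coarser setting of 8-layer windows with stride 4 yields 56 layers.
print(merged_depth(32, window=8, stride=4))  # -> 56
```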

It was inspired by large self-merges like the one described above. The following configuration was used to produce this model:

```yaml
slices:
- sources:
  - layer_range: [0, 4]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [2, 6]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [4, 8]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [6, 10]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [8, 12]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [10, 14]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [12, 16]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [14, 18]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [16, 20]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [18, 22]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [20, 24]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [22, 26]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [24, 28]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [26, 30]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
- sources:
  - layer_range: [28, 32]
    model: karakuri-ai/karakuri-lm-8x7b-chat-v0.1
merge_method: passthrough
dtype: bfloat16
```
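The slice list above follows a regular pattern, so it can be generated rather than written by hand. A sketch (the helper name is hypothetical, but the output mirrors the config above):

```python
def passthrough_config(model: str, total_layers: int,
                       window: int, stride: int) -> dict:
    """Build a mergekit-style passthrough config from overlapping
    layer windows of the given size and stride."""
    slices = [
        {"sources": [{"layer_range": [s, s + window], "model": model}]}
        for s in range(0, total_layers - window + 1, stride)
    ]
    return {"slices": slices, "merge_method": "passthrough", "dtype": "bfloat16"}

cfg = passthrough_config("karakuri-ai/karakuri-lm-8x7b-chat-v0.1",
                         total_layers=32, window=4, stride=2)
print(len(cfg["slices"]))                              # 15 slices, as listed above
print(cfg["slices"][0]["sources"][0]["layer_range"])   # [0, 4]
print(cfg["slices"][-1]["sources"][0]["layer_range"])  # [28, 32]
```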
Model size: 87.3B params · Tensor type: BF16 (Safetensors)