# merge

This is simplyinquisitive/code-stral-taylors-version, a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged using the passthrough merge method.
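Passthrough copies layers from the source model directly into the merged stack. In the configuration below, each duplicated slice zero-scales its `o_proj` and `down_proj` weights, so both residual branches of the copied layer contribute nothing and the duplicate initially behaves as an identity layer. A toy sketch of that effect (illustrative only, not mergekit's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

def block(x, w_o, w_down, scale_o=1.0, scale_down=1.0):
    """Toy residual block: x + o_proj(attn(x)) + down_proj(mlp(x))."""
    attn_out = np.tanh(x) @ w_o * scale_o             # stand-in for attention
    mlp_out = np.maximum(x, 0) @ w_down * scale_down  # stand-in for the MLP
    return x + attn_out + mlp_out

x = rng.normal(size=(1, d))
w_o = rng.normal(size=(d, d))
w_down = rng.normal(size=(d, d))

normal = block(x, w_o, w_down)               # ordinary layer: transforms x
duplicate = block(x, w_o, w_down, 0.0, 0.0)  # zero-scaled duplicate: passes x through

assert not np.allclose(normal, x)
assert np.allclose(duplicate, x)
```

Because the duplicates start out as identities, the deeper model initially matches the base model's behavior and the extra capacity can be trained afterwards.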

### Models Merged

The following models were included in the merge:

* mistral-community/Codestral-22B-v0.1

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [0, 4]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [3, 4]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [4, 8]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [7, 8]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [8, 12]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [11, 12]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [12, 16]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [15, 16]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [16, 20]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [19, 20]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [20, 24]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [23, 24]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [24, 28]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [27, 28]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [28, 32]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [31, 32]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [32, 36]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [35, 36]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [36, 40]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [39, 40]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [40, 44]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [43, 44]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [44, 48]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [47, 48]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [48, 52]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [51, 52]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [52, 56]
- sources:
  - model: mistral-community/Codestral-22B-v0.1
    layer_range: [55, 56]
    parameters:
      scale:
      - filter: o_proj
        value: 0
      - filter: down_proj
        value: 0
      - value: 1
merge_method: passthrough
dtype: bfloat16
```
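Each of the fourteen four-layer slices is followed by a one-layer duplicated slice, so the merged model stacks 14 × (4 + 1) = 70 decoder layers against the base model's 56. A quick check of the slice arithmetic, assuming the half-open `[start, end)` layer ranges from the configuration above:

```python
# Layer ranges copied from the config: fourteen 4-layer base slices,
# each followed by a 1-layer duplicate of its last layer.
base_slices = [(i, i + 4) for i in range(0, 56, 4)]     # [0,4), [4,8), ... [52,56)
dup_slices = [(i + 3, i + 4) for i in range(0, 56, 4)]  # [3,4), [7,8), ... [55,56)

total = sum(end - start for start, end in base_slices + dup_slices)
print(total)  # 70
```

The resulting 25% increase in depth is consistent with the merged model's ~27.7B parameter count versus the 22B base.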