---
language: en
tags:
- llama
- text-generation
- model-merging
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
---

# llama-3-8b-merged-linear

## Overview
This model is a linear merge of three distinct Llama-3-8B models produced with the Mergekit tool. The goal of the merge is to combine the individual strengths of the base models, such as multilingual capability and specialized domain knowledge, into a single, more versatile general-purpose language model.

Merging the models linearly combines their expertise into a unified model that performs well across text generation, multilingual understanding, and domain-specific tasks.
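
Conceptually, a linear merge is a weighted average of the corresponding weight tensors of the source models. The sketch below illustrates the idea in plain PyTorch; it is not how Mergekit was invoked (the actual merge used the YAML configuration shown under Configuration), and the `linear_merge` helper is purely illustrative.

```python
import torch


def linear_merge(state_dicts, weights):
    """Illustrative sketch: average corresponding tensors from several state dicts.

    With equal weights of 1.0 for three models, each merged tensor is simply
    the mean of the three source tensors. This is not Mergekit's code.
    """
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        # Accumulate in float32 for numerical stability, then cast back.
        merged[name] = sum(
            w * sd[name].to(torch.float32) for sd, w in zip(state_dicts, weights)
        ) / total
    return {name: t.to(torch.float16) for name, t in merged.items()}
```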

## Model Details

### Model Description
- **Models Used**:
  - Danielbrdz/Barcenas-Llama3-8b-ORPO
  - DeepMount00/Llama-3-8b-Ita
  - lightblue/suzume-llama-3-8B-multilingual
    
- **Merging Tool**: Mergekit
- **Merge Method**: Linear merge with equal weighting (1.0) for all models
- **Tokenizer Source**: Union
- **Data Type**: float16 (FP16) precision
- **License**: MIT License
- **Languages Supported**: Multilingual, including English, Italian, and potentially other languages covered by the multilingual base model

## Configuration
The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      weight: 1.0
  - model: DeepMount00/Llama-3-8b-Ita
    parameters:
      weight: 1.0
  - model: lightblue/suzume-llama-3-8B-multilingual
    parameters:
      weight: 1.0
merge_method: linear
tokenizer_source: union
dtype: float16
```
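
## Usage
The merged model can be loaded like any other Llama 3 checkpoint with the `transformers` library. The repository id below is a placeholder; substitute the actual location where these weights are hosted.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; replace with the real path to this merged model.
repo_id = "your-username/llama-3-8b-merged-linear"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # matches the FP16 precision used for the merge
    device_map="auto",          # requires the accelerate package
)

prompt = "Briefly explain what a linear model merge is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```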