---
language: en
tags:
- llama
- text-generation
- model-merging
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
---
|
|
|
# llama-3-8b-merged-linear
|
|
|
## Overview

This model is a linear merge of three Llama 3 8B fine-tunes, produced with the mergekit tool. The goal of the merge is to combine the distinct strengths of each base model, such as multilingual capability and specialized domain knowledge, into a single, more versatile language model.
|
|
|
By merging the models linearly, their expertise is combined into a unified model that performs well across a range of tasks, including text generation, multilingual understanding, and domain-specific applications.
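
Conceptually, a linear merge computes a weighted average of corresponding parameters across the input models. The sketch below illustrates the idea in plain PyTorch; it is a simplified illustration for intuition, not mergekit's actual implementation (the `state_dicts` and `weights` inputs are hypothetical):

```python
import torch

def linear_merge(state_dicts, weights):
    """Illustrative linear merge: each parameter of the merged model is the
    normalized weighted sum of the same parameter in every input model.
    A minimal sketch, not mergekit's actual code."""
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        # Accumulate in float32 for numerical stability, then cast down.
        merged[name] = sum(
            w * sd[name].to(torch.float32) for w, sd in zip(weights, state_dicts)
        ) / total
        merged[name] = merged[name].to(torch.float16)  # final dtype, as in this merge
    return merged
```

With equal weights of 1.0, as used here, this reduces to a plain average of the three models' parameters.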
|
|
|
## Model Details

### Model Description

- **Models Used**:
  - Danielbrdz/Barcenas-Llama3-8b-ORPO
  - DeepMount00/Llama-3-8b-Ita
  - lightblue/suzume-llama-3-8B-multilingual
- **Merging Tool**: mergekit
- **Merge Method**: Linear merge with equal weighting (1.0) for each model
- **Tokenizer Source**: Union of the base models' tokenizers
- **Data Type**: float16 (FP16) precision
- **License**: MIT
- **Languages Supported**: Multilingual, including English and Italian, with potentially others inherited from the multilingual base model
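
Since the merge is distributed as a standard `transformers` checkpoint, it can be loaded like any other Llama 3 model. The snippet below is a minimal usage sketch; the repo id is a placeholder, so substitute this model's actual Hub path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with this model's actual Hub path.
model_id = "your-username/llama-3-8b-merged-linear"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Translate to Italian: Good morning!"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```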
|
|
|
## Configuration

The following YAML configuration was used to produce this model:
|
|
|
```yaml
models:
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      weight: 1.0
  - model: DeepMount00/Llama-3-8b-Ita
    parameters:
      weight: 1.0
  - model: lightblue/suzume-llama-3-8B-multilingual
    parameters:
      weight: 1.0
merge_method: linear
tokenizer_source: union
dtype: float16
```
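
To reproduce the merge, mergekit can run this configuration directly: with mergekit installed (`pip install mergekit`) and the configuration saved as, say, `config.yaml`, a typical invocation is `mergekit-yaml config.yaml ./output-model-directory`.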