keitokei1994
/

swallow-3-8B-sqlcoder-2x8B-GGUF

Mixture of Experts

keitokei1994 committed on Jul 3, 2024

Commit 09c8243 · verified · 1 Parent(s): 405363a

Create README.md

Files changed (1)

README.md +49 -0 (added)

	@@ -0,0 +1,49 @@

+---
+license: llama3
+language:
+- ja
+- en
+tags:
+- moe
+- japanese
+- sql
+---
+### Model Description (translated from the Japanese; the author's English version is below.)
+This model is a GGUF-quantized version of a Mixture of Experts (MoE) language model created with the MergeKit tool.
+The unquantized version is available [here](https://huggingface.co/keitokei1994/swallow-3-8B-sqlcoder-2x8B).
+### Model Details
+- **Model Name**: swallow-3-8B-sqlcoder-2x8B-GGUF
+- **Model Architecture**: Mixture of Experts (MoE)
+- **Base Models**:
+  - [tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1)
+  - [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
+- **Merge Tool**: MergeKit
+This MoE model combines the Japanese language ability of Llama-3-Swallow-8B-Instruct-v0.1 with the SQL generation ability of Llama-3-sqlcoder-8b, aiming for a more capable and versatile language model.
+#### Features
+- Supports both Japanese and English
+- Strong Japanese processing ability from Llama-3-Swallow-8B-Instruct-v0.1
+- Advanced SQL generation and processing ability from Llama-3-sqlcoder-8b
+#### System Requirements
+The Q4_K_M quantized model can be fully loaded on an RTX 3060 12GB.
+The author created the model on WSL2 and Google Colaboratory Pro, then verified that it runs with llama.cpp and LM Studio.
+---
+### Model Description
+This model is a GGUF-quantized version of a Mixture of Experts (MoE) language model created using the MergeKit tool.
+The unquantized version can be found [here](https://huggingface.co/keitokei1994/swallow-3-8B-sqlcoder-2x8B).
+### Model Details
+- **Model Name**: swallow-3-8B-sqlcoder-2x8B-GGUF
+- **Model Architecture**: Mixture of Experts (MoE)
+- **Base Models**:
+  - [tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1)
+  - [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
+- **Merge Tool**: MergeKit
+This MoE model combines the Japanese language capabilities of Llama-3-Swallow-8B-Instruct-v0.1 with the SQL generation abilities of Llama-3-sqlcoder-8b, with the aim of being a more capable and versatile language model than either base model alone.
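To make the "Mixture of Experts" idea concrete, here is a toy sketch of Mixtral-style top-k gating in plain Python. It is an illustration only: the card does not state how this model's router was initialised or how many experts are active per token, so `top_k`, the router weights, and the two stand-in expert functions (`japanese_expert`, `sql_expert`) are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(hidden, router_weights, experts, top_k=2):
    """Route one token's hidden vector through the top_k highest-scoring experts.

    hidden:         list[float], the token representation
    router_weights: one weight vector per expert (a linear gate, no bias)
    experts:        one callable per expert, mapping list[float] -> list[float]
    """
    # The router logit for each expert is a dot product with the hidden state.
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in router_weights]
    # Keep the top_k experts and renormalise their scores with softmax.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])
    # The output is the gate-weighted sum of the chosen experts' outputs.
    output = [0.0] * len(hidden)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](hidden)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, dict(zip(chosen, gates))


# Hypothetical stand-ins for the two expert feed-forward blocks.
def japanese_expert(x):
    return [2.0 * v for v in x]


def sql_expert(x):
    return [-1.0 * v for v in x]


if __name__ == "__main__":
    hidden = [1.0, 0.0]
    router = [[0.0, 1.0], [3.0, 0.0]]  # toy gate that favours the SQL expert here
    out, gates = moe_layer(hidden, router, [japanese_expert, sql_expert])
    print(out, gates)
```

In a real model the experts are full feed-forward blocks and the gate runs once per token per layer; the sketch keeps only the routing arithmetic.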
+#### Features
+- Supports both Japanese and English
+- Excellent Japanese processing capabilities from Llama-3-Swallow-8B-Instruct-v0.1
+- Advanced SQL generation and processing capabilities from Llama-3-sqlcoder-8b
+#### System Requirements
+The Q4_K_M quantized model can be fully loaded on an RTX 3060 12GB.
+The author created the model using WSL2 and Google Colaboratory Pro, and tested it with llama.cpp and LM Studio.
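The 12GB claim can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below rests on several assumptions that are not in the card: both experts use the stock Llama 3 8B layer shapes, the merge shares attention and embedding weights while keeping one feed-forward block per expert (the usual Mixtral-style layout), and Q4_K_M averages about 4.85 bits per weight. It also ignores the KV cache and runtime overhead.

```python
# Stock Llama 3 8B shapes (assumed to apply to both experts).
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
VOCAB = 128256
KV_DIM = 1024  # 8 key/value heads x 128 head dim (grouped-query attention)


def dense_params():
    """Parameter count of one dense Llama 3 8B model."""
    embed = 2 * VOCAB * HIDDEN                          # input embeddings + output head
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM    # q, o + k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                     # gate, up, down projections
    norms = 2 * HIDDEN                                  # two RMSNorm weights per layer
    return embed + LAYERS * (attn + mlp + norms) + HIDDEN  # + final norm


def moe_params(num_experts=2):
    """Assumed MoE layout: shared attention/embeddings, one MLP per expert, linear router."""
    mlp = 3 * HIDDEN * INTERMEDIATE
    router = HIDDEN * num_experts
    return dense_params() + LAYERS * ((num_experts - 1) * mlp + router)


def quantized_size_gb(params, bits_per_weight=4.85):
    """Approximate weight size in GB (10^9 bytes); 4.85 bpw is an assumed Q4_K_M average."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    total = moe_params(2)
    print(f"parameters: {total:,}")
    print(f"approx. Q4_K_M weights: {quantized_size_gb(total):.1f} GB")
```

Under these assumptions the merged model has roughly 13.7B parameters and a little over 8 GB of quantized weights, which is consistent with the statement that it loads fully on a 12GB card with some room left for context.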