QuantFactory
/

llama-3-chinese-8b-instruct-v2-GGUF

Text Generation

Inference Endpoints

Model card Files Files and versions Community

QuantFactory/llama-3-chinese-8b-instruct-v2-GGUF

This is quantized version of hfl/llama-3-chinese-8b-instruct-v2 created using llama.cpp

Model Description

This repository contains Llama-3-Chinese-8B-Instruct-v2, which is directly tuned with 5M instruction data on Meta-Llama-3-8B-Instruct.

Note: This is an instruction (chat) model, which can be used for conversation, QA, etc.

Further details (performance, usage, etc.) should refer to GitHub project page: https://github.com/ymcui/Chinese-LLaMA-Alpaca-3

Downloads last month: 59

GGUF

Model size

8.03B params

Architecture

llama

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for QuantFactory/llama-3-chinese-8b-instruct-v2-GGUF

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Finetuned

hfl/llama-3-chinese-8b-instruct-v2

Quantized

(4)

this model