metadata

language:
  - ko
  - en
license: cc-by-sa-4.0
model-index:
  - name: kiqu-70b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 72.1
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 87.94
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 74.93
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 63.48
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 84.85
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 68.46
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/kiqu-70b
          name: Open LLM Leaderboard

kiqu-70b (Arena Leaderboard)

kiqu-70b is an SFT+DPO-trained model based on Miqu-70B-Alpaca-DPO, fine-tuned on Korean datasets.

Because the base model, miqu-1-70b, is a leaked early version of Mistral-Medium, commercial use of this model is at your own risk.

Apart from that, the model itself is released under cc-by-sa-4.0.

Model Details

Base Model
miqu-1-70b (Early Mistral-Medium)

Instruction format

The model follows the Mistral instruction format. Providing few-shot examples to the model is highly recommended.

[INST] {instruction}
[/INST] {output}

Multi-shot

[INST] {instruction}
[/INST] {output}

[INST] {instruction}
[/INST] {output}

[INST] {instruction}
[/INST] {output}
.
.
.

Recommended Template - 1-shot with system prompt

너는 kiqu-70B라는 한국어에 특화된 언어모델이야. 깔끔하고 자연스럽게 대답해줘!
[INST] 안녕?
[/INST] 안녕하세요! 무엇을 도와드릴까요? 질문이나 궁금한 점이 있다면 언제든지 말씀해주세요.

[INST] {instruction}
[/INST]

A trailing space after [/INST] can significantly affect the model's performance. When running inference, it is recommended not to include a trailing space in the chat template.
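As a minimal sketch of the templates above, the helper below assembles an optional system prompt, any number of few-shot pairs, and the final instruction, and ends the string exactly at [/INST] with no trailing space. The function name `build_prompt` and its parameters are hypothetical and not part of any library. The line break before [/INST] and the blank line between shots are copied from the layout shown in this card, which is an assumption about the exact whitespace the model expects.

```python
from typing import List, Optional, Tuple


def build_prompt(
    instruction: str,
    shots: Optional[List[Tuple[str, str]]] = None,
    system: Optional[str] = None,
) -> str:
    """Assemble a prompt in the layout shown in this card (hypothetical helper).

    `shots` is a list of (instruction, output) example pairs.
    """
    # Each few-shot pair becomes one "[INST] ...\n[/INST] ..." block.
    blocks = [
        f"[INST] {question.strip()}\n[/INST] {answer.strip()}"
        for question, answer in (shots or [])
    ]
    # The final turn stops right at [/INST]: no trailing space or newline.
    blocks.append(f"[INST] {instruction.strip()}\n[/INST]")

    body = "\n\n".join(blocks)
    if system:
        # The system prompt sits on its own line directly above the first turn.
        return f"{system.strip()}\n{body}"
    return body


prompt = build_prompt(
    "한국의 수도는 어디야?",
    shots=[("안녕?", "안녕하세요! 무엇을 도와드릴까요?")],
    system="너는 kiqu-70B라는 한국어에 특화된 언어모델이야. 깔끔하고 자연스럽게 대답해줘!",
)
print(prompt)
```

If a tokenizer's built-in chat template is used instead, it may be worth printing the rendered string and checking that it also ends at [/INST] without extra whitespace.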

Model Benchmark

TBD

Author's Message

This model's training was sponsored by no one; it was made possible by support from people around the world.

Support Me

Discord Server

Contact Me on Discord - is.maywell

Follow me on Twitter - https://twitter.com/stablefluffy

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	75.29
AI2 Reasoning Challenge (25-Shot)	72.10
HellaSwag (10-Shot)	87.94
MMLU (5-Shot)	74.93
TruthfulQA (0-shot)	63.48
Winogrande (5-shot)	84.85
GSM8k (5-shot)	68.46
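The Avg. row is consistent with an unweighted mean of the six benchmark scores. The short check below reproduces it from the values in the table; treating the average as a simple mean is an assumption that happens to match the reported figure.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.10,
    "HellaSwag (10-Shot)": 87.94,
    "MMLU (5-Shot)": 74.93,
    "TruthfulQA (0-shot)": 63.48,
    "Winogrande (5-shot)": 84.85,
    "GSM8k (5-shot)": 68.46,
}

# Unweighted mean across the six benchmarks (assumed averaging method).
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 75.29"
```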