---
language:
  - en
license: apache-2.0
datasets:
  - Intel/orca_dpo_pairs
model-index:
  - name: agiin-13.6B-v0.1
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 69.45
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 86.64
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 61.15
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 67.97
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.69
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 46.47
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1
          name: Open LLM Leaderboard
---

# Model Card for mncai/agiin-13.6B-v0.1

## Introduction of MindsAndCompany

https://mnc.ai/

We create a variety of AI models and develop solutions that can be applied to businesses. In generative AI, we are developing products such as a Code Assistant, a TOD (task-oriented dialogue) Chatbot, and LLMOps tooling, and we are working toward Enterprise AGI (Artificial General Intelligence).

## Model Summary

This model is built on the Mistral architecture and was inspired by neural connection technology and rehabilitation therapy. The result is a new model architecture that does not require pretraining from scratch: a single H100 GPU for 7 hours is enough to train it.

## Data

- Intel/orca_dpo_pairs, used for DPO (see the loading sketch below)
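A minimal sketch of preparing this dataset for preference training, assuming the Hugging Face `datasets` library. Intel/orca_dpo_pairs ships `system`/`question`/`chosen`/`rejected` columns; the prompt construction below (system text prepended to the question) is an illustrative choice, not a documented recipe for this model.

```python
# Hedged sketch: load Intel/orca_dpo_pairs and fold its columns into the
# prompt/chosen/rejected schema that DPO trainers typically expect.
from datasets import load_dataset

ds = load_dataset("Intel/orca_dpo_pairs", split="train")

def to_dpo_format(row):
    # Prepend the system message to the question; keep the preference pair.
    return {
        "prompt": (row["system"] + "\n\n" + row["question"]).strip(),
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

ds = ds.map(to_dpo_format, remove_columns=["system", "question"])
```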

## Surgery and Training

We stack Mistral decoder layers to a total depth of 62, then fine-tune the up-scaled model with DPO. A hedged sketch of both steps follows below.
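The card does not spell out the stacking scheme or the DPO hyperparameters, so the following is only a minimal sketch: SOLAR-style depth up-scaling of Mistral-7B to 62 decoder layers (duplicating two overlapping 31-layer spans), followed by a DPO pass with `trl`. The overlap pattern, `beta`, batch size, and the `processing_class` argument (named `tokenizer` in older `trl` releases) are all assumptions, not the recipe actually used.

```python
# Hedged sketch of the "surgery" (layer stacking) plus a DPO pass.
# The duplication pattern and all hyperparameters are illustrative.
import copy
import torch
from torch import nn
from datasets import load_dataset
from transformers import AutoTokenizer, MistralForCausalLM
from trl import DPOConfig, DPOTrainer

base = MistralForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16
)
layers = base.model.layers  # Mistral-7B ships 32 decoder layers

# Duplicate two overlapping 31-layer spans: 31 + 31 = 62 layers total.
stacked = (
    [copy.deepcopy(l) for l in layers[:31]]
    + [copy.deepcopy(l) for l in layers[1:]]
)
base.model.layers = nn.ModuleList(stacked)
base.config.num_hidden_layers = len(stacked)  # 62
for i, layer in enumerate(base.model.layers):
    layer.self_attn.layer_idx = i  # keep KV-cache indexing consistent

# DPO on the dataset prepared in the Data sketch above.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token
ds = load_dataset("Intel/orca_dpo_pairs", split="train").map(
    lambda r: {"prompt": r["system"] + "\n\n" + r["question"]},
    remove_columns=["system", "question"],
)
trainer = DPOTrainer(
    model=base,
    args=DPOConfig(output_dir="agiin-dpo", beta=0.1,
                   per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=ds,
    processing_class=tokenizer,  # `tokenizer=` in older trl releases
)
trainer.train()
```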

## How to Use

```python
import transformers
from transformers import AutoTokenizer

hf_model = "mncai/agiin-13.6B-v0.1"

message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "If two spheres have diameters of 1 and 2, "
                                "how many times larger is the volume of one "
                                "sphere than the other? Please explain as well."},
]

tokenizer = AutoTokenizer.from_pretrained(hf_model)
# Render the chat messages into the model's prompt format.
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

pipeline = transformers.pipeline(
    "text-generation",
    model=hf_model,
    tokenizer=tokenizer,
)

sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=512,
)
print(sequences[0]["generated_text"])
```
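For reference, the example prompt has a closed-form answer: a sphere's volume is V = πd³/6, so going from diameter 1 to diameter 2 multiplies the volume by 2³ = 8.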

## Contact

If you have any questions, please open an issue or contact us at [email protected].

## Open LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mncai/agiin-13.6B-v0.1).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 68.40 |
| AI2 Reasoning Challenge (25-Shot) | 69.45 |
| HellaSwag (10-Shot)               | 86.64 |
| MMLU (5-Shot)                     | 61.15 |
| TruthfulQA (0-shot)               | 67.97 |
| Winogrande (5-shot)               | 78.69 |
| GSM8k (5-shot)                    | 46.47 |