---
license: mit
language:
  - en
---

# Introduction

MoMo-70B is trained via Supervised Fine-Tuning (SFT) using LoRA, with the QWEN-72B model as its base model.
This is a Direct Preference Optimization (DPO) version, trained with v1.8.4 as its base model and with several hyperparameter optimizations.
Note that we did not use any form of weight merging.
For leaderboard submission, the trained weights are realigned for compatibility with llama.
MoMo-70B is trained using Moreh's MoAI platform, which simplifies the training of large-scale models, and AMD's MI250 GPUs.
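As a rough illustration of the DPO objective mentioned above (not Moreh's actual training code), the preference loss can be computed from the policy and reference log-probabilities of the chosen and rejected responses; the `beta` value and the dummy inputs below are illustrative assumptions.

```python
# Minimal sketch of the DPO loss; not the authors' actual training pipeline.
# policy_* / ref_* are summed log-probabilities of the chosen and rejected
# responses under the trained model and the frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-ratio of policy vs. reference for each response
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Standard DPO objective: -log sigmoid(reward margin)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Illustrative call with dummy log-probabilities
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
```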

# Details

## Used Libraries

- torch
- peft (for LoRA; see the sketch below)
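Given that the card lists only torch and peft, a LoRA setup with peft might look roughly like the following sketch; the base model identifier, rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for MoMo-70B.

```python
# Hypothetical LoRA configuration with peft; hyperparameters are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Assumed base checkpoint; QWEN models require trust_remote_code
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-72B", trust_remote_code=True)

lora_config = LoraConfig(
    r=16,                                  # assumed LoRA rank
    lora_alpha=32,                         # assumed scaling factor
    lora_dropout=0.05,                     # assumed dropout
    target_modules=["c_attn", "c_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```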

## Used Datasets

| Model | ARC | MMLU | TruthfulQA | GSM8K |
|---|---|---|---|---|
| V1.8.6 (result < 0.1, %) | TBU | TBU | 0.73 | TBU |

## Used Environments

- AMD MI250 GPU
- Moreh MoAI platform

# How to use

```python
# pip install transformers==4.35.2
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("moreh/MoMo-70B-LoRA-V1.8.6")
model = AutoModelForCausalLM.from_pretrained(
    "moreh/MoMo-70B-LoRA-V1.8.6"
)
```
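A simple generation call might then look like the following; the prompt and decoding parameters are only examples, not recommended settings.

```python
# Example generation; prompt and decoding parameters are illustrative.
inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```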