metadata

base_model: alpindale/Mistral-7B-v0.2-hf
tags:
  - axolotl
  - generated_from_trainer
model-index:
  - name: janus-7b
    results: []
license: apache-2.0
language:
  - en

Links for Reference

Homepage: In Progress
Repository: https://github.com/kaistAI/Janus
Paper:
Point of Contact: [email protected]

TL;DR

Janus is a language model trained with Mistral-7B-v0.2 as its base model on Multifaceted Collection, a preference dataset containing 192k unique system messages for aligning LLMs to diverse human preferences. Janus excels at generating personalized responses that cater to varied human preferences, and it also produces responses that are generally preferred for being helpful and harmless.

Model Details

Model Description

Model type: Language model
Language(s) (NLP): English
License: Apache 2.0
Related Models: Janus-66k-7B, Janus-DPO-7B, Janus-ORPO-7B, Janus-RM-7B
Training Data: Multifaceted Collection
Resources for more information:
- Research paper
- GitHub Repo

Usage

Janus is trained to generalize across diverse system messages, so users can steer the model's responses by supplying the desired system message. The input prompt format is as follows:

[INST]{system_message}\n{instruction}[/INST]

An example of inference code that applies this format is as follows:
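The prompt format above can be applied as in the following minimal sketch. It assumes the checkpoint is available on the Hugging Face Hub under the id `kaist-ai/janus-7b` (an assumption, not confirmed by this card), that `transformers`, `torch`, and `accelerate` are installed, and that the helper names `build_prompt` and `generate` are purely illustrative. The generation settings are placeholders rather than recommended values.

```python
# Sketch only: the Hub id below is an assumption, not confirmed by this card.
MODEL_ID = "kaist-ai/janus-7b"


def build_prompt(system_message: str, instruction: str) -> str:
    # Fills the documented template: [INST]{system_message}\n{instruction}[/INST]
    return f"[INST]{system_message}\n{instruction}[/INST]"


def generate(system_message: str, instruction: str, max_new_tokens: int = 512) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" relies on the accelerate package being installed.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    prompt = build_prompt(system_message, instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    system = "You are a concise assistant who explains ideas to beginners."
    question = "What is gradient accumulation?"
    print(build_prompt(system, question))
    # Uncomment to run the model (downloads several GB of weights):
    # print(generate(system, question))
```

Keeping the prompt construction in its own function makes it easy to check that the system message and the instruction are joined by a single newline, as the template requires, before any weights are loaded.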

Training Details

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 2
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 4
total_train_batch_size: 32
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 4
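
The totals in the list above follow from the per-device values, and the short sketch below reproduces that arithmetic. It also illustrates one common form of a cosine schedule with warmup (linear warmup to the peak rate, then half-cosine decay to zero). That schedule shape is an assumption about how the scheduler behaves, and `total_steps` is a hypothetical parameter, since the number of optimizer steps is not stated here.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 2
EVAL_BATCH_SIZE = 2
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 10

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES


def lr_at(step: int, total_steps: int,
          peak_lr: float = LEARNING_RATE,
          warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at an optimizer step (assumed schedule shape)."""
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from peak_lr down to 0.
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(total_train_batch_size, total_eval_batch_size)
    hypothetical_total_steps = 1000  # illustrative only
    for s in (0, 5, 10, 500, 1000):
        print(s, lr_at(s, hypothetical_total_steps))
```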

Framework versions

Transformers 4.40.0.dev0
PyTorch 2.2.2
Datasets 2.18.0
Tokenizers 0.15.0

Citation

If you find this model helpful, please consider citing our paper!

BibTeX: