Mixtral model fine-tuning and experiments
Apoorv Omar
Ap98
AI & ML interests
LLM, NLP, Transformers
Organizations
None yet
Collections
3
Paper for LLM training
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
  Paper • 2401.01967 • Published
- Secrets of RLHF in Large Language Models Part I: PPO
  Paper • 2307.04964 • Published • 28
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 121
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
  Paper • 2404.05961 • Published • 64
Models
3
Datasets
None public yet