
Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B

Description

This repo contains GGUF format model files for Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B.

Files Provided

| Name | Quant | Bits | File Size | Remark |
| ---- | ----- | ---- | --------- | ------ |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.IQ3_XXS.gguf | IQ3_XXS | 3 | 5.30 GB | 3.06 bpw quantization |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.IQ3_S.gguf | IQ3_S | 3 | 5.60 GB | 3.44 bpw quantization |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.IQ3_M.gguf | IQ3_M | 3 | 5.74 GB | 3.66 bpw quantization mix |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q4_0.gguf | Q4_0 | 4 | 7.28 GB | 3.56G, +0.2166 ppl |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.IQ4_NL.gguf | IQ4_NL | 4 | 7.36 GB | 4.25 bpw non-linear quantization |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q4_K_M.gguf | Q4_K_M | 4 | 7.78 GB | 3.80G, +0.0532 ppl |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q5_K_M.gguf | Q5_K_M | 5 | 9.13 GB | 4.45G, +0.0122 ppl |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q6_K.gguf | Q6_K | 6 | 10.57 GB | 5.15G, +0.0008 ppl |
| truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q8_0.gguf | Q8_0 | 8 | 13.69 GB | 6.70G, +0.0004 ppl |
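
As a quick reference, here is a minimal sketch of downloading one of the files above and running it with the llama-cpp-python bindings. The repo id is a placeholder, and the context size and sampling settings are illustrative, not recommendations.

```python
# Minimal sketch: fetch a GGUF file from the Hub and run it with llama-cpp-python.
# The repo_id below is a placeholder for this GGUF repo; replace it with the
# actual repository id. Settings are illustrative only.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="<this-gguf-repo-id>",  # placeholder
    filename="truthful_dpo_tomgrc_fusionnet_7bx2_moe_13b.Q4_K_M.gguf",
)

llm = Llama(
    model_path=gguf_path,
    n_ctx=4096,       # context window; the model supports up to 32768
    n_gpu_layers=-1,  # offload all layers if a GPU-enabled build is installed
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain DPO in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```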

Parameters

| path | type | architecture | rope_theta | sliding_window | max_position_embeddings |
| ---- | ---- | ------------ | ---------- | -------------- | ----------------------- |
| yunconglong/Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B | mixtral | MixtralForCausalLM | 10000.0 | null | 32768 |
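
These values can be cross-checked against the source model's configuration. A small sketch using transformers; the field names follow the standard Mixtral config and should match the table above.

```python
# Sketch: read the same parameters from the source model's config.
# Expected values: model_type "mixtral", architectures ["MixtralForCausalLM"],
# rope_theta 10000.0, sliding_window None, max_position_embeddings 32768.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("yunconglong/Truthful_DPO_TomGrc_FusionNet_7Bx2_MoE_13B")
print(cfg.model_type)
print(cfg.architectures)
print(cfg.rope_theta, cfg.sliding_window, cfg.max_position_embeddings)
```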

Benchmarks

Original Model Card

• DPO Trainer with dataset jondurbin/truthy-dpo-v0.1 to improve TomGrc/FusionNet_7Bx2_MoE_14B

DPO Trainer: TRL supports the DPO Trainer for training language models from preference data, as described in the paper Direct Preference Optimization: Your Language Model is Secretly a Reward Model by Rafailov et al., 2023.
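
For orientation, a rough sketch of a DPO run of the kind described above, using trl. The DPOTrainer/DPOConfig API differs between trl versions, and the hyperparameters below are placeholders, not the settings used to train this model.

```python
# Rough sketch of a DPO run like the one described above, using trl.
# API details vary across trl versions; hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TomGrc/FusionNet_7Bx2_MoE_14B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# truthy-dpo-v0.1 provides prompt/chosen/rejected preference pairs;
# drop any extra columns so only the preference fields remain.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")
dataset = dataset.remove_columns(
    [c for c in dataset.column_names if c not in ("prompt", "chosen", "rejected")]
)

args = DPOConfig(
    output_dir="truthful-dpo",
    beta=0.1,                      # strength of the DPO preference loss
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,    # older trl versions take tokenizer= instead
)
trainer.train()
```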
