Still Failing Merge Attempts

#1
by DataSoul - opened

For the DeepSeek-R1-Distill-Qwen series of models, several merge methods have been tried, none with satisfactory results. This model is among the best performers, but it still makes mistakes near the end when reciting famous ancient texts, which suggests that the merge has damaged the model to some degree.
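
One rough way to quantify "mistakes at the end" is to measure how far into a reference text the model's recitation stays exact. The sketch below is only an illustration, not the procedure used for the observation above: the helper name `first_divergence` is hypothetical, and it assumes you already have the reference text and the model's output as plain strings.

```python
from typing import Optional


def first_divergence(reference: str, recited: str) -> Optional[float]:
    """Return the relative position (0.0 to 1.0) in `reference` at which
    `recited` first differs, or None if the two strings are identical.

    Hypothetical helper for illustration only.
    """
    if not reference:
        raise ValueError("reference must be non-empty")

    # Walk both strings in parallel and stop at the first mismatched character.
    for i, (ref_ch, out_ch) in enumerate(zip(reference, recited)):
        if ref_ch != out_ch:
            return i / len(reference)

    # No mismatch in the shared prefix: if the lengths differ, the recitation
    # was truncated or ran past the end, so it diverges where the shorter ends.
    if len(recited) != len(reference):
        return min(len(reference), len(recited)) / len(reference)

    return None
```

A value close to 1.0 would match the pattern described here, where the recitation only goes wrong near the end. Exact character matching is strict, so normalizing punctuation and whitespace before comparing is probably worthwhile.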
