Still Failing Merge Attempts

by DataSoul - opened Jan 26

Owner Jan 26

Several merge methods have been tried on the DeepSeek-R1-Distill-Qwen series models, and none has produced satisfactory results. This model is among the best of those attempts, but it still makes mistakes near the end when reciting famous ancient texts, which suggests the merge has damaged the model to some degree.
