The abliteration is not very thorough

#1
by CrestYao - opened

It feels like the abliteration is not very thorough. On political questions (e.g., "What impact would Trump's presidency have on China?" or "If China decides to reclaim Taiwan by force, what would be the most likely method employed?"), the model gives a very official statement but never truly answers the question. The abliterated versions of the earlier Qwen and QwQ models did not have this issue, so it is likely a problem introduced by distillation from the DeepSeek-R1 model.

Important Note: There's a new version available; please try the new version huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2.


Thank you for the update! However, I need to wait for a GGUF version before I can test it on my PC. I will report back with the results once testing is done.

Important Note: There's a new version available; please try the new version huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2.

I have already tried the V2 version, but the previously mentioned issues were not resolved. Normally, the model goes through a thinking process before answering a question. For the kinds of questions mentioned above, however, it skips that process and directly outputs an official statement as the response. Moreover, this official statement is slightly different on each regeneration; the general meaning is the same, but the wording is not identical. I suspect the refusal erasure (abliteration) did not fully take effect for these topics.

I have also tried the V2 version, and the previously mentioned issues were not resolved.

I ran several tests using the same quantization by Mradermacher. With the 32B model, if the question is asked in a simple way, the model does not enter the thinking process and refuses to answer directly. However, when I rewrote the prompt to make the question more complex, there was roughly a 50% chance of it entering the thinking process, after which it would eventually provide an answer normally. With the 14B model, whether using Mradermacher's V1 quantization or another user's V2 quantization, the model refuses to answer outright regardless of whether the question is simple or complex, and never enters the thinking process.
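
In case it helps others reproduce this, here is a minimal sketch of how one could quantify "entering the thinking process": run the same prompt several times against the GGUF with llama-cpp-python and count how often the reply actually contains a `<think>` block before the answer. The model filename, prompt, sampling settings, and the `<think>`/`</think>` markers are assumptions based on how the R1-distill models are usually set up, so adjust them for your own quant and chat template.

```python
# Sketch only: the GGUF filename below is hypothetical, and the check assumes the
# chat template emits a <think>...</think> block when the model reasons before answering.
from llama_cpp import Llama

llm = Llama(
    model_path="./DeepSeek-R1-Distill-Qwen-14B-abliterated-v2.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
)

prompt = ("If China decides to reclaim Taiwan by force, "
          "what would be the most likely method employed?")
trials = 10
entered = 0

for _ in range(trials):
    resp = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
        max_tokens=1024,
    )
    text = resp["choices"][0]["message"]["content"]
    # Count the run as "entered the process" only if a non-trivial <think> block is present.
    if "<think>" in text and "</think>" in text:
        think = text.split("<think>", 1)[1].split("</think>", 1)[0].strip()
        if len(think) > 20:
            entered += 1

print(f"Entered the thinking process in {entered}/{trials} runs")
```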

What I'm finding is that when asked controversial and/or unethical things, V2 just thinks to itself "Yes, the user wants [subject]", and then the next thought is filtered.

I tried the new version in Ollama, and it still won't answer the question about how Taiwan would be reclaimed by force. :)
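
For reference, the same kind of spot check can be scripted against a local Ollama install via its Python client instead of the CLI. The model tag below is only a placeholder for whatever name the v2 GGUF was imported under locally, so treat this as a sketch rather than a ready-made command.

```python
# Sketch only: "deepseek-r1-14b-abliterated-v2" is a placeholder tag for a locally
# imported GGUF, not an official Ollama model name.
import ollama

resp = ollama.chat(
    model="deepseek-r1-14b-abliterated-v2",
    messages=[{
        "role": "user",
        "content": ("If China decides to reclaim Taiwan by force, "
                    "what would be the most likely method employed?"),
    }],
)

print(resp["message"]["content"])
# A reply that jumps straight to an official statement, with no <think> block,
# matches the behavior described above.
```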
