Error converting deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B to ONNX
Conversion failed:

Sliding Window Attention is enabled but not implemented for sdpa; unexpected results may be encountered.
/usr/local/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py:285: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  elif sliding_window is None or key_value_length < sliding_window:
/usr/local/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py:750: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.shape[-1] > target_length:
-[x] values not close enough, max diff: 0.008636951446533203 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0016372203826904297 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0005313754081726074 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0011655092239379883 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0007362663745880127 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0018906593322753906 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0013477802276611328 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0017404556274414062 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0013149380683898926 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0016482025384902954 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001369476318359375 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0021108388900756836 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0009351372718811035 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001401066780090332 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001059412956237793 (atol: 1e-05)
-[x] values not close enough, max diff: 0.002683401107788086 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0009528994560241699 (atol: 1e-05)
-[x] values not close enough, max diff: 0.002058863639831543 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0008495301008224487 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0017757415771484375 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0011088848114013672 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0018005967140197754 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0009817183017730713 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0024039745330810547 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0013313889503479004 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001611948013305664 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0011759400367736816 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0016410350799560547 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0020647048950195312 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0022127628326416016 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0017508268356323242 (atol: 1e-05)
-[x] values not close enough, max diff: 0.002321779727935791 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0022073984146118164 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0013532638549804688 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0018004179000854492 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0014026761054992676 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001420140266418457 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0015635490417480469 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0017262101173400879 (atol: 1e-05)
-[x] values not close enough, max diff: 0.002414703369140625 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0021743178367614746 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0023407936096191406 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0021066665649414062 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0021425187587738037 (atol: 1e-05)
-[x] values not close enough, max diff: 0.003965616226196289 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0028635263442993164 (atol: 1e-05)
-[x] values not close enough, max diff: 0.004308462142944336 (atol: 1e-05)
-[x] values not close enough, max diff: 0.001798391342163086 (atol: 1e-05)
-[x] values not close enough, max diff: 0.006231784820556641 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0026568174362182617 (atol: 1e-05)
-[x] values not close enough, max diff: 0.007318735122680664 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0018971562385559082 (atol: 1e-05)
-[x] values not close enough, max diff: 0.008878231048583984 (atol: 1e-05)
-[x] values not close enough, max diff: 0.0023338794708251953 (atol: 1e-05)
-[x] values not close enough, max diff: 0.009150505065917969 (atol: 1e-05)

The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
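
For readers unfamiliar with the "values not close enough" lines, here is a minimal sketch of what such a check amounts to, assuming the validator simply compares the largest element-wise absolute difference against `atol`. The helper names `max_abs_diff` and `within_tolerance` and the sample values are hypothetical, not taken from the exporter's code, and the real check may also apply a relative tolerance.

```python
def max_abs_diff(reference, exported):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(exported):
        raise ValueError("outputs must have the same length")
    return max(abs(r - e) for r, e in zip(reference, exported))


def within_tolerance(reference, exported, atol=1e-5):
    """True when every element differs by at most atol (absolute tolerance only)."""
    return max_abs_diff(reference, exported) <= atol


# Made-up outputs: the last element differs by roughly 2e-3,
# the same order of magnitude as most diffs in the log.
reference = [0.5, -1.25, 2.0]
exported = [0.5, -1.25, 2.002]

print("max diff:", max_abs_diff(reference, exported))
print("passes at atol=1e-05:", within_tolerance(reference, exported))
print("passes at atol=1e-02:", within_tolerance(reference, exported, atol=1e-2))
```

Under this assumption, every diff in the log (all below 1e-02) fails the 1e-05 threshold but would pass a looser one; whether a looser tolerance is acceptable depends on how the exported model will be used.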