CPU support via ONNX, as with Phi-3?

#1
by alierenak - opened

As far as I know, this model currently doesn't support running on a CPU due to flash_attn constraints. Are there any plans to release an ONNX version, similar to what was done with Phi-3?

The 3.5 versions will ship too: https://huggingface.co/onnx-community

At present, there is a Phi-3.5 mini instruct ONNX model available: https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-web/tree/main

The second link leads to a 404 ('-onnx' is missing).

This link does work:
https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-onnx-web/tree/main

How do I run the ONNX model?

nguyenbh changed discussion status to closed
