Use HF if CUDA available and not persistent model load
Commit 476f08d (unverified), hans00 committed 8 days ago