Diffusers supports 🧨 bitsandbytes as the official quantization backend, but using others like torchao is already very simple. GPU memory usage can be reduced further by offloading pipeline components to the CPU with enable_model_cpu_offload().
from optimum.onnxruntime import ORTModelForSequenceClassification
# Load the model from the hub and export it to the ONNX format
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)