EmbeddedLLM/Phi-3-mini-128k-instruct-onnx-directml

Performance Metrics

DirectML

We measured DirectML performance on an AMD Ryzen 9 7940HS with Radeon 78

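| Prompt Length (tokens) | Generation Length (tokens) | Average Throughput (tps) |
|---|---|---|
| 128 | 128 | - |
| 128 | 256 | - |
| 128 | 512 | - |
| 128 | 1024 | - |
| 256 | 128 | - |
| 256 | 256 | - |
| 256 | 512 | - |
| 256 | 1024 | - |
| 512 | 128 | - |
| 512 | 256 | - |
| 512 | 512 | - |
| 512 | 1024 | - |
| 1024 | 128 | - |
| 1024 | 256 | - |
| 1024 | 512 | - |
| 1024 | 1024 | - |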
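
The table does not say how "Average Throughput (tps)" is defined, so the following is only a sketch under one common assumption: throughput is the number of generated tokens divided by wall-clock generation time, averaged over a few repeated runs. The names `tokens_per_second`, `average_throughput`, and the `generate` callable are hypothetical and are not part of this model card or of any DirectML or ONNX Runtime API; `generate` stands in for whatever call actually runs the model and returns the number of tokens it produced.

```python
import time
from statistics import mean
from typing import Callable, List

# Prompt and generation lengths taken from the table above.
PROMPT_LENGTHS = [128, 256, 512, 1024]
GENERATION_LENGTHS = [128, 256, 512, 1024]


def tokens_per_second(generated_tokens: int, elapsed_seconds: float) -> float:
    """Throughput of a single run (assumed definition: tokens / seconds)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return generated_tokens / elapsed_seconds


def average_throughput(
    generate: Callable[[int, int], int],
    prompt_length: int,
    generation_length: int,
    repeats: int = 3,
    timer: Callable[[], float] = time.perf_counter,
) -> float:
    """Mean tokens/second over `repeats` timed calls to `generate`.

    `generate(prompt_length, generation_length)` is a hypothetical hook that
    runs the model once and returns how many tokens were actually generated.
    """
    samples: List[float] = []
    for _ in range(repeats):
        start = timer()
        produced = generate(prompt_length, generation_length)
        elapsed = timer() - start
        samples.append(tokens_per_second(produced, elapsed))
    return mean(samples)
```

To fill one cell of the table, `average_throughput` would be called once per (prompt length, generation length) pair from the two lists, with `generate` wrapping the real inference call. Averaging per-run throughput rather than dividing total tokens by total time is a design choice; the two can differ slightly when run times vary.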