faster inference?
#1 by DoctorSlimm - opened
Great model, I'm a huge fan! Is there any way to make it faster?
Is there anything along the lines of vLLM for this model architecture?
Batching? bfloat16? ONNX? Quantization?