ibm-fms/llama3-8b-accelerator

Unexpectedly Large Memory Usage of ibm-fms/llama3-8b-accelerator in vLLM

#4 opened about 1 month ago by

llama3.1 version

#3 opened 6 months ago by

ValueError: Unsupported model type mlp_speculator using TGI server

#2 opened 9 months ago by rishabh-simpplr

shard 0 never ready when given the speculator option?

#1 opened 9 months ago by