TinyLLama TensorRT LLM Edition.

This repo contains the TensorRT LLM version of TinyLlama Model. The conversion is done to support Float16 precision on Nvidia TensorRT.

Downloads last month
9
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support