Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • Jan 16 • 71
Memory-efficient Diffusion Transformers with Quanto and Diffusers • Jul 30, 2024 • 64
Google Cloud TPUs made available to Hugging Face users • By pagezyhf and 3 others • Jul 9, 2024 • 19