gradio
transformers
torch
huggingface_hub
accelerate>=0.26.0
llama-index
llama-index-embeddings-huggingface
peft
auto-gptq
optimum
bitsandbytes