Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json 563ba38 verified dacorvo HF staff committed on Sep 27, 2024
Update inference-cache-config/Llama3.1-70b.json 7b0370b verified dacorvo HF staff committed on Sep 27, 2024
Update inference-cache-config/mistral.json 8ea3b57 verified dacorvo HF staff committed on Sep 27, 2024
Rename inference-cache-config/Llama3.1-70B.json to inference-cache-config/Llama3.1-70b.json a92cfe3 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/mixtral.json 7342c16 verified dacorvo HF staff committed on Sep 26, 2024
Rename inference-cache-config/Llama-3.1-70B.json to inference-cache-config/Llama3.1-70B.json b41e94c verified dacorvo HF staff committed on Sep 26, 2024
Delete inference-cache-config/llama3-8b.json 5b0b2de verified dacorvo HF staff committed on Sep 26, 2024
Delete inference-cache-config/llama2-7b-13b.json 219c5fd verified dacorvo HF staff committed on Sep 26, 2024
Rename inference-cache-config/llama-3.1-8B.json to inference-cache-config/llama.json 14844a0 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/mistral.json 6c4c814 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/llama3-8b.json de9e259 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/llama3-70b.json 5694f75 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/stable-diffusion.json 5272eb2 verified Jingya HF staff committed on Sep 24, 2024
Update inference-cache-config/llama-variants.json e7179a3 verified dacorvo HF staff committed on Jun 27, 2024
Rename inference-cache-config/llama2.json to inference-cache-config/llama2-7b-13b.json be28bda verified dacorvo HF staff committed on Jun 27, 2024
Rename inference-cache-config/llama3.json to inference-cache-config/llama3-8b.json 06bc70d verified dacorvo HF staff committed on Jun 27, 2024
Add more batch_size for mistral on smaller instances 545cd4d verified dacorvo HF staff committed on May 31, 2024
Use princeton-nlp/Sheared-LLaMA-1.3B as a test model 695b341 verified dacorvo HF staff committed on May 30, 2024
Rename inference-cache-config/llama.json to inference-cache-config/llama2.json f06a55a verified dacorvo HF staff committed on Apr 19, 2024
Create stable-diffusion.json (#43) 32561fe verified philschmid HF staff Jingya HF staff committed on Apr 4, 2024
Added Llama-70b batch_size 4 to inference cache 593822e verified dacorvo HF staff committed on Mar 8, 2024
Create inference-cache-config/llama.json 1960ccb verified philschmid HF staff committed on Mar 5, 2024