Rename inference-cache-config/qwen-2.5-large.json to inference-cache-config/qwen2.5-large.json 2aa52ac verified dacorvo HF staff committed on 29 days ago
Rename inference-cache-config/qwen2.5 to inference-cache-config/qwen2.5.json b9f1fde verified dacorvo HF staff committed on 29 days ago
Add qwen2.5 config for models up to 14B params 4e25bb0 verified dacorvo HF staff committed on 29 days ago
Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json 563ba38 verified dacorvo HF staff committed on Sep 27, 2024
Update inference-cache-config/Llama3.1-70b.json 7b0370b verified dacorvo HF staff committed on Sep 27, 2024
Update inference-cache-config/mistral.json 8ea3b57 verified dacorvo HF staff committed on Sep 27, 2024
Rename inference-cache-config/Llama3.1-70B.json to inference-cache-config/Llama3.1-70b.json a92cfe3 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/mixtral.json 7342c16 verified dacorvo HF staff committed on Sep 26, 2024
Rename inference-cache-config/Llama-3.1-70B.json to inference-cache-config/Llama3.1-70B.json b41e94c verified dacorvo HF staff committed on Sep 26, 2024
Delete inference-cache-config/llama3-8b.json 5b0b2de verified dacorvo HF staff committed on Sep 26, 2024
Delete inference-cache-config/llama2-7b-13b.json 219c5fd verified dacorvo HF staff committed on Sep 26, 2024
Rename inference-cache-config/llama-3.1-8B.json to inference-cache-config/llama.json 14844a0 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/mistral.json 6c4c814 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/llama3-8b.json de9e259 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/llama3-70b.json 5694f75 verified dacorvo HF staff committed on Sep 26, 2024
Update inference-cache-config/stable-diffusion.json 5272eb2 verified Jingya HF staff committed on Sep 24, 2024
Update inference-cache-config/llama-variants.json e7179a3 verified dacorvo HF staff committed on Jun 27, 2024
Rename inference-cache-config/llama2.json to inference-cache-config/llama2-7b-13b.json be28bda verified dacorvo HF staff committed on Jun 27, 2024
Rename inference-cache-config/llama3.json to inference-cache-config/llama3-8b.json 06bc70d verified dacorvo HF staff committed on Jun 27, 2024
Add more batch_size for mistral on smaller instances 545cd4d verified dacorvo HF staff committed on May 31, 2024
Use princeton-nlp/Sheared-LLaMA-1.3B as a test model 695b341 verified dacorvo HF staff committed on May 30, 2024
Rename inference-cache-config/llama.json to inference-cache-config/llama2.json f06a55a verified dacorvo HF staff committed on Apr 19, 2024
Create stable-diffusion.json (#43) 32561fe verified philschmid HF staff Jingya HF staff committed on Apr 4, 2024
Added Llama-70b batch_size 4 to inference cache 593822e verified dacorvo HF staff committed on Mar 8, 2024