alvarobartt (HF staff) committed
Commit 460b17a · verified · Parent(s): 155d8b2

Update README.md

Files changed (1):
  1. README.md +1 -1

README.md CHANGED
@@ -261,7 +261,7 @@ chat_completion = client.chat.completions.create(
 ## Quantization Reproduction
 
 > [!NOTE]
-> In order to quantize Llama 3.1 70B Instruct using AutoGPTQ, you will need to use an instance with at least enough CPU RAM to fit the whole model i.e. ~800GiB, and an NVIDIA GPU with 80GiB of VRAM to quantize it.
+> In order to quantize Llama 3.1 70B Instruct using AutoGPTQ, you will need to use an instance with at least enough CPU RAM to fit the whole model i.e. ~140GiB, and an NVIDIA GPU with 40GiB of VRAM to quantize it.
 
 In order to quantize Llama 3.1 70B Instruct with GPTQ in INT4, you need to install the following packages:
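The corrected ~140GiB figure lines up with a back-of-the-envelope estimate of the model's weight footprint: a roughly 70B-parameter model stored in 16-bit precision needs about 2 bytes per parameter. A minimal sketch of that arithmetic (the exact parameter count is an assumption, not taken from this commit):

```python
# Sanity check for the "~140GiB of CPU RAM" figure in the diff above.
# Assumption: Llama 3.1 70B Instruct has roughly 70.6e9 parameters,
# each stored in bfloat16 (2 bytes) when loaded for quantization.
params = 70.6e9
bytes_per_param = 2
gib = params * bytes_per_param / 2**30
# Weights alone come to roughly 130 GiB; loading overhead pushes the
# practical requirement toward the ~140 GiB stated in the note.
print(f"weights alone: ~{gib:.0f} GiB")
```

This also makes clear why the original ~800GiB figure was an error for a 70B model rather than a conservative margin.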