Method to get 8bit quantized model
#1
by
kitaharatomoyo
- opened
Can you tell me how you got the 8-bit quantized model from falcon-7b?
I want to create my own 8-bit quantized model from a finetuned falcon-7b model.
Hi @kitaharatomoyo , you can load it in 8-bit directly from the Hugging Face Hub as below:
from transformers import AutoModelForCausalLM

model_id = "tiiuae/falcon-7b"  # or the path to your finetuned checkpoint

# load_in_8bit requires the bitsandbytes package to be installed
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", load_in_8bit=True, trust_remote_code=True
)

MODEL_SAVE_FOLDER_NAME = "falcon-7b-8bit"
model.save_pretrained(MODEL_SAVE_FOLDER_NAME)
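Note that on newer transformers versions, passing `load_in_8bit=True` directly is deprecated in favor of a `BitsAndBytesConfig` object. A minimal sketch of that pattern, assuming a recent transformers release with bitsandbytes installed (the `quantize_and_save` helper and the save directory name are illustrative, not from the original post):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# The 8-bit option now lives in a quantization config object
# (still backed by the bitsandbytes library under the hood).
quant_config = BitsAndBytesConfig(load_in_8bit=True)

def quantize_and_save(model_id: str, save_dir: str) -> None:
    """Load a checkpoint in 8-bit and save it locally (downloads the weights)."""
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        quantization_config=quant_config,
        trust_remote_code=True,
    )
    model.save_pretrained(save_dir)

# Example call (uncomment to run; requires a GPU and disk space for the model):
# quantize_and_save("tiiuae/falcon-7b", "falcon-7b-8bit")
```

This is equivalent to the `load_in_8bit=True` kwarg above, just in the currently recommended form.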