
8bit and sharded weights

#37 opened by ThreeBlessings

Hi!

I'm updating a lab for a Data-Centric AI course, and it would be great to use this model with the load_in_8bit=True parameter and have the weights sharded into 2 GB files for easy use on free-tier Colab GPUs.

Are there plans to add these features?
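
In the meantime, I think the checkpoint can be re-sharded locally via save_pretrained's max_shard_size argument. A minimal sketch, assuming a machine with enough CPU RAM to hold the fp16 weights (the output directory name is just for illustration):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mosaicml/mpt-7b-instruct"

# Load on CPU in fp16; no GPU is needed just to re-shard.
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Save the weights split into ~2 GB shards.
model.save_pretrained("mpt-7b-instruct-sharded", max_shard_size="2GB")
tokenizer.save_pretrained("mpt-7b-instruct-sharded")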

I've used this code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mosaicml/mpt-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load in 8-bit and let Accelerate place the layers automatically.
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             low_cpu_mem_usage=True,
                                             trust_remote_code=True,
                                             load_in_8bit=True,
                                             torch_dtype=torch.float16,
                                             device_map="auto")

But it gives this error:

ValueError: MPTForCausalLM does not support `device_map='auto'` yet.

I believe this should be fixed now as of this PR: https://huggingface.co/mosaicml/mpt-7b-instruct/discussions/41
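
For anyone pinned to an older revision where the error still appears, passing an explicit device map instead of "auto" should bypass the check. A minimal sketch; note that the {"": 0} mapping places the entire model on GPU 0 rather than splitting it across devices:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mosaicml/mpt-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# An explicit dict device map avoids the device_map='auto' support
# check; {"": 0} assigns every module to GPU 0.
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             trust_remote_code=True,
                                             load_in_8bit=True,
                                             device_map={"": 0})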

abhi-mosaic changed discussion status to closed
