bengsoon committed
Commit ba9184c · verified · 1 Parent(s): 5494497

Quantization with BitsAndBytesConfig

Files changed (1): README.md (+7 -4)
README.md CHANGED
@@ -20,7 +20,7 @@ This is a fine-tuned model from [Meta/Meta-Llama-3-8B](https://huggingface.co/Me
 The motivation behind this model was to fine-tune an LLM that is capable of understanding the nuances of Drilling Operations and providing 24-hour summarizations based on the hourly activities in Daily Drilling Reports.
 
 ## How to use
-### Recommended template for DriLLM summarization:
+### Recommended template for DriLLM-Summarizer:
 ``` python
 TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
 
@@ -90,10 +90,13 @@ print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip(
 If you are facing GPU constraints, you can try to load the model with 8-bit quantization:
 
 ``` python
+from transformers import BitsAndBytesConfig
+
 pipeline = transformers.pipeline(
-    "text-generation",
-    model=model_id,
-    model_kwargs={"torch_dtype": torch.bfloat16, "load_in_8bit": True}, # Use 8-bit quantization
+    "text-generation",
+    model=model_id,
+    torch_dtype=torch.bfloat16,
+    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_8bit=True)},  # 8-bit quantization
     device_map="auto"
 )
 ```
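
For reference, a minimal end-to-end sketch of the quantized setup after this commit. It assumes `model_id` and `TEMPLATE` are defined as in the earlier parts of the README, that `bitsandbytes` and `accelerate` are installed, and that the template's fields are filled as shown in the README's earlier usage example; the `max_new_tokens` value is an arbitrary placeholder. The quantization config is passed through `model_kwargs`, since `transformers.pipeline` forwards `model_kwargs` to the underlying `from_pretrained` call.

``` python
import torch
import transformers
from transformers import BitsAndBytesConfig

# Load the fine-tuned model in 8-bit to reduce GPU memory usage.
# `model_id` refers to the model repository defined earlier in the README.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # dtype for the non-quantized modules
    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_8bit=True)},
    device_map="auto",  # requires accelerate; places layers automatically
)

# Build the prompt from the README's TEMPLATE (filled with the hourly
# activities from a Daily Drilling Report) and generate the 24-hour summary.
prompt = TEMPLATE  # in practice, fill the template's instruction/input fields
output = pipeline(prompt, max_new_tokens=512)  # 512 is a placeholder budget

# Parse out the response section, mirroring the README's own parsing line.
print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip())
```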