Pedram Rostami
committed on
Update README.md
Browse files
README.md
CHANGED
@@ -47,17 +47,17 @@ Use the code below to get started with the model.
|
|
47 |
Note that you need to install <code><b>sentencepiece</b></code> and <code><b>accelerate</b></code> libraries along with <code><b>PyTorch</b></code> and <code><b>🤗Transformers</b></code> to run this code.
|
48 |
|
49 |
```python
|
50 |
-
from transformers import
|
51 |
import torch
|
52 |
|
53 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
54 |
-
model =
|
55 |
"universitytehran/PersianMind-v1.0",
|
56 |
torch_dtype=torch.bfloat16,
|
57 |
low_cpu_mem_usage=True,
|
58 |
device_map={"": device},
|
59 |
)
|
60 |
-
tokenizer =
|
61 |
"universitytehran/PersianMind-v1.0",
|
62 |
)
|
63 |
|
@@ -84,7 +84,7 @@ To quantize the model, you should install the <code><b>bitsandbytes</b></code> l
|
|
84 |
In order to quantize the model in 8-bit (`INT8`), use the code below.
|
85 |
|
86 |
```python
|
87 |
-
model =
|
88 |
"universitytehran/PersianMind-v1.0",
|
89 |
device_map="auto",
|
90 |
low_cpu_mem_usage=True,
|
@@ -102,7 +102,7 @@ quantization_config = BitsAndBytesConfig(
|
|
102 |
bnb_4bit_use_double_quant=True,
|
103 |
bnb_4bit_quant_type="nf4",
|
104 |
)
|
105 |
-
model =
|
106 |
"universitytehran/PersianMind-v1.0",
|
107 |
quantization_config=quantization_config,
|
108 |
device_map="auto"
|
|
|
47 |
Note that you need to install <code><b>sentencepiece</b></code> and <code><b>accelerate</b></code> libraries along with <code><b>PyTorch</b></code> and <code><b>🤗Transformers</b></code> to run this code.
|
48 |
|
49 |
```python
|
50 |
+
from transformers import AutoTokenizer, AutoModelForCausalLM
|
51 |
import torch
|
52 |
|
53 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
54 |
+
model = AutoModelForCausalLM.from_pretrained(
|
55 |
"universitytehran/PersianMind-v1.0",
|
56 |
torch_dtype=torch.bfloat16,
|
57 |
low_cpu_mem_usage=True,
|
58 |
device_map={"": device},
|
59 |
)
|
60 |
+
tokenizer = AutoTokenizer.from_pretrained(
|
61 |
"universitytehran/PersianMind-v1.0",
|
62 |
)
|
63 |
|
|
|
84 |
In order to quantize the model in 8-bit (`INT8`), use the code below.
|
85 |
|
86 |
```python
|
87 |
+
model = AutoModelForCausalLM.from_pretrained(
|
88 |
"universitytehran/PersianMind-v1.0",
|
89 |
device_map="auto",
|
90 |
low_cpu_mem_usage=True,
|
|
|
102 |
bnb_4bit_use_double_quant=True,
|
103 |
bnb_4bit_quant_type="nf4",
|
104 |
)
|
105 |
+
model = AutoModelForCausalLM.from_pretrained(
|
106 |
"universitytehran/PersianMind-v1.0",
|
107 |
quantization_config=quantization_config,
|
108 |
device_map="auto"
|