mjdousti committed
Commit beb0d76 · 1 Parent(s): ae8cd0d

Update README.md

Files changed (1): README.md (+23 −18)

README.md CHANGED
@@ -26,9 +26,12 @@ co2_eq_emissions:
 <img src="PersianMind.jpg" alt="PersianMind logo" width=200/>


- # PersianMind

- PersianMind is a a cross-lingual Persian-English large language model.

 ### Model Description

@@ -39,7 +42,8 @@ PersianMind is a a cross-lingual Persian-English large language model.

 ## How to Get Started with the Model

- Use the code below to get started with the model. Note that you need to install `sentencepiece` and `accelerate` libraries to run this code.

 ```python
 from transformers import LlamaTokenizer, LlamaForCausalLM
@@ -73,11 +77,11 @@ model_output = model_output.replace(model_input, "")
 print(model_output)
 ```

- ## How to Get Started with the Quantized Model

 Quantized models can be run on resource-constrained devices.
- To use quantized models, you should install the `bitsandbytes` library.
- To get started with 8-bit quantized model, use the code below to define the model.

 ```python
 model = LlamaForCausalLM.from_pretrained(
@@ -88,7 +92,7 @@ model = LlamaForCausalLM.from_pretrained(
 )
 ```

- To get started with 4-bit quantized model, use the code below to define the model.

 ```python
 from transformers import BitsAndBytesConfig
@@ -105,24 +109,25 @@ model = LlamaForCausalLM.from_pretrained(
 )
 ```

- ## Evaluating Quantized Models

- | Model            | Belebele (Persian) | Translation Fa2En | Translation En2Fa | Model Size | Words/sec |
- | :--------------- | :----------------: | :---------------: | :---------------: | :--------: | :-------: |
- | PersianMind      | 73.9               | 83.61             | 79.44             | 13.66G     | 25.35     |
- | PersianMind-8bit | 73.7               | 82.32             | 78.61             | 7.2G       | 11.36     |
- | PersianMind-4bit | 70.2               | 82.07             | 80.36             | 3.9G       | 24.36     |

 We evaluated quantized models in various tasks against the original model.
 Specifically, we evaluated all models using the reading comprehension multiple-choice
- question-answering benchmark of Belebele (Persian subset) and reported the accuracy of each model.
 Additionally, we evaluated our models for Persian-to-English and English-to-Persian translation tasks.
- For this, we utilized the Persian-English subset of the Flores-200 dataset and reported our results using the Comet metric.
- Furthermore, we calculated the average number of words generated by each model per second during running the translation tasks.
- To understand resource efficiency, we measured the memory usage of each model by employing the `get_memory_footprint` function.

 ## License
- PersianMind is subject to Meta's [LLaMa2 Community License](https://raw.githubusercontent.com/facebookresearch/llama/main/LICENSE).
 It is further licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/), which allows non-commercial use of the model.
 Commercial use of this model requires written agreement which must be obtained from the copyright holders who are listed as developers in this page.
 If you suspect any violations, please reach out to us.
 
 <img src="PersianMind.jpg" alt="PersianMind logo" width=200/>


+ # <span style="font-variant:small-caps;">PersianMind</span>

+ <span style="font-variant:small-caps;">PersianMind</span> is a cross-lingual Persian-English large language model.
+ The model achieves state-of-the-art results on the Persian subset of the [Belebele](https://github.com/facebookresearch/belebele) benchmark
+ and the [ParsiNLU multiple-choice QA](https://github.com/persiannlp/parsinlu) task.
+ It also attains performance comparable to GPT-3.5-turbo on a Persian reading comprehension task.

 ### Model Description

 

 ## How to Get Started with the Model

+ Use the code below to get started with the model.
+ Note that you need to install the <code><b>sentencepiece</b></code> and <code><b>accelerate</b></code> libraries, along with <code><b>PyTorch</b></code> and <code><b>🤗Transformers</b></code>, to run this code.

 ```python
 from transformers import LlamaTokenizer, LlamaForCausalLM
 
 print(model_output)
 ```
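Note that `model_output.replace(model_input, "")` in the snippet above deletes every occurrence of the prompt string from the decoded text; a prefix slice is a slightly safer sketch (the function and example strings here are illustrative, not part of the model card):

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Remove the leading prompt from decoded generation output.

    Unlike str.replace, slicing only removes the prefix copy of the
    prompt, so a prompt repeated inside the answer is left intact.
    """
    return decoded[len(prompt):] if decoded.startswith(prompt) else decoded


# Toy values, not real model output.
prompt = "You: What is the capital of Iran?\nPersianMind: "
decoded = prompt + "Tehran is the capital of Iran."
print(strip_prompt(decoded, prompt))
```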

+ ### How to Quantize the Model

 Quantized models can be run on resource-constrained devices.
+ To quantize the model, you should install the <code><b>bitsandbytes</b></code> library.
+ To load the model with 8-bit (`INT8`) quantization, use the code below.

 ```python
 model = LlamaForCausalLM.from_pretrained(
 
 )
 ```
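For intuition about what 8-bit quantization does to the weights, here is a toy absmax-style round trip in plain Python (illustrative only; the actual `bitsandbytes` LLM.int8() scheme works per-dimension and handles outlier features separately):

```python
def quantize_absmax(weights):
    """Quantize floats into the int8 range [-127, 127] by scaling with the absolute maximum."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


weights = [0.1, -0.54, 1.27, 0.003]
quantized, scale = quantize_absmax(weights)
recovered = dequantize(quantized, scale)
# Each recovered weight lies within half a quantization step of the original.
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, recovered))
```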

+ Alternatively, you can quantize the model in 4-bit (`INT4`) with the following code.

 ```python
 from transformers import BitsAndBytesConfig
 
 )
 ```
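To see why 4-bit storage roughly quarters the weight footprint, here is a toy sketch (unrelated to bitsandbytes' actual NF4 kernels) of packing two 4-bit values into each byte:

```python
def pack_nibbles(values):
    """Pack pairs of 4-bit integers (0..15) into single bytes."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    return bytes((hi << 4) | lo for hi, lo in zip(values[0::2], values[1::2]))


def unpack_nibbles(packed):
    """Invert pack_nibbles, yielding the original 4-bit integers."""
    return [part for byte in packed for part in (byte >> 4, byte & 0x0F)]


values = [3, 15, 0, 9]
packed = pack_nibbles(values)
assert len(packed) == len(values) // 2   # half the bytes
assert unpack_nibbles(packed) == values  # lossless round trip
```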

+ ### Evaluating Quantized Models

+ | Model                                                              | Belebele (Persian) | Fa→En Translation | En→Fa Translation | Model Size | Tokens/sec |
+ | :----------------------------------------------------------------- | :----------------: | :---------------: | :---------------: | :--------: | :--------: |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`bf16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |

 We evaluated quantized models in various tasks against the original model.
 Specifically, we evaluated all models using the reading comprehension multiple-choice
+ question-answering benchmark of [Belebele](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
 Additionally, we evaluated our models for Persian-to-English and English-to-Persian translation tasks.
+ For this, we utilized the Persian-English subset of the [Flores-200](https://github.com/facebookresearch/flores/tree/main/flores200) dataset and
+ reported our results using the <span style="font-variant:small-caps;">Comet</span> metric.
+ Furthermore, we calculated the average number of tokens generated per second by each model while running the translation tasks.
+ To understand resource efficiency, we measured the memory usage of each model using the `get_memory_footprint()` function.

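The size and throughput columns above can also be read as relative numbers; this quick computation (figures copied from the table) shows the trade-off each quantization level buys:

```python
# Figures reported in the evaluation table above.
reported = {
    "bf16": {"size_gb": 13.7, "tokens_per_sec": 25.35},
    "INT8": {"size_gb": 7.2, "tokens_per_sec": 11.36},
    "INT4": {"size_gb": 3.9, "tokens_per_sec": 24.36},
}

baseline = reported["bf16"]
for name, stats in reported.items():
    compression = baseline["size_gb"] / stats["size_gb"]
    relative_speed = stats["tokens_per_sec"] / baseline["tokens_per_sec"]
    print(f"{name}: {compression:.2f}x smaller, {relative_speed:.2f}x throughput")
```

Notably, `INT4` is about 3.5x smaller while staying close to `bf16` throughput, whereas `INT8` halves the size but also roughly halves the generation speed.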
 ## License
+ <span style="font-variant:small-caps;">PersianMind</span> is subject to Meta's [LLaMa2 Community License](https://raw.githubusercontent.com/facebookresearch/llama/main/LICENSE).
 It is further licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/), which allows non-commercial use of the model.
 Commercial use of this model requires a written agreement, which must be obtained from the copyright holders listed as developers on this page.
 If you suspect any violations, please reach out to us.