|
---
tags:
- gptq
- 4bit
- int4
- gptqmodel
- modelcloud
---
|
This model has been quantized using [GPTQModel](https://github.com/ModelCloud/GPTQModel). |
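
To run the example below you need GPTQModel installed; it is typically available from PyPI via `pip install gptqmodel` (see the repository for CUDA and build requirements).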
|
|
|
- **bits**: 4 |
|
- **group_size**: 128 |
|
- **desc_act**: false |
|
- **static_groups**: false |
|
- **sym**: true |
|
- **lm_head**: false |
|
- **damp_percent**: 0.0025 |
|
- **damp_auto_increment**: 0.0015 |
|
- **true_sequential**: true |
|
- **model_name_or_path**: "" |
|
- **model_file_base_name**: "model" |
|
- **quant_method**: "gptq" |
|
- **checkpoint_format**: "gptq" |
|
- **meta**:
  - **quantizer**: "gptqmodel:1.0.3-dev0"
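
For reference, these settings map onto GPTQModel's quantization API roughly as sketched below. This is a minimal sketch, not the exact script used for this checkpoint: the source model id and calibration texts are illustrative placeholders, and field and method names may differ slightly across GPTQModel versions (this checkpoint records `gptqmodel:1.0.3-dev0`).

```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, QuantizeConfig

base_model = "microsoft/GRIN-MoE"  # assumed source model, for illustration

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# A real run would use a few hundred representative calibration samples.
calibration = [tokenizer(text) for text in [
    "GPTQ calibrates quantization against sample activations.",
    "Shanghai's natural history museum is a popular destination.",
]]

# Mirrors the configuration listed above.
quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
    sym=True,
    damp_percent=0.0025,
)

model = GPTQModel.from_pretrained(base_model, quant_config, trust_remote_code=True)
model.quantize(calibration)  # runs GPTQ layer by layer against the calibration set
model.save_quantized("GRIN-MoE-gptq-4bit")
```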
|
|
|
## Example
|
```python
from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_name = "ModelCloud/GRIN-MoE-gptq-4bit"

prompt = [
    {"role": "system",
     "content": "You are GRIN-MoE model from microsoft, a helpful assistant."},
    {"role": "user", "content": "I am in Shanghai, preparing to visit the natural history museum. Can you tell me the best way to"}
]

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Load the quantized checkpoint; GPTQModel dispatches to its int4 kernels at inference time.
model = GPTQModel.from_quantized(model_name, trust_remote_code=True)

# Apply the model's chat template and generate a completion.
input_tensor = tokenizer.apply_chat_template(prompt, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids=input_tensor.to(model.device), max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
```
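
Depending on the installed versions, the same checkpoint may also load through plain `transformers` (with `optimum` and a GPTQ backend available), since the quantization config is stored in the checkpoint. A hedged sketch, untested for this particular model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ModelCloud/GRIN-MoE-gptq-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# transformers reads quantization_config from the checkpoint and dispatches to
# the installed GPTQ backend; device_map="auto" places the weights on GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    trust_remote_code=True,
)
```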
|
|
|
## lm-eval results
|
|
|
| Tasks | Metric | | GRIN-MoE | GRIN-MoE-gptq-4bit |
| ------------------------------------- | ---------- | --- | -------- | ------------------ |
| arc_challenge | acc | ↑ | 0.6408 | 0.6425 |
| | acc_norm | ↑ | 0.6561 | 0.6587 |
| arc_easy | acc | ↑ | 0.8645 | 0.8683 |
| | acc_norm | ↑ | 0.8422 | 0.8460 |
| boolq | acc | ↑ | 0.8820 | 0.8765 |
| hellaswag | acc | ↑ | 0.6972 | 0.6891 |
| | acc_norm | ↑ | 0.8518 | 0.8486 |
| lambada_openai | acc | ↑ | 0.7058 | 0.7068 |
| | perplexity | ↓ | 3.4568 | 3.5732 |
| mmlu | acc | ↑ | 0.7751 | 0.7706 |
| - humanities | acc | ↑ | 0.7394 | 0.7384 |
| - formal_logic | acc | ↑ | 0.6429 | 0.6746 |
| - high_school_european_history | acc | ↑ | 0.8606 | 0.8364 |
| - high_school_us_history | acc | ↑ | 0.9118 | 0.9020 |
| - high_school_world_history | acc | ↑ | 0.8903 | 0.8734 |
| - international_law | acc | ↑ | 0.9256 | 0.9091 |
| - jurisprudence | acc | ↑ | 0.8426 | 0.8519 |
| - logical_fallacies | acc | ↑ | 0.8344 | 0.8528 |
| - moral_disputes | acc | ↑ | 0.7977 | 0.8208 |
| - moral_scenarios | acc | ↑ | 0.6961 | 0.6849 |
| - philosophy | acc | ↑ | 0.8199 | 0.8071 |
| - prehistory | acc | ↑ | 0.8457 | 0.8426 |
| - professional_law | acc | ↑ | 0.6173 | 0.6193 |
| - world_religions | acc | ↑ | 0.8480 | 0.8655 |
| - other | acc | ↑ | 0.8130 | 0.8050 |
| - business_ethics | acc | ↑ | 0.8100 | 0.7800 |
| - clinical_knowledge | acc | ↑ | 0.8415 | 0.8302 |
| - college_medicine | acc | ↑ | 0.7514 | 0.7457 |
| - global_facts | acc | ↑ | 0.5700 | 0.5400 |
| - human_aging | acc | ↑ | 0.7803 | 0.7668 |
| - management | acc | ↑ | 0.8447 | 0.8447 |
| - marketing | acc | ↑ | 0.9145 | 0.9103 |
| - medical_genetics | acc | ↑ | 0.9200 | 0.8900 |
| - miscellaneous | acc | ↑ | 0.8966 | 0.8927 |
| - nutrition | acc | ↑ | 0.8333 | 0.8268 |
| - professional_accounting | acc | ↑ | 0.6489 | 0.6560 |
| - professional_medicine | acc | ↑ | 0.8750 | 0.8603 |
| - virology | acc | ↑ | 0.5422 | 0.5361 |
| - social sciences | acc | ↑ | 0.8638 | 0.8544 |
| - econometrics | acc | ↑ | 0.5789 | 0.5789 |
| - high_school_geography | acc | ↑ | 0.9091 | 0.8788 |
| - high_school_government_and_politics | acc | ↑ | 0.9585 | 0.9430 |
| - high_school_macroeconomics | acc | ↑ | 0.8308 | 0.8103 |
| - high_school_microeconomics | acc | ↑ | 0.9328 | 0.9286 |
| - high_school_psychology | acc | ↑ | 0.9321 | 0.9303 |
| - human_sexuality | acc | ↑ | 0.8779 | 0.8626 |
| - professional_psychology | acc | ↑ | 0.8382 | 0.8219 |
| - public_relations | acc | ↑ | 0.7545 | 0.7727 |
| - security_studies | acc | ↑ | 0.7878 | 0.7918 |
| - sociology | acc | ↑ | 0.8905 | 0.8955 |
| - us_foreign_policy | acc | ↑ | 0.9000 | 0.8800 |
| - stem | acc | ↑ | 0.7044 | 0.7031 |
| - abstract_algebra | acc | ↑ | 0.5000 | 0.4500 |
| - anatomy | acc | ↑ | 0.7407 | 0.7481 |
| - astronomy | acc | ↑ | 0.8618 | 0.8618 |
| - college_biology | acc | ↑ | 0.8889 | 0.8750 |
| - college_chemistry | acc | ↑ | 0.6100 | 0.5900 |
| - college_computer_science | acc | ↑ | 0.7100 | 0.6700 |
| - college_mathematics | acc | ↑ | 0.5100 | 0.5800 |
| - college_physics | acc | ↑ | 0.4608 | 0.4608 |
| - computer_security | acc | ↑ | 0.8200 | 0.8200 |
| - conceptual_physics | acc | ↑ | 0.7787 | 0.7660 |
| - electrical_engineering | acc | ↑ | 0.6828 | 0.6828 |
| - elementary_mathematics | acc | ↑ | 0.7566 | 0.7593 |
| - high_school_biology | acc | ↑ | 0.9000 | 0.9097 |
| - high_school_chemistry | acc | ↑ | 0.6650 | 0.6650 |
| - high_school_computer_science | acc | ↑ | 0.8700 | 0.8600 |
| - high_school_mathematics | acc | ↑ | 0.4370 | 0.4296 |
| - high_school_physics | acc | ↑ | 0.5960 | 0.5894 |
| - high_school_statistics | acc | ↑ | 0.7176 | 0.7222 |
| - machine_learning | acc | ↑ | 0.6071 | 0.6339 |
| openbookqa | acc | ↑ | 0.3920 | 0.3860 |
| | acc_norm | ↑ | 0.4900 | 0.4860 |
| piqa | acc | ↑ | 0.8183 | 0.8166 |
| | acc_norm | ↑ | 0.8205 | 0.8177 |
| rte | acc | ↑ | 0.8014 | 0.7834 |
| truthfulqa_mc1 | acc | ↑ | 0.3880 | 0.3990 |
| winogrande | acc | ↑ | 0.7940 | 0.7680 |
|

| Groups | Metric | | GRIN-MoE | GRIN-MoE-gptq-4bit |
| ----------------- | ------ | --- | ------ | ------------------ |
| mmlu | acc | ↑ | 0.7751 | 0.7706 |
| - humanities | acc | ↑ | 0.7394 | 0.7384 |
| - other | acc | ↑ | 0.8130 | 0.8050 |
| - social sciences | acc | ↑ | 0.8638 | 0.8544 |
| - stem | acc | ↑ | 0.7044 | 0.7031 |
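
The numbers above were produced with lm-evaluation-harness. As a rough reproduction sketch using the harness's Python API (assuming lm-eval 0.4.x; whether `HFLM` can load this GPTQ checkpoint directly depends on the installed GPTQ backend):

```python
import lm_eval
from lm_eval.models.huggingface import HFLM

# Wrap the quantized checkpoint in lm-eval's Hugging Face adapter.
lm = HFLM(pretrained="ModelCloud/GRIN-MoE-gptq-4bit", trust_remote_code=True)

# Evaluate a subset of the tasks reported above.
results = lm_eval.simple_evaluate(model=lm, tasks=["arc_challenge", "boolq", "mmlu"])

for task, metrics in results["results"].items():
    print(task, metrics)
```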
|
|