Update README.md

fcc1d33 verified 2 months ago

2.78 kB

	---
	library_name: transformers
	license: apache-2.0
	base_model:
	- Qwen/Qwen2.5-32B-Instruct
	pipeline_tag: text-generation
	---

	Converted version of [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) to 4-bit using bitsandbytes. For more information about the model,
	refer to the model's page.

	## Impact on performance
	Impact of quantization on a set of models.

	Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, assessing performance on 100 French questions with scores aggregated from six evaluations
	(two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.

	Performance Scores (on a scale of 5):
	\| Model \| Score \| # params (Billion) \| size (GB) \|
	\|---------------------------------------------:\|:--------:\|:------------------:\|:---------:\|
	\| gpt-4o \| 4.13 \| N/A \| N/A \|
	\| gpt-4o-mini \| 4.02 \| N/A \| N/A \|
	\| Qwen/Qwen2.5-32B-Instruct \| 3.99 \| 32.8 \| 65.6 \|
	\| cmarkea/Qwen2.5-32B-Instruct-4bit \| 3.98 \| 32.8 \| 16.4 \|
	\| mistralai/Mixtral-8x7B-Instruct-v0.1 \| 3.71 \| 46.7 \| 93.4 \|
	\| cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit \| 3.68 \| 46.7 \| 23.35 \|
	\| meta-llama/Meta-Llama-3.1-70B-Instruct \| 3.68 \| 70.06 \| 140.12 \|
	\| gpt-3.5-turbo \| 3.66 \| 175 \| 350 \|
	\| cmarkea/Meta-Llama-3.1-70B-Instruct-4bit \| 3.64 \| 70.06 \| 35.3 \|
	\| TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ \| 3.56 \| 46.7 \| 46.7 \|
	\| meta-llama/Meta-Llama-3.1-8B-Instruct \| 3.25 \| 8.03 \| 16.06 \|
	\| mistralai/Mistral-7B-Instruct-v0.2 \| 1.98 \| 7.25 \| 14.5 \|
	\| cmarkea/bloomz-7b1-mt-sft-chat \| 1.69 \| 7.07 \| 14.14 \|
	\| cmarkea/bloomz-3b-dpo-chat \| 1.68 \| 3 \| 6 \|
	\| cmarkea/bloomz-3b-sft-chat \| 1.51 \| 3 \| 6 \|
	\| croissantllm/CroissantLLMChat-v0.1 \| 1.19 \| 1.3 \| 2.7 \|
	\| cmarkea/bloomz-560m-sft-chat \| 1.04 \| 0.56 \| 1.12 \|
	\| OpenLLM-France/Claire-Mistral-7B-0.1 \| 0.38 \| 7.25 \| 14.5 \|

	The impact of quantization is negligible.

	## Prompt Pattern
	Here is a reminder of the command pattern to interact with the model:
	```verbatim
	<\|im_start\|>user\n{user_prompt_1}<\|im_end\|>\n<\|im_start\|>assistant\n{model_answer_1}...
	```

	---
	library_name: transformers
	license: apache-2.0
	base_model:
	- Qwen/Qwen2.5-32B-Instruct
	pipeline_tag: text-generation
	---

	Converted version of [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) to 4-bit using bitsandbytes. For more information about the model,
	refer to the model's page.

	## Impact on performance
	Impact of quantization on a set of models.

	Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, assessing performance on 100 French questions with scores aggregated from six evaluations
	(two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.

	Performance Scores (on a scale of 5):
	\| Model \| Score \| # params (Billion) \| size (GB) \|
	\|---------------------------------------------:\|:--------:\|:------------------:\|:---------:\|
	\| gpt-4o \| 4.13 \| N/A \| N/A \|
	\| gpt-4o-mini \| 4.02 \| N/A \| N/A \|
	\| Qwen/Qwen2.5-32B-Instruct \| 3.99 \| 32.8 \| 65.6 \|
	\| cmarkea/Qwen2.5-32B-Instruct-4bit \| 3.98 \| 32.8 \| 16.4 \|
	\| mistralai/Mixtral-8x7B-Instruct-v0.1 \| 3.71 \| 46.7 \| 93.4 \|
	\| cmarkea/Mixtral-8x7B-Instruct-v0.1-4bit \| 3.68 \| 46.7 \| 23.35 \|
	\| meta-llama/Meta-Llama-3.1-70B-Instruct \| 3.68 \| 70.06 \| 140.12 \|
	\| gpt-3.5-turbo \| 3.66 \| 175 \| 350 \|
	\| cmarkea/Meta-Llama-3.1-70B-Instruct-4bit \| 3.64 \| 70.06 \| 35.3 \|
	\| TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ \| 3.56 \| 46.7 \| 46.7 \|
	\| meta-llama/Meta-Llama-3.1-8B-Instruct \| 3.25 \| 8.03 \| 16.06 \|
	\| mistralai/Mistral-7B-Instruct-v0.2 \| 1.98 \| 7.25 \| 14.5 \|
	\| cmarkea/bloomz-7b1-mt-sft-chat \| 1.69 \| 7.07 \| 14.14 \|
	\| cmarkea/bloomz-3b-dpo-chat \| 1.68 \| 3 \| 6 \|
	\| cmarkea/bloomz-3b-sft-chat \| 1.51 \| 3 \| 6 \|
	\| croissantllm/CroissantLLMChat-v0.1 \| 1.19 \| 1.3 \| 2.7 \|
	\| cmarkea/bloomz-560m-sft-chat \| 1.04 \| 0.56 \| 1.12 \|
	\| OpenLLM-France/Claire-Mistral-7B-0.1 \| 0.38 \| 7.25 \| 14.5 \|

	The impact of quantization is negligible.

	## Prompt Pattern
	Here is a reminder of the command pattern to interact with the model:
	```verbatim
	<\|im_start\|>user\n{user_prompt_1}<\|im_end\|>\n<\|im_start\|>assistant\n{model_answer_1}...
	```