Mihaiii/stablelm-zephyr-3b-OV_FP14-4BIT

	---
	library_name: transformers
	license: other
	---
	The quantized version of [stablelm-zephyr-3b](https://huggingface.co/stabilityai/stablelm-zephyr-3b), produced by following the steps in [this OpenVINO notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/273-stable-zephyr-3b-chatbot/273-stable-zephyr-3b-chatbot.ipynb).

	You can use it like this (steps taken from the above link):

	```bash
	pip install -q git+https://github.com/huggingface/optimum-intel.git@e22a2ac26b3a6c7854da956d538f784ebeca879b onnx openvino-nightly
	```
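Before going further, it may help to confirm that the packages can be imported. The following is a minimal sketch and is not part of the linked notebook. The helper name `missing_packages` is hypothetical, and the module names are assumed to be the top-level import names of the packages installed above (plus `transformers`, which `optimum-intel` depends on).

```python
from importlib.util import find_spec


def missing_packages(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed top-level import names for the packages installed above.
required = ["optimum", "openvino", "onnx", "transformers"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```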

	Then load the model and generate a response:

	```python
	from optimum.intel.openvino import OVModelForCausalLM
	from transformers import AutoConfig, AutoTokenizer
	from optimum.utils import NormalizedTextConfig, NormalizedConfigManager

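	# Register both spellings of the StableLM model type in optimum's (private) config registry,
	# mapping the normalized names to the config fields that hold the layer and head counts.
	NormalizedConfigManager._conf['stablelm_epoch'] = NormalizedTextConfig.with_args(num_layers='num_hidden_layers', num_attention_heads='num_attention_heads')
	NormalizedConfigManager._conf['stablelm-epoch'] = NormalizedTextConfig.with_args(num_layers='num_hidden_layers', num_attention_heads='num_attention_heads')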

	model_path = 'Mihaiii/stablelm-zephyr-3b-OV_FP14-4BIT'
	model = OVModelForCausalLM.from_pretrained(model_path, compile=False, config=AutoConfig.from_pretrained(model_path, trust_remote_code=True), stateful=True)
	tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

	# Build the prompt with the model's chat template.
	prompt = [{'role': 'user', 'content': 'List 3 synonyms for the word "tiny"'}]
	inputs = tokenizer.apply_chat_template(
	    prompt,
	    add_generation_prompt=True,
	    return_tensors='pt'
	)

	# Sample up to 1024 new tokens.
	tokens = model.generate(
	    inputs.to(model.device),
	    max_new_tokens=1024,
	    temperature=0.8,
	    do_sample=True
	)

	# Decode the full sequence (prompt and reply), keeping special tokens visible.
	print(tokenizer.decode(tokens[0], skip_special_tokens=False))
	```
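For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the `print` above echoes the prompt along with the reply. Below is a minimal sketch of keeping only the reply. `new_tokens_only` is a hypothetical helper, not part of `transformers`, and it is shown on plain lists of made-up token IDs so that it runs without loading the model.

```python
def new_tokens_only(sequence, prompt_length):
    """Drop the first prompt_length items, leaving only the generated tokens."""
    if prompt_length < 0 or prompt_length > len(sequence):
        raise ValueError("prompt_length must be between 0 and len(sequence)")
    return sequence[prompt_length:]


# Stand-in token IDs (made up for illustration): 3 prompt tokens, 2 generated.
prompt_ids = [101, 102, 103]
generated = prompt_ids + [7, 8]
reply_ids = new_tokens_only(generated, len(prompt_ids))
print(reply_ids)
```

With the objects from the example above, and assuming `inputs` is a tensor of shape `(1, prompt_length)`, the equivalent call would be along the lines of `tokenizer.decode(new_tokens_only(tokens[0], inputs.shape[-1]), skip_special_tokens=True)`.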