---
base_model: google/gemma-2-27b
library_name: transformers
license: gemma
pipeline_tag: text-generation
tags:
- mlx
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
  agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
  Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
---
# mlx-community/gemma-2-27b-4-bit
The model [mlx-community/gemma-2-27b-4-bit](https://huggingface.co/mlx-community/gemma-2-27b-4-bit) was converted to MLX format from [google/gemma-2-27b](https://huggingface.co/google/gemma-2-27b) using mlx-lm version **0.19.1**.
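The quantized weights in this repo are the kind of artifact normally produced by mlx-lm's `convert` utility. The snippet below is a minimal sketch of that conversion step, assuming the `mlx_lm.convert` Python API with `hf_path`, `quantize`, and `q_bits` parameters; exact argument names can vary between mlx-lm versions, and the output path is hypothetical.

```python
from mlx_lm import convert

# Convert the original Hugging Face weights to MLX format with 4-bit
# quantization. Argument names are assumptions; check your mlx-lm version.
convert(
    hf_path="google/gemma-2-27b",   # source repo on the Hugging Face Hub
    mlx_path="gemma-2-27b-4-bit",   # hypothetical local output directory
    quantize=True,                  # quantize the weights
    q_bits=4,                       # 4 bits per weight, as in this repo
)
```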
## Use with mlx

```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Download (or load from cache) the quantized model and its tokenizer.
model, tokenizer = load("mlx-community/gemma-2-27b-4-bit")

prompt = "hello"

# If the tokenizer ships a chat template, format the prompt as a single chat turn.
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
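By default `generate` returns a relatively short completion. The call below is a sketch of how to request more tokens, assuming the `max_tokens` keyword accepted by recent mlx-lm versions.

```python
# Reuses `model`, `tokenizer`, and `prompt` from the snippet above.
# `max_tokens` is assumed to be supported by your installed mlx-lm version.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```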