zachlandes
/

Mistral-Large-Instruct-2411-MLX

Model card Files Files and versions

Mistral-Large-Instruct-2411-MLX / README.md

zachlandes's picture

Update model card metadata

b054fe2 verified 9 months ago

|

history blame contribute delete

2.79 kB

	---
	license: apache-2.0
	language:
	- en
	- fr
	- es
	- de
	- it
	- pt
	- zh
	- ja
	- ru
	- ko
	base_model:
	- mistralai/Mistral-Large-Instruct-2411
	tags:
	- conversational
	- mlx
	---
	# Model Card for Mistral-Large-Instruct-2411-MLX

	This repository serves as the parent directory for the MLX quantized versions of the Mistral Large Instruct 2411 model. The quantized versions were created for MLX (Apple Silicon) using the `mlx-lm` library.

	## Quantized Versions

	- [2-bit Quantization (Q2)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q2-MLX)
	- [4-bit Quantization (Q4)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q4-MLX)

	Each version is optimized for specific memory and performance trade-offs. See the individual repositories for details on the quantization methods.

	## Original Model

	The original Mistral-Large-Instruct-2411 model is available [here](https://huggingface.co/mistralai/Mistral-Large-Instruct-2411). Mistral model usage is governed by the [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md).

	## License

	This model family is governed by the [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md). Please review the license terms before use.

	## Table of Contents

	- [Model Details](#model-details)
	- [Model Description](#model-description)
	- [Uses](#uses)
	- [Direct Use](#direct-use)
	- [Out-of-Scope Use](#out-of-scope-use)
	- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
	- [Recommendations](#recommendations)
	- [Technical Specifications](#technical-specifications)
	- [How to Get Started](#how-to-get-started)

	## Model Details

	### Model Description

	The Mistral-Large-Instruct-2411-MLX family includes quantized versions of the Mistral Large Instruct 2411 model, optimized for deployment on MLX (Apple Silicon). The quantization reduces memory usage and inference latency, enabling efficient deployment on resource-constrained systems.

	- Developed by: Mistral AI
	- Model type: Large language model
	- Language(s): English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Russian, Korean
	- Quantization levels: 2-bit (Q2), 4-bit (Q4)

	## Technical Specifications

	- Parent Model: [Mistral-Large-Instruct-2411](https://huggingface.co/mistralai/Mistral-Large-Instruct-2411)
	- Quantization: 2-bit (Q2), 4-bit (Q4)
	- Framework: MLX (`mlx-lm` library)

	## How to Get Started

	Visit the individual quantized repositories for details and usage instructions:

	- [2-bit Quantization (Q2)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q2-MLX)
	- [4-bit Quantization (Q4)](https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q4-MLX)

	## Model Card Contact

	For inquiries, contact [Zach Landes](https://www.linkedin.com/in/zachlandes/).