license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- serialization
Model Card for Neural-Zephyr Mistral 14B
Intel and Hugging Face developed two of the most prominent Mistral-type models released: Neural-Chat and Zephyr.
Neural-Zephyr is a hybrid transfer-learning model that joins the weights of the Neural-Chat and Zephyr Mistral-type models. The weights are aggregated in the same layers, for a total of 14B parameters.
Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β is the second model in the series and is a fine-tuned version of mistralai/Mistral-7B-v0.1, trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). The Zephyr authors found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful. However, this means that the model is likely to generate problematic text when prompted to do so. You can find more details in the technical report.
Model description
- Model type: A 14B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- Language(s) (NLP): Primarily English
- License: MIT
- Finetuned from model: mistralai/Mistral-7B-v0.1
Use in Transformers
Load model directly
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, MistralForCausalLM

# Instantiate the model from the Hub repository in bfloat16.
model = MistralForCausalLM.from_pretrained("ai-agi/neural-zephyr", use_cache=False, torch_dtype=torch.bfloat16, device_map="auto")

# Download the separate weights file and load it on CPU first to avoid an extra GPU copy.
model_weights = hf_hub_download(repo_id="ai-agi/neural-zephyr", filename="model_weights.pth")
state_dict = torch.load(model_weights, map_location="cpu")
model.load_state_dict(state_dict)

# Fall back to the EOS token for padding if the tokenizer defines no pad token.
tokenizer = AutoTokenizer.from_pretrained("ai-agi/neural-zephyr", use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
Manage your GPU/CPU memory for model and weights
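As a rough guide to how much memory the weights need, the sketch below multiplies the parameter count by the bytes per parameter for a given dtype. The helper name `estimate_weight_memory_gib` and the dtype table are illustrative assumptions, not part of this repository, and the estimate covers weights only (no activations, KV cache, or framework overhead).

```python
# Bytes per parameter for common weight dtypes (illustrative table).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def estimate_weight_memory_gib(num_params: float, dtype: str = "bfloat16") -> float:
    """Return the approximate memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    # 14B parameters, as stated in the model description.
    for dtype in ("float32", "bfloat16", "int8"):
        print(f"{dtype}: {estimate_weight_memory_gib(14e9, dtype):.1f} GiB")
```

In bfloat16 this comes to roughly 26 GiB for the weights. Loading a separate state dict with `torch.load` holds a second copy of the weights in memory until it is released, so peak usage during loading can be considerably higher than the steady-state figure.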