QuantFactory/Mistral-Nemo-Instruct-2407-abliterated-GGUF


aashish1904 committed on Sep 9, 2024

Commit 77f37b9 (verified) · 1 Parent(s): f2a4518

Upload README.md with huggingface_hub

Files changed (1)

README.md +76 -0

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+language:
+- en
+- fr
+- de
+- es
+- it
+- pt
+- ru
+- zh
+- ja
+license: apache-2.0
+---
+![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)
+# QuantFactory/Mistral-Nemo-Instruct-2407-abliterated-GGUF
+This is a quantized version of [natong19/Mistral-Nemo-Instruct-2407-abliterated](https://huggingface.co/natong19/Mistral-Nemo-Instruct-2407-abliterated) created using llama.cpp.
+# Original Model Card
+# Mistral-Nemo-Instruct-2407-abliterated
+## Introduction
+Abliterated version of [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), a Large Language Model (LLM) trained jointly by Mistral AI and NVIDIA that significantly outperforms existing models of smaller or similar size.
+The model's strongest refusal directions have been ablated via weight orthogonalization, but the model may still refuse your request, misunderstand your intent, or provide unsolicited advice regarding ethics or safety.
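The weight orthogonalization mentioned above can be illustrated on a toy matrix. This is a minimal sketch, not the procedure actually used to produce this model: the direction vector, the weight values, and the helper names `normalize` and `orthogonalize` are all hypothetical, and plain Python lists stand in for the tensors a real implementation would use. It assumes the common formulation in which each matrix that writes into the residual stream is replaced by W' = W - r (rᵀ W), where r is the unit-length refusal direction.

```python
# Illustrative sketch only: weight orthogonalization on a toy matrix.
# The direction and weights are made-up values, not taken from the model.
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def orthogonalize(weight, direction):
    """Remove the component along `direction` from every column of `weight`.

    `weight` is a d_model x d_in matrix (list of rows) whose output lives in
    the residual stream; `direction` is a length-d_model vector.
    Computes W' = W - r (r^T W), with r the unit-normalized direction.
    """
    r = normalize(direction)
    d_model, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each column writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(d_model)) for j in range(d_in)]
    return [
        [weight[i][j] - r[i] * proj[j] for j in range(d_in)]
        for i in range(d_model)
    ]


refusal_direction = [1.0, 2.0, 2.0]            # hypothetical direction
W = [[0.5, -1.0], [2.0, 0.0], [1.0, 3.0]]      # hypothetical 3x2 weight matrix
W_ablated = orthogonalize(W, refusal_direction)

# After ablation, no column keeps a component along the direction
# (up to floating-point error).
r = normalize(refusal_direction)
residual = [sum(r[i] * W_ablated[i][j] for i in range(3)) for j in range(2)]
print(residual)
```

Because the edit is applied to the weights themselves, the ablated model needs no extra hooks at inference time. Only the chosen direction is removed, which is consistent with the note above that the model may still refuse some requests.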
+## Key features
+- Trained with a **128k context window**
+- Trained on a large proportion of **multilingual and code data**
+- Drop-in replacement for Mistral 7B
+## Quickstart
+```python
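+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "natong19/Mistral-Nemo-Instruct-2407-abliterated"
+
+# Load the tokenizer and model; device_map="auto" places the weights on the available device(s).
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
+
+# Render the conversation into a prompt string using the model's chat template.
+conversation = [{"role": "user", "content": "Where's the capital of France?"}]
+prompt = tokenizer.apply_chat_template(
+    conversation,
+    tokenize=False,
+    add_generation_prompt=True,
+)
+
+# The templated prompt already contains the BOS token, so do not add special tokens again.
+# Move the inputs to the model's device rather than a hard-coded "cuda".
+inputs = tokenizer(
+    prompt, return_tensors="pt", return_token_type_ids=False, add_special_tokens=False
+).to(model.device)
+
+# Generate, then decode only the newly generated tokens (everything after the prompt).
+outputs = model.generate(**inputs, max_new_tokens=128)
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))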
+```
+## Evaluation
+Evaluation framework: lm-evaluation-harness 0.4.2
+| Benchmark | Mistral-Nemo-Instruct-2407 | Mistral-Nemo-Instruct-2407-abliterated |
+| :--- | :---: | :---: |
+| ARC (25-shot) | 65.9 | 65.8 |
+| GSM8K (5-shot) | 76.2 | 75.2 |
+| HellaSwag (10-shot) | 84.3 | 84.3 |
+| MMLU (5-shot) | 68.4 | 68.8 |
+| TruthfulQA (0-shot) | 54.9 | 55.0 |
+| Winogrande (5-shot) | 82.2 | 82.6 |
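
To make the comparison in the table easier to read, the per-benchmark differences can be computed directly. This is a small sketch that uses only the scores listed above; the variable names are illustrative, and it makes no claim about statistical significance, since the table reports no error bars.

```python
# Scores copied from the evaluation table: (base model, abliterated model).
scores = {
    "ARC (25-shot)": (65.9, 65.8),
    "GSM8K (5-shot)": (76.2, 75.2),
    "HellaSwag (10-shot)": (84.3, 84.3),
    "MMLU (5-shot)": (68.4, 68.8),
    "TruthfulQA (0-shot)": (54.9, 55.0),
    "Winogrande (5-shot)": (82.2, 82.6),
}

# Difference (abliterated minus base), rounded to the table's precision.
deltas = {name: round(abl - base, 1) for name, (base, abl) in scores.items()}
for name, delta in deltas.items():
    print(f"{name:<22}{delta:+.1f}")

# Unweighted mean over the six benchmarks.
mean_base = sum(base for base, _ in scores.values()) / len(scores)
mean_abl = sum(abl for _, abl in scores.values()) / len(scores)
print(f"mean: {mean_base:.2f} -> {mean_abl:.2f}")
```

On these numbers the largest change is a one-point drop on GSM8K, and every other benchmark moves by 0.4 points or less.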