Update README.md
README.md CHANGED
@@ -19,23 +19,23 @@ base_model: mistralai/Mixtral-8x7B-v0.1
 </div>


-# …
+# Mixtral-8x7B-Instruct-v0.1

-A pretrained generative language model with …
+A pretrained generative language model with ~47 billion parameters geared towards instruction-following capabilities.

 ## Model Details

-This model was built via parameter-efficient finetuning of the [mistralai/…
+This model was built via parameter-efficient finetuning of the [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) base model on the first 20k rows in each of the [jondurbin/airoboros-2.2.1](https://huggingface.co/datasets/jondurbin/airoboros-2.2.1), [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca), and [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) datasets.

 - **Developed by:** Daniel Furman
 - **Model type:** Causal language model (clm)
 - **Language(s) (NLP):** English
 - **License:** Apache 2.0
-- **Finetuned from model:** [mistralai/…
+- **Finetuned from model:** [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)

 ## Model Sources

-- **Repository:** [here](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/…
+- **Repository:** [here](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/mixtral/sft_Mixtral_8x7B_Instruct_v0_1_peft.py)

 ## Evaluation Results

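The new Model Details text names three datasets and a 20k-row cap on each. As a hedged illustration only (the author's real preprocessing lives in the sft-demos script linked under Model Sources), here is one way to pull those slices with the Hugging Face `datasets` library:

```python
# Illustrative sketch, not the author's script: slice the first 20k rows of
# each finetuning dataset named in the model card.
from datasets import load_dataset

dataset_names = [
    "jondurbin/airoboros-2.2.1",
    "Open-Orca/SlimOrca",
    "garage-bAInd/Open-Platypus",
]

# "train[:20000]" takes the first 20k rows of each dataset's train split.
subsets = {name: load_dataset(name, split="train[:20000]") for name in dataset_names}

for name, ds in subsets.items():
    print(f"{name}: {len(ds)} rows, columns={ds.column_names}")
```

The three sources use different schemas (SlimOrca, for example, stores multi-turn `conversations`), so in practice each slice would be mapped to a common prompt/response format before mixing.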
@@ -69,7 +69,7 @@ from transformers import (
 ```

 ```python
-peft_model_id = "dfurman/…
+peft_model_id = "dfurman/Mixtral-8x7B-Instruct-v0.1"
 config = PeftConfig.from_pretrained(peft_model_id)

 tokenizer = AutoTokenizer.from_pretrained(
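This hunk only pins the adapter id. For context, the pattern the README's snippet is building toward is: load the frozen Mixtral base, then attach the finetuned LoRA adapter with `peft`. A minimal end-to-end sketch follows; the 4-bit quantization settings are my assumption for fitting the ~47B base on limited hardware, not necessarily the card's exact configuration:

```python
# Minimal loading sketch (assumed quantization settings, not the card's exact config).
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "dfurman/Mixtral-8x7B-Instruct-v0.1"
config = PeftConfig.from_pretrained(peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

# Load the frozen base model in 4-bit; config.base_model_name_or_path
# resolves to mistralai/Mixtral-8x7B-v0.1.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the finetuned adapter weights on top of the frozen base.
model = PeftModel.from_pretrained(base_model, peft_model_id)
```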
@@ -141,18 +141,18 @@ print(response)
 **Generation**:

 ```python
-"""1.…
-2 oz…
-1 oz…
-0.5 oz…
-0.…
-0.…
-
-
-
-
-
-
+"""1.5 oz White Rum
+2 oz Dark Rum
+1 oz Orange Curacao
+0.5 oz Orgeat Syrup
+0.5 oz Simple Syrup
+0.75 oz Lime Juice
+
+In a shaker filled with ice, combine the white rum, dark rum, orange curacao, orgeat syrup, simple syrup, and lime juice. Shake vigorously for 10-15 seconds.
+
+Strain the mixture into a double old-fashioned glass filled with fresh ice. Garnish with a lime wedge and a sprig of mint.
+
+Enjoy your delicious mai tai!"""
 ```

 </details>
@@ -166,7 +166,7 @@ Ice cubes to fill the shaker

 ## Training

-It took ~…
+It took ~24 hours to train 2 epochs on 4x A6000s.

 ### Prompt Format

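The training note is terse: ~24 hours, 2 epochs, 4x A6000s. What makes that budget plausible for a ~47B mixture-of-experts model is the parameter-efficient setup named in Model Details, where only small adapter matrices are trained. A minimal LoRA sketch follows; the rank, alpha, dropout, and target modules are assumptions for illustration, not values from the author's linked training script:

```python
# Hypothetical LoRA setup consistent with the card's PEFT description;
# all hyperparameters here are assumed, not the author's.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                 # adapter rank (assumed)
    lora_alpha=32,        # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)

# Wrap the frozen base so only the adapter matrices receive gradients.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```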