Update README.md
Quick refinement for Zephyr/Mistral. Maybe we need to spell it out better.
README.md
CHANGED
@@ -43,7 +43,7 @@ This model has been compiled to run on an inf2.xlarge (the smallest Inferentia2
 
 ## Set up the environment
 
-First, use the [DLAMI image from Hugging Face](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2). It has most of the utilities and drivers preinstalled. However, you will need to update transformers-neruonx from the source to get Mistral support.
+First, use the [DLAMI image from Hugging Face](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2). It has most of the utilities and drivers preinstalled. However, you will need to update transformers-neuronx from source to get Mistral/Zephyr support.
 
 
 ```
@@ -52,9 +52,9 @@ python -m pip install git+https://github.com/aws-neuron/transformers-neuronx.git
 
 ## Running inference from this repository
 
-If you want to run a quick test or if the exact model you want to use is [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), you can run it directly using the steps below. Otherwise, jump to the Compilation of other Mistral versions section.
+If you want to run a quick test, or if the exact model you want to use is [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), you can run it directly using the steps below. Otherwise, jump to the "Compilation of other Mistral/Zephyr versions" section.
 
-First, you will need a local copy of the library. This is because one of the nice things that the Hugging Face optimum library does is abstract local loads from repository loads. However, Mistral inference isn't supported yet.
+First, you will need a local copy of the library: the Hugging Face optimum library conveniently abstracts local loads from repository loads, but it doesn't support Mistral/Zephyr inference yet.
 
 
 ```
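For context, the quick test the new text points to could look roughly like the sketch below. This is a minimal sketch, not this repository's exact steps (which rely on a patched local copy of optimum): the `MistralForSampling` class, `save_pretrained_split` helper, and the `tp_degree`/`amp`/`sequence_length` parameters follow the aws-neuron/transformers-neuronx samples and should be verified against the version you install from source with the `python -m pip install git+https://github.com/aws-neuron/transformers-neuronx.git` line shown in the second hunk header above.

```python
# Hedged quick-test sketch for HuggingFaceH4/zephyr-7b-beta on Inferentia2.
# Assumes transformers-neuronx installed from source:
#   python -m pip install git+https://github.com/aws-neuron/transformers-neuronx.git
# Class/module names follow the aws-neuron/transformers-neuronx samples and
# may differ in your installed version.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.mistral.model import MistralForSampling

MODEL_ID = "HuggingFaceH4/zephyr-7b-beta"

# 1. Download the checkpoint and re-save it in the split format that
#    transformers-neuronx loads from disk (bf16 to keep host RAM usage down).
cpu_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True
)
save_pretrained_split(cpu_model, "./zephyr-7b-beta-split")

# 2. Load and compile; tp_degree=2 splits the weights across the two
#    NeuronCores of an inf2.xlarge (assumption).
model = MistralForSampling.from_pretrained(
    "./zephyr-7b-beta-split", batch_size=1, tp_degree=2, amp="bf16"
)
model.to_neuron()  # triggers Neuron compilation

# 3. Generate a short completion.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
input_ids = tokenizer("What is Inferentia2?", return_tensors="pt").input_ids
with torch.inference_mode():
    generated = model.sample(input_ids, sequence_length=256)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

The first `to_neuron()` call is the slow part, since that is when the model is actually compiled for the NeuronCores; this repository sidesteps that cost by shipping precompiled artifacts.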