Update README.md
README.md (changed):
```diff
@@ -141,13 +141,14 @@ llm = Llama.from_pretrained(
     filename="*Q4_K_M.gguf", # suffix of the filename containing the level of quantization.
     n_ctx=32768, # The max sequence length to use - note that longer sequence lengths require much more resources
     n_threads=8, # The number of CPU threads to use, tailor to your system and the resulting performance
-    n_gpu_layers=
+    n_gpu_layers=35 # The number of layers to offload to GPU, if you have GPU acceleration available
 )
 
 # Simple inference example
 output = llm(
     """<s><|im_start|> user
-Hva kan jeg bruke einstape til?<|im_end
+Hva kan jeg bruke einstape til?<|im_end|>
+<|im_start|> assistant
 """, # Prompt
     max_tokens=512, # Generate up to 512 tokens
     stop=["<|im_end|>"], # Example stop token
@@ -161,7 +162,7 @@ llm.create_chat_completion(
     messages = [
         {
             "role": "user",
-            "content": Hva kan jeg bruke einstape til?"
+            "content": "Hva kan jeg bruke einstape til?"
         }
     ]
 )
```
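For reference, the two fixes assemble into a runnable script along these lines. This is a sketch, not the README verbatim: the `repo_id` is a placeholder (the diff does not show it), and `n_gpu_layers=35` assumes GPU acceleration is available.

```python
from llama_cpp import Llama

# Load the quantized GGUF model from the Hugging Face Hub.
# NOTE: the repo_id below is a placeholder -- the diff does not show the
# actual repository used in this README.
llm = Llama.from_pretrained(
    repo_id="your-org/your-model-GGUF",  # placeholder, not from the diff
    filename="*Q4_K_M.gguf",  # suffix of the filename containing the level of quantization
    n_ctx=32768,              # max sequence length; longer lengths require much more resources
    n_threads=8,              # CPU threads, tailor to your system
    n_gpu_layers=35,          # layers to offload to GPU; set to 0 for CPU-only inference
)

# Simple inference example. The fix closes the user turn with <|im_end|> and
# opens an assistant turn, so the model answers instead of continuing the
# user's text.
output = llm(
    """<s><|im_start|> user
Hva kan jeg bruke einstape til?<|im_end|>
<|im_start|> assistant
""",                      # Norwegian: "What can I use bracken (einstape) for?"
    max_tokens=512,       # generate up to 512 tokens
    stop=["<|im_end|>"],  # stop when the assistant closes its turn
)
print(output["choices"][0]["text"])
```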
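The second hunk fixes a missing opening quote in the chat-style variant. With `create_chat_completion`, llama-cpp-python applies the model's chat template itself, so the `<s>`/`<|im_start|>` tokens above don't need to be written by hand:

```python
# Chat-style inference: the chat template (and its special tokens) is applied
# by llama-cpp-python, so only plain message dicts are needed.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "Hva kan jeg bruke einstape til?"
        }
    ]
)
print(response["choices"][0]["message"]["content"])
```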