CreitinGameplays/Llama-3.1-8b-reasoning-test-Q4_K_M-GGUF

CreitinGameplays committed 16 days ago

Commit 8883031 (verified) · 1 parent: 6833b1b

Update README.md

Files changed (1)

README.md +2 -2

@@ -75,7 +75,7 @@ llm = Llama.from_pretrained(
 chat_history = [
     {"role": "system", "content": """
 You are a helpful assistant named Llama, made by Meta AI.
-Always use your <|reasoning|> <|end_reasoning|> tokens, without any text formatting, plain text only.
+Always use your <|reasoning|> and <|end_reasoning|> tokens, without any text formatting, plain text only.
     """}
 ]
@@ -93,7 +93,7 @@ while True:
     # Call the chat completion API in streaming mode with the updated conversation.
     output_stream = llm.create_chat_completion(
         messages=chat_history,
-        temperature=0.7,
+        temperature=0.4,
         top_p=0.95,
         max_tokens=4096,
         stream=True
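
The system prompt in this change asks the model to wrap its reasoning in `<|reasoning|>` and `<|end_reasoning|>`, and the completion is requested with `stream=True`. A minimal sketch of how the streamed chunks might be joined and the reasoning separated from the final answer follows. The helper names `collect_stream` and `split_reasoning` are hypothetical and not part of the README. The sketch assumes the chunks follow the OpenAI-style `choices[0]["delta"]` layout and that the two markers show up as literal text in the decoded output, which may depend on how the tokenizer handles special tokens.

```python
import re

# Assumption: the model emits these markers as literal text around its
# reasoning, as the system prompt in the diff requests.
REASONING_RE = re.compile(r"<\|reasoning\|>(.*?)<\|end_reasoning\|>", re.DOTALL)


def collect_stream(chunks):
    """Join the text pieces of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        # Some chunks carry only a role or no content at all; treat those as empty.
        parts.append(delta.get("content") or "")
    return "".join(parts)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is None if no tagged block is found."""
    match = REASONING_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first tagged block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

With the `output_stream` from the README, this could be used as `reasoning, answer = split_reasoning(collect_stream(output_stream))`. Note that this consumes the whole stream before returning, so it trades incremental printing for a simpler split.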