expandme committed on
Commit 54081a3
1 Parent(s): 7537b38

Fixing token repetition? What will wind.surf do?

Files changed (2)
  1. app.py +8 -5
  2. models.lst +1 -1
app.py CHANGED
@@ -164,11 +164,14 @@ with gr.Blocks() as demo:
 
     def submit_message(message, chat_history, model_name, system_message, max_tokens, temperature, top_p):
         history = [] if chat_history is None else chat_history
-        for response in respond(message, history, model_name, system_message, max_tokens, temperature, top_p):
-            history = history + [
-                {"role": "user", "content": message},
-                {"role": "assistant", "content": response}
-            ]
+
+        # Add user message first
+        history = history + [{"role": "user", "content": message}]
+
+        # Then stream the assistant's response
+        for response in respond(message, history[:-1], model_name, system_message, max_tokens, temperature, top_p):
+            history[-1] = {"role": "user", "content": message}
+            history = history + [{"role": "assistant", "content": response}]
             yield history, ""
 
     submit_event = submit.click(
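
Note on the new loop: each streamed chunk still appends a fresh assistant entry and rewrites the previous one back into the user message, so the history can keep growing while a single reply streams. A minimal alternative sketch, assuming respond() yields the cumulative assistant text with the same signature used above (this is not the committed code): append the user turn and an empty assistant turn once, then overwrite the assistant entry in place on each chunk.

# Sketch only (not the committed code): stream by updating the last entry
# in place, assuming respond() yields the cumulative assistant text.
def submit_message(message, chat_history, model_name, system_message, max_tokens, temperature, top_p):
    history = [] if chat_history is None else chat_history
    # Append the user turn once, plus an empty assistant turn to fill in.
    history = history + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": ""},
    ]
    for response in respond(message, history[:-2], model_name, system_message, max_tokens, temperature, top_p):
        # Overwrite the assistant entry instead of appending a new one per chunk.
        history[-1] = {"role": "assistant", "content": response}
        yield history, ""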
models.lst CHANGED
@@ -12,6 +12,6 @@ https://huggingface.co/lmstudio-community/Qwen2.5-1.5B-Instruct-GGUF
 
 https://huggingface.co/lmstudio-community/granite-3.0-1b-a400m-instruct-GGUF
 
-https://huggingface.co/lmstudio-community/AMD-OLMo-1B-SFT-GGUF
+https://huggingface.co/lmstudio-community/AMD-OLMo-1B-SFT-DPO-GGUF
 
 