Saif Rehman Nasir
committed on
Commit
·
9aaf62d
1
Parent(s):
b29383b
Reduce output token size to keep it under the rate limit
Browse files
rag.py
CHANGED
@@ -23,7 +23,7 @@ vector_index = os.getenv("VECTOR_INDEX")
|
|
23 |
chat_llm = HuggingFaceEndpoint(
|
24 |
repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
|
25 |
task="text-generation",
|
26 |
-
max_new_tokens=
|
27 |
do_sample=False,
|
28 |
)
|
29 |
|
|
|
23 |
chat_llm = HuggingFaceEndpoint(
|
24 |
repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
|
25 |
task="text-generation",
|
26 |
+
max_new_tokens=6000,
|
27 |
do_sample=False,
|
28 |
)
|
29 |
|