Spaces:
Sleeping
Sleeping
Update app.py
Browse files
app.py
CHANGED
@@ -19,8 +19,36 @@ system_prompt = '''You are given a partial input text for a chat interface. Prop
|
|
19 |
Don't suggest anything if there are no good suggestions.
|
20 |
Make sure the suggestions are valid completions of the text! No need for them to complete the text completely.
|
21 |
Suggest only up to 5 words ahead. The scheme of your answer should be "answer1;answer2;answer3" (return between 0 and 4 answers).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
22 |
'''
|
23 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
24 |
@spaces.GPU
|
25 |
def generate(text):
|
26 |
messages = [
|
@@ -32,5 +60,6 @@ def generate(text):
|
|
32 |
|
33 |
|
34 |
if __name__ == "__main__":
|
|
|
35 |
demo = gr.Interface(generate, inputs="textbox", outputs="textbox")
|
36 |
demo.launch()
|
|
|
19 |
Don't suggest anything if there are no good suggestions.
|
20 |
Make sure the suggestions are valid completions of the text! No need for them to complete the text completely.
|
21 |
Suggest only up to 5 words ahead. The scheme of your answer should be "answer1;answer2;answer3" (return between 0 and 4 answers).
|
22 |
+
Answers should be only the completions themselves.
|
23 |
+
|
24 |
+
Examples:
|
25 |
+
(1)
|
26 |
+
User: "Help me write a sentiment analysis pipeline"
|
27 |
+
Assistant: "using huggingface;using NLTK;using python"
|
28 |
+
|
29 |
+
(2)
|
30 |
+
User: "My name is"
|
31 |
+
Assistant: "" (nothing much to contribute at this point. return nothing)
|
32 |
+
|
33 |
+
(3)
|
34 |
+
User: "Help me find a present for my"
|
35 |
+
Assistant: "girlfriend;mother;father;friend"
|
36 |
'''
|
37 |
|
38 |
+
|
39 |
+
@spaces.GPU
|
40 |
+
def get_past_key_values(system_prompt):
|
41 |
+
messages = [{'role': 'system', 'content': system}]
|
42 |
+
tokenized = tokenizer.apply_chat_template(messages, return_tensors='pt')
|
43 |
+
|
44 |
+
# assert that this is indeed a prefix (TODO: make sure this is robust)
|
45 |
+
messages.append({'role': 'user', 'content': 'TEST'})
|
46 |
+
tokenized_test = tokenizer.apply_chat_template(messages, return_tensors='pt')
|
47 |
+
assert (tokenized_test[:, :tokenized.shape[1]] == tokenized).all().cpu().item()
|
48 |
+
|
49 |
+
return model(**tokenized.to(model.device)).past_key_values
|
50 |
+
|
51 |
+
|
52 |
@spaces.GPU
|
53 |
def generate(text):
|
54 |
messages = [
|
|
|
60 |
|
61 |
|
62 |
if __name__ == "__main__":
|
63 |
+
past_key_values = get_past_key_values(system_prompt)
|
64 |
demo = gr.Interface(generate, inputs="textbox", outputs="textbox")
|
65 |
demo.launch()
|