kaikaidai committed
Commit 68c49ca · verified · Parent: 40a124e

Update common.py

Files changed (1): common.py (+18 −18)
common.py CHANGED
@@ -47,28 +47,28 @@ EVAL_DESCRIPTION = """
 - Examples (Optional)
 """
 
-DEFAULT_EVAL_PROMPT = """You are assessing a chat bot response to a user's input. Your evaluation should focus on the helpfulness of the response given the user's instructions. Do not allow the length of the response to influence your evaluation. Be objective as possible and give a brief explanation for your score.
+DEFAULT_EVAL_PROMPT = """Does the model provide relevant and useful responses to the user's needs or questions?
 
 Scoring Rubric:
-Score 1: The response is unhelpful, providing irrelevant or incorrect content that does not address the request.
-Score 2: The response is partially helpful, missing key elements or including minor inaccuracies, and lacks depth in addressing the request.
-Score 3: The response is adequately helpful, correctly addressing the main request with relevant information and some depth.
-Score 4: The response is very helpful, addressing the request thoroughly with accurate and detailed content, but may lack a minor aspect of helpfulness.
-Score 5: The response is exceptionally helpful, providing precise, comprehensive content that fully resolves the request with insight and clarity.
+Score 1: The model's responses are irrelevant or unhelpful to the user's needs or queries.
+Score 2: The model sometimes provides helpful information, but often fails to address the user's actual needs or questions.
+Score 3: The model generally provides helpful responses that address the user's needs, though it may occasionally miss the mark.
+Score 4: The model regularly provides helpful responses that are well-aligned with the user's inquiries, with only rare inaccuracies.
+Score 5: The model consistently offers highly relevant and useful responses that perfectly cater to the user's needs and inquiries.
 
 [User Query]: {{input}}
 
 [AI Response]: {{response}}"""
 
 # Split the eval prompt into editable and fixed parts
-DEFAULT_EVAL_PROMPT_EDITABLE = """You are assessing a chat bot response to a user's input. Your evaluation should focus on the helpfulness of the response given the user's instructions. Do not allow the length of the response to influence your evaluation. Be objective as possible and give a brief explanation for your score.
+DEFAULT_EVAL_PROMPT_EDITABLE = """Does the model provide relevant and useful responses to the user's needs or questions?
 
 Scoring Rubric:
-Score 1: The response is unhelpful, providing irrelevant or incorrect content that does not address the request.
-Score 2: The response is partially helpful, missing key elements or including minor inaccuracies, and lacks depth in addressing the request.
-Score 3: The response is adequately helpful, correctly addressing the main request with relevant information and some depth.
-Score 4: The response is very helpful, addressing the request thoroughly with accurate and detailed content, but may lack a minor aspect of helpfulness.
-Score 5: The response is exceptionally helpful, providing precise, comprehensive content that fully resolves the request with insight and clarity."""
+Score 1: The model's responses are irrelevant or unhelpful to the user's needs or queries.
+Score 2: The model sometimes provides helpful information, but often fails to address the user's actual needs or questions.
+Score 3: The model generally provides helpful responses that address the user's needs, though it may occasionally miss the mark.
+Score 4: The model regularly provides helpful responses that are well-aligned with the user's inquiries, with only rare inaccuracies.
+Score 5: The model consistently offers highly relevant and useful responses that perfectly cater to the user's needs and inquiries."""
 
 # Fixed suffix that will always be appended
 FIXED_EVAL_SUFFIX = """
@@ -164,17 +164,17 @@ We’d love to hear your feedback! For general feature requests or to submit / s
 
 
 # Default values for compatible mode
-DEFAULT_EVAL_CRITERIA = """Evaluate the helpfulness of the chatbot response given the user's instructions. Focus on relevance, accuracy, and completeness while being objective. Do not consider response length in your evaluation."""
+DEFAULT_EVAL_CRITERIA = """Does the model provide relevant and useful responses to the user's needs or questions?"""
 
-DEFAULT_SCORE_1 = "The response is unhelpful, providing irrelevant or incorrect content that does not address the request."
+DEFAULT_SCORE_1 = "The model's responses are irrelevant or unhelpful to the user's needs or queries."
 
-DEFAULT_SCORE_2 = "The response is partially helpful, missing key elements or including minor inaccuracies, and lacks depth in addressing the request."
+DEFAULT_SCORE_2 = "The model sometimes provides helpful information, but often fails to address the user's actual needs or questions."
 
-DEFAULT_SCORE_3 = "The response is adequately helpful, correctly addressing the main request with relevant information and some depth."
+DEFAULT_SCORE_3 = "The model generally provides helpful responses that address the user's needs, though it may occasionally miss the mark."
 
-DEFAULT_SCORE_4 = "The response is very helpful, addressing the request thoroughly with accurate and detailed content, but may lack a minor aspect of helpfulness."
+DEFAULT_SCORE_4 = "The model regularly provides helpful responses that are well-aligned with the user's inquiries, with only rare inaccuracies."
 
-DEFAULT_SCORE_5 = "The response is exceptionally helpful, providing precise, comprehensive content that fully resolves the request with insight and clarity."
+DEFAULT_SCORE_5 = "The model consistently offers highly relevant and useful responses that perfectly cater to the user's needs and inquiries."
 
 #**What are the Evaluator Prompt Templates based on?**
 
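The new default template is split so that only the question and rubric are user-editable while the query/response scaffold stays fixed. Below is a minimal sketch (not part of the commit) of how the pieces might be reassembled and rendered. DEFAULT_EVAL_PROMPT_EDITABLE and the {{input}}/{{response}} placeholders come from the diff; the body of FIXED_EVAL_SUFFIX is cut off in the hunk, so the suffix shown here is a hypothetical stand-in inferred from the end of DEFAULT_EVAL_PROMPT, and render_eval_prompt is an illustrative helper, not a function in common.py.

```python
# Sketch only: DEFAULT_EVAL_PROMPT_EDITABLE and the {{input}}/{{response}}
# placeholders come from the diff; the FIXED_EVAL_SUFFIX body is assumed.

DEFAULT_EVAL_PROMPT_EDITABLE = """Does the model provide relevant and useful responses to the user's needs or questions?

Scoring Rubric:
Score 1: The model's responses are irrelevant or unhelpful to the user's needs or queries.
Score 2: The model sometimes provides helpful information, but often fails to address the user's actual needs or questions.
Score 3: The model generally provides helpful responses that address the user's needs, though it may occasionally miss the mark.
Score 4: The model regularly provides helpful responses that are well-aligned with the user's inquiries, with only rare inaccuracies.
Score 5: The model consistently offers highly relevant and useful responses that perfectly cater to the user's needs and inquiries."""

# Hypothetical: the real suffix is truncated in the hunk, but the full
# DEFAULT_EVAL_PROMPT ends with this query/response scaffold.
FIXED_EVAL_SUFFIX = """

[User Query]: {{input}}

[AI Response]: {{response}}"""


def render_eval_prompt(editable: str, user_input: str, response: str) -> str:
    """Append the fixed suffix, then fill in the template placeholders."""
    prompt = editable + FIXED_EVAL_SUFFIX
    return prompt.replace("{{input}}", user_input).replace("{{response}}", response)


if __name__ == "__main__":
    print(render_eval_prompt(
        DEFAULT_EVAL_PROMPT_EDITABLE,
        "How do I reverse a list in Python?",
        "Use my_list[::-1] to get a reversed copy, or my_list.reverse() in place.",
    ))
```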
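Compatible mode stores the same rubric as separate fields: DEFAULT_EVAL_CRITERIA plus DEFAULT_SCORE_1 through DEFAULT_SCORE_5. Here is a short sketch, again an assumption rather than code from the file, of how those fields could be recomposed into the editable prompt body:

```python
# Sketch only: build_editable_prompt is a hypothetical helper, not part of
# common.py; the field values are the new defaults from the diff.

DEFAULT_EVAL_CRITERIA = "Does the model provide relevant and useful responses to the user's needs or questions?"

DEFAULT_SCORES = [
    "The model's responses are irrelevant or unhelpful to the user's needs or queries.",
    "The model sometimes provides helpful information, but often fails to address the user's actual needs or questions.",
    "The model generally provides helpful responses that address the user's needs, though it may occasionally miss the mark.",
    "The model regularly provides helpful responses that are well-aligned with the user's inquiries, with only rare inaccuracies.",
    "The model consistently offers highly relevant and useful responses that perfectly cater to the user's needs and inquiries.",
]


def build_editable_prompt(criteria: str, scores: list[str]) -> str:
    """Recompose the editable prompt from the compatible-mode fields."""
    rubric = "\n".join(f"Score {i}: {text}" for i, text in enumerate(scores, start=1))
    return f"{criteria}\n\nScoring Rubric:\n{rubric}"


# build_editable_prompt(DEFAULT_EVAL_CRITERIA, DEFAULT_SCORES) reproduces the
# new DEFAULT_EVAL_PROMPT_EDITABLE string above.
```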