ESMA-GPT / config.py
Afritz's picture
Update config.py
1c689e8
raw
history blame
4.09 kB
class CFG_APP:
DEBUG = True
K_TOTAL = 10
THRESHOLD = 0.3
DEVICE = "cpu"
BOT_NAME = "ESMA-GPT"
MODEL_NAME = "gpt-3.5-turbo"
DEFAULT_LANGUAGE = "English"
DATA_FOLDER = "data/"
EMBEDDING_MODEL = "sentence-transformers/multi-qa-mpnet-base-dot-v1"
MAX_TOKENS_REF_QUESTION = 128 # Number of tokens in reformulated question
MAX_TOKENS_ANSWER = 1024 # Number of tokens in answers
MAX_TOKENS_API = 3100
INIT_PROMPT = (
f"You are {BOT_NAME}, an expert in market regulations, an AI Assistant by Nexialog Consulting. "
"You are given a question and extracted parts of regulation reports."
"Provide a clear and structured answer based only on the context provided. "
"When relevant, use bullet points and lists to structure your answers."
)
SOURCES_PROMPT = (
"When relevant, use facts and numbers from the following documents in your answer. "
"Whenever you use information from a document, reference it at the end of the sentence by naming it Exc (ex: [exc 2])."
"Very important ! Never use the word Document or Doc for referencing an Excerpt, always exc. "
"You don't have to use all documents, only if it makes sense in the conversation. "
"If no relevant information to answer the question is present in the documents, "
"just say you don't have enough information to answer."
)
DEFAULT_QUESTIONS = (
"What is the definition of PD ?",
"What is the definition of LGD ?",
"What is the definition of EAD ?",
"What is the definition of EL ?",
"What is the definition of ELBE ?",
"What does the credit conversion factor correspond to?",
"Is it mandatory to implement a credit conversion factor?",
"Quand doit on appliquer la période probatoire ?",
"Comment doivent étre traité les mutliples defaut ?",
"What is the significance of Directive 2009/65/EC, also known as the UCITS Directive?",
"How do these directives and regulations interact and influence each other as per the passage?",
)
REFORMULATION_PROMPT = """
Important ! Give the output as a standalone question followed by the detected language whatever the form of the query.
Reformulate the following user message to be a short standalone question in English, in the context of an educational discussion about regulations in banks. Then detect the language of the query
Sometimes, explanations of some abbreviations will be given in parentheses, keep them.
---
query: C'est quoi les régles que les banques américaines doivent suivre ?
standalone question: What are the key regulations that banks in the United States must follow?
language: French
---
query: what are the main effects of bank regulations?
standalone question: What are the main causes of bank regulations change in the last century?
language: English
---
query: UL (Unexpected Loss)
standalone question: What does UL (Unexpected Loss) stand for?
language: English
"""
HYDE_PROMPT ="""
Important ! Give the output as an answer to the query. First you will translate the query in English and then answer it, in English, in 2 sentences maximum using the right vocabulary of the context of the query.
Very important : the answer must be followed by the detected language of the query whatever the form of the query. You must keep the question at the begining of the answer.
Here is an example of the template you must follow to create your answer :
---
query : C'est quoi les régles que les banques américaines doivent suivre ?
output : What are the rules that American banks must follow ? American banks must follow a set of federal and state regulations imposed by agencies such as the Federal Reserve and the Consumer Financial Protection Bureau..
language : French
"""
DOC_METADATA_PATH = f"{DATA_FOLDER}/doc_metadata.json"