<a href="https://colab.research.google.com/github/vanderbilt-data-science/lo-achievement/blob/main/prompt_with_vector_store.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LLMs for Self-Study
> A prompt and code template for better understanding texts

This notebook is a guide to using LLMs for self-study programmatically. It provides a number of prompt templates for generating self-study assessments, along with code to run them quickly. It works best with a set of documents (text or PDF preferred) **that you upload** for interaction with the model.

This version of the notebook is best suited for those who prefer to use files from their local drive as context, rather than copying and pasting text directly into the notebook. If you prefer to copy and paste text, see the [prompt_with_context](https://colab.research.google.com/github/vanderbilt-data-science/lo-achievement/blob/main/prompt_with_context.ipynb) notebook.

In [None]:
# run this code if you're using Google Colab or don't have these packages installed in your computing environment
! pip install git+https://<token>@github.com/vanderbilt-data-science/lo-achievement.git

In [None]:
#libraries for user setup code
from getpass import getpass
from logging import raiseExceptions

#self import code
from ai_classroom_suite.PromptInteractionBase import *
from ai_classroom_suite.IOHelperUtilities import *
from ai_classroom_suite.SelfStudyPrompts import *
from ai_classroom_suite.MediaVectorStores import *

# User Settings
In this section, you'll set your OpenAI API Key (for use with the OpenAI model), configure your environment/files for upload, and upload those files.

In [None]:
# Run this cell and enter your OpenAI API key when prompted
set_openai_key()

In [None]:
# Create model
mdl_name = 'gpt-3.5-turbo-16k'
chat_llm = create_model(mdl_name)

## Define Your Documents Source
You may upload your files directly from your computer, or you may choose to do so via your Google Drive. Below, you will find instructions for both methods.

For either method, begin by setting the `upload_setting` variable to:
* `'Local Drive'` - if you have files that are on your own computer (locally), or
* `'Google Drive'` - if you have files that are stored on Google Drive

e.g.,
`upload_setting='Google Drive'`.
Don't forget the quotes around your selection!
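
A misspelled or differently capitalized value is easy to miss. As a minimal sketch, the setting could be checked before running the upload cell. The helper `check_upload_setting` is hypothetical and written only for illustration; it is not part of `ai_classroom_suite`.

```python
# Hypothetical helper for illustration; not part of ai_classroom_suite.
VALID_UPLOAD_SETTINGS = ("Local Drive", "Google Drive")

def check_upload_setting(upload_setting):
    """Return the setting unchanged if it is one of the two documented options."""
    if upload_setting not in VALID_UPLOAD_SETTINGS:
        raise ValueError(
            f"upload_setting must be one of {VALID_UPLOAD_SETTINGS}, got {upload_setting!r}"
        )
    return upload_setting

# Passes: the string matches one of the documented options exactly.
upload_setting = check_upload_setting("Google Drive")
```

The comparison is case-sensitive, so a value such as `'google drive'` would raise a `ValueError` instead of silently falling through.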

In [None]:
## Settings for upload: via local drive or Google Drive
### Please input either "Google Drive" or "Local Drive" into the empty string

#upload_setting = 'Google Drive'
upload_setting = 'Local Drive'

<p style='color:green'><strong>Before Continuing</strong> - Make sure you have input your choice of upload into the `upload_setting` variable above (Options: "Local Drive" or "Google Drive") as described in the above instructions.</p>

## Upload your Files
Now, you'll upload your files. When you run the code cell below, you can follow the instructions for local or Google Drive upload described here. If you would like to use our example document (Robert Frost's "The Road Not Taken"), you can download the file from [this link](https://drive.google.com/drive/folders/1wpEoGACUqyNRYa4zBZeNkqcLJrGQbA53?usp=sharing) and upload it following the same instructions.

**If you selected "Local Drive":**
> Start by selecting your local files. Run the code cell below. Once the icon appears, click "Choose File". This opens your computer's local drive. Select the file you would like to upload as context. The files will appear in the right sidebar. Then follow the rest of the steps in "Uploading Your files (Local Drive and Google Drive)" below.

**If you selected "Google Drive":**
> Start by allowing access to your Google Drive. Run the code cell below. You will be redirected to a window where you allow access by logging into your Google Account. Your Drive will appear as a folder in the left side panel. Navigate through your Google Drive until you've found the file that you'd like to upload.

Your files are now accessible to the code.

In [None]:
# Run this cell, then follow the instructions to upload your file
selected_files = setup_drives(upload_setting)

In [None]:
selected_files

['/workspaces/lo-achievement/roadnottaken.txt']

# Resource and Personal Tutor Creation
Congratulations! You've nearly finished with the setup! From here, you can now run this section of cells using the arrow to the left to set up your vector store and create your model.

## Create a vector store with your document

With the file path, you can now create a vector store using the document that you uploaded. We expose this creation in case you want to modify the kind of vector store that you're creating. Run the cell below to create the default provided vector store.
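
To give a feel for what a vector store lookup does, here is a toy sketch in plain Python. It is not how `create_local_vector_store` works internally: real vector stores use learned embeddings, whereas this sketch assumes a simple bag-of-words count vector and cosine similarity. The function names `embed`, `cosine`, and `retrieve` are hypothetical. The `k` argument plays the same role as `search_kwargs={"k": 1}` in the store creation cell: it sets how many segments are returned.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (real stores use learned embeddings)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, segments, k=1):
    """Return the k segments most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(segments, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]

# Three lines from the example poem stand in for document segments.
segments = [
    "Two roads diverged in a yellow wood,",
    "And sorry I could not travel both",
    "I took the one less traveled by,",
]
top = retrieve("Which road was less traveled?", segments, k=1)
print(top)
```

With `k=1`, only the single best-matching segment is passed to the model as context; a larger `k` supplies more text at the cost of a longer prompt.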

In [None]:
# Create vector store
doc_segments = get_document_segments(selected_files, data_type = 'files')
chroma_db, vs_retriever = create_local_vector_store(doc_segments, search_kwargs={"k": 1})

## Create the model which will do the vector store lookup and tutoring

In [None]:
# Create retrieval chain
qa_chain = create_tutor_mdl_chain(kind="retrieval_qa", retriever = vs_retriever)

# A guide to prompting for self-study
In this section, we provide a number of different approaches for using AI to help you assess and explain your understanding of your document. Start by interacting with the model and then try out the rest of the prompts!

## Brief overview of tutoring code options

Now that your vector store is created, you can begin interacting with the model! You interact with the model and its vector store through the `get_tutoring_answer` function; its parameters are described below.

Consider the multiple choice code snippet:
```{python}
tutor_q = get_tutoring_answer('',        # context: left empty, filled in by the retriever
                              qa_chain,  # the retrieval chain created above
                              assessment_request = SELF_STUDY_DEFAULTS['mc'],
                              learning_objectives = learning_objs,
                              input_kwargs = {'question':topic})
```

This is how we're able to interact with the model for tutoring when using vector stores. The parameters are as follows:

* `context` will be an empty string, or you can set it to `None`. This field is populated automatically by the vector store retriever.
* `qa_chain` is the model that you're using - we created this model chain a few cells above. 
* `assessment_request` is your way of telling the model what kind of assessment you want. In the example above, we use some defaults provided for multiple choice. You can also insert your own text here. To learn more about these defaults, see the `prompt_with_context.ipynb` in the CLAS repo.
* `learning_objectives` are the learning objectives that you want to assess in a single paragraph string. You can set this to '' if you don't want to define any learning objectives. If you don't provide one, the model will use the default learning objectives.
* `input_kwargs` are additional inputs that we can define in the prompts. Above, you see that the keyword `question` is defined. `question` is the text used to retrieve relevant texts from the vector store. Above, we define a custom topic. If you were to omit this parameter, the model would use `assessment_request` as the text to retrieve relevant documents from the vector store. See the examples below for both scenarios.
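
The fallback described for `input_kwargs` can be sketched as follows. This illustrates the documented behavior only; `resolve_retrieval_query` is a hypothetical name, and the actual logic inside `get_tutoring_answer` may differ.

```python
# Hypothetical illustration of the documented fallback; not the package's code.
def resolve_retrieval_query(assessment_request, input_kwargs=None):
    """Pick the text used for the vector store lookup.

    Uses input_kwargs['question'] when it is provided and non-empty,
    otherwise falls back to the assessment request itself.
    """
    if input_kwargs and input_kwargs.get("question"):
        return input_kwargs["question"]
    return assessment_request

# A custom topic is supplied, so it drives retrieval.
q1 = resolve_retrieval_query("Design a 5 question quiz.", {"question": "The Road Not Taken"})

# No input_kwargs, so the assessment request is used for retrieval.
q2 = resolve_retrieval_query("Design a 5 question quiz.")
```

Supplying an explicit `question` tends to help when the assessment request is generic and shares few words with the document.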



## Sample topics and learning objectives

Below, we define a topic (used to retrieve documents from the vector store if provided) and learning objectives which will be used in the following examples. You can change these as needed for your purpose.

In [None]:
# Code topic
topic = 'The full text of the poem "The Road Not Taken" by Robert Frost'

# set learning objectives if desired
learning_objs = ("""1. Identify the key elements of the work: important takeaways and underlying message.
                 2. Understand the literary devices used in prompting and in literature and their purpose.""")
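
If you keep your objectives in a list, a small helper can join them into the single string that `learning_objectives` expects. This is a sketch under that assumption; `number_objectives` is a hypothetical helper, not part of `ai_classroom_suite`.

```python
# The same two objectives as above, kept as a list for easier editing.
objectives = [
    "Identify the key elements of the work: important takeaways and underlying message.",
    "Understand the literary devices used in prompting and in literature and their purpose.",
]

def number_objectives(items):
    """Join objectives into one numbered string, one objective per line."""
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))

learning_objs = number_objectives(objectives)
print(learning_objs)
```

Building the string this way also avoids the leading spaces that a multi-line triple-quoted string carries into the prompt.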

## Types of Questions and Prompts

Below is a comprehensive list of question types and prompt templates designed by our team. There are also example code blocks, where you can see how the model performed with the example and try it for yourself using the prompt template.

### Multiple Choice

Prompt: The following text should be used as the basis for the instructions which follow: {context}. Please design a 5 question quiz about {name or reference to context} which reflects the learning objectives: {list of learning objectives}. The questions should be multiple choice. If I get an answer wrong, provide me with an explanation of why it was incorrect, and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.
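
To make the placeholders concrete, the sketch below fills a shortened version of this template with Python's `str.format`. The field names `reference` and `learning_objectives` and the constant `MC_TEMPLATE` are assumptions made for illustration. In the notebook itself, `SELF_STUDY_DEFAULTS['mc']` and the retrieval chain handle this step, and their exact wording may differ.

```python
# Shortened, illustrative version of the multiple choice prompt template.
MC_TEMPLATE = (
    "The following text should be used as the basis for the instructions which follow: {context}. "
    "Please design a 5 question quiz about {reference} which reflects the learning objectives: "
    "{learning_objectives}. The questions should be multiple choice."
)

prompt = MC_TEMPLATE.format(
    context="<retrieved text goes here>",  # normally supplied by the retriever
    reference='"The Road Not Taken"',
    learning_objectives="1. Identify the key elements of the work.",
)
print(prompt)
```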

In [None]:
# Multiple choice code example
tutor_q = get_tutoring_answer('', qa_chain, assessment_request = SELF_STUDY_DEFAULTS['mc'],
                              learning_objectives = learning_objs, input_kwargs = {'question':topic})

print(tutor_q)

Question 1: What is the underlying message of the excerpt?

A) The speaker regrets not being able to travel both roads.
B) The speaker believes that taking the less traveled road has made a significant impact on their life.
C) The speaker is unsure about which road to choose.
D) The speaker is fascinated by the beauty of the yellow wood.

Please select one of the options (A, B, C, or D) and provide your answer.


### Short Answer

Prompt: Please design a 5-question quiz about {context} which reflects the learning objectives: {list of learning objectives}. The questions should be short answer. Expect the correct answers to be {anticipated length} long. If I get any part of the answer wrong, provide me with an explanation of why it was incorrect, and then give me additional chances to respond until I get the correct choice.

In [None]:
# Short answer code example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['short_answer'],
                              learning_objectives = learning_objs, input_kwargs = {'question':topic})

print(tutor_q)

Question 1: What is the underlying message of the poem?

Remember to provide your answer in a few sentences.


### Fill-in-the-blank

Prompt: Create a 5 question fill in the blank quiz referencing {context}. The quiz should reflect the learning objectives: {learning objectives}. Please prompt me one question at a time and proceed when I answer correctly. If I answer incorrectly, please explain why my answer is incorrect.

:::{.callout-info}
In the example below, we omit the `input_kwargs` parameter. This means we'll use the text from `assessment_request` as the question topic.
:::

In [None]:
# Fill in the blank code example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['fill_blank'],
                              learning_objectives = learning_objs)

print(tutor_q)

Question: The speaker in the poem "The Road Not Taken" is faced with a choice between _______ roads.

Please provide your answer.


### Sequencing

Prompt: Please develop a 5 question questionnaire that will ask me to recall the steps involved in the following learning objectives in regard to {context}: {learning objectives}. After I respond, explain their sequence to me.

In [None]:
# Sequence example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['sequencing'],
                              learning_objectives = learning_objs)

print(tutor_q)

Question 1: What is the underlying message or theme of the provided text?

(Note: Please provide your response and I will evaluate it.)


### Relationships/drawing connections

Prompt: Please design a 5 question quiz that asks me to explain the relationships that exist within the following learning objectives, referencing {context}: {learning objectives}.

In [None]:
# Relationships example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['relationships'],
                              learning_objectives = learning_objs)

print(tutor_q)

Question 1: What is the underlying message or theme of the text "The Road Not Taken"?

(Note: The answer to this question will require the student to identify the key elements and important takeaways from the text in order to determine the underlying message or theme.)


### Concepts and Definitions

Prompt: Design a 5 question quiz that asks me about definitions related to the following learning objectives: {learning objectives} - based on {context}. Once I write out my response, provide me with your own response, highlighting why my answer is correct or incorrect.

In [None]:
# Concepts and definitions example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['concepts'],
                              learning_objectives = learning_objs)

print(tutor_q)

Question 1: Based on the provided text, what is the underlying message or theme of the work?

Please provide your response.


### Real World Examples

Prompt: Demonstrate how {context} can be applied to solve a real-world problem related to the following learning objectives: {learning objectives}. Ask me questions regarding this theory/concept.

In [None]:
# Real world example
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['real_world_example'],
                              learning_objectives = learning_objs)

print(tutor_q)

Based on the provided context, it seems that the extracted text is a poem by Robert Frost and does not directly provide any information or context related to problem-solving in the real world. Therefore, it may not be possible to demonstrate how the provided context can be applied to solve a real-world problem. However, I can still assess your understanding of the learning objectives mentioned. Let's start with the first learning objective: identifying the key elements of the work, important takeaways, and underlying message. 

Question 1: Based on your reading of the poem, what are some key elements or important takeaways that you can identify?


### Randomized Question Types

Prompt: Please generate a high-quality assessment consisting of 5 varying questions, each of different types (open-ended, multiple choice, etc.), to determine if I achieved the following learning objectives in regards to {context}: {learning objectives}. If I answer incorrectly for any of the questions, please explain why my answer is incorrect.

In [None]:
# Randomized question types
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = SELF_STUDY_DEFAULTS['randomized_questions'],
                              learning_objectives = learning_objs)

print(tutor_q)

Question 1 (Open-ended):
Based on the given excerpt, what do you think is the underlying message or theme of the text? Please provide a brief explanation to support your answer.

(Note: The answer to this question will vary depending on the student's interpretation of the text. As the tutor, you can provide feedback on the strengths and weaknesses of their response, and guide them towards a deeper understanding of the text's message.)


### Quantitative evaluation of the correctness of a student's answer

Prompt: (A continuation of the previous chat) Please generate the main points of the student’s answer to the previous question, and evaluate on a scale of 1 to 5 how comprehensive the student’s answer was in relation to the learning objectives, and explain why he or she received this rating, including what was missed in his or her answer if the student’s answer wasn’t complete.
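
Because the rating comes back inside free text, it can be convenient to pull the number out for tracking. The sketch below assumes the model phrases the score as "N out of 5" or "N/5", which is not guaranteed. `extract_rating` is a hypothetical helper and not part of `ai_classroom_suite`.

```python
import re

def extract_rating(response):
    """Return the first 'N out of 5' or 'N/5' rating found in the text, or None."""
    match = re.search(r"\b([1-5])\s*(?:out of|/)\s*5\b", response)
    return int(match.group(1)) if match else None

# Example sentence in the style of a tutor response.
sample = "I would rate the student's answer a 2 out of 5 in terms of comprehensiveness."
rating = extract_rating(sample)
print(rating)
```

Returning `None` when no rating is found lets the caller decide whether to re-prompt the model or skip the entry.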


In [None]:
# Evaluation of the student's answer (1 to 5 rating with explanation)
qualitative_query = """ Please generate the main points of the student’s answer to the previous question,
 and evaluate on a scale of 1 to 5 how comprehensive the student’s answer was in relation to the learning objectives,
 and explain why he or she received this rating, including what was missed in his or her answer if the student’s answer wasn’t complete."""

last_answer = ("TUTOR QUESTION: Question 1 (Open-ended): " +
               "Based on the given excerpt, what do you think is the underlying message or theme of the text? Please provide a " + 
               "brief explanation to support your answer.\n" + 
               "STUDENT ANSWER: The underlying message of the text is that people should follow the crowd and the road less traveled is hard "+
               "and painful to traverse. Take the easy way instead. ")

# Note that this uses the previous result and query in the context
tutor_q = get_tutoring_answer(None, qa_chain, assessment_request = qualitative_query + '\n' + last_answer,
                              learning_objectives = learning_objs,
                              input_kwargs = {'question':topic})

print(tutor_q)

Main points of the student's answer:
- The underlying message of the text is that people should follow the crowd and take the easy way instead of the road less traveled.
- The road less traveled is hard and painful to traverse.

Evaluation of the student's answer:
I would rate the student's answer a 2 out of 5 in terms of comprehensiveness in relation to the learning objectives. 

Explanation:
The student correctly identifies that the underlying message of the text is related to choosing between two paths, but their interpretation of the message is not entirely accurate. The student suggests that the text encourages people to follow the crowd and take the easy way, which is not supported by the actual message of the poem. The poem actually suggests that taking the road less traveled can make a significant difference in one's life. The student also mentions that the road less traveled is hard and painful to traverse, which is not explicitly stated in the text. This interpretation may be