Jingxiang Mo committed on
Commit
07fbd40
1 Parent(s): 9f9e047

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -5,10 +5,10 @@ https://rajpurkar.github.io/SQuAD-explorer/
 
  We will use the Stanford Question Answering Dataset (SQuAD) for our machine learning project because it is a large-scale, diverse dataset containing over 100,000 questions and answers. It has been widely used and evaluated by the research community and is well-suited for training and evaluating models for question answering and machine reading comprehension tasks.
 
- Project Goal
+ ### Project Goal
  Question Answering Model: Building a supervised learning logistic regression model that can answer questions based on the information contained within SQuAD. The model could be trained on the questions and answers in the dataset, and then be used to answer new questions.
 
- Methodology
+ ### Methodology
  Data Preprocessing
  The data will undergo several preprocessing steps to ensure that it is suitable for the question-answering model. These steps include data cleaning, data transformation, and data encoding.
 
@@ -24,6 +24,6 @@ We have also considered other classification models, including KNN, Naive Bayes,
  Evaluation Metric
  When it comes to the Evaluation Metric we intend to use, since we’re using a classification model, it only makes sense to use a Confusion Matrix. However, we still have to learn how the BLEU score with brevity penalty could help us as it deals with text generation problems which is also what we work on.
 
- Application
+ ### Application
  We hope to build a web application and provide a user-friendly interface that allows users to input their questions either through voice or text. This will allow for greater accessibility and convenience for users with different preferences.
  The model will then provide its answer via text, which will then be voiced by an API. This will ensure that the user can receive the answer in their preferred format, whether they prefer to hear the answer or read it. The dual output format will also ensure that the bot's answer can be easily shared or recorded, making it more accessible for others to use.
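
Note on the Evaluation Metric paragraph in the README: it names a confusion matrix and the BLEU brevity penalty without showing how either is computed. The sketch below is not part of this commit and is only a rough illustration. The function names `confusion_matrix` and `brevity_penalty` are hypothetical helpers written for this note, and the brevity penalty follows the standard BLEU definition (1 when the candidate is longer than the reference, otherwise exp(1 - r/c)).

```python
import math
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def brevity_penalty(candidate_len, reference_len):
    """Standard BLEU brevity penalty for a candidate of length c and reference of length r."""
    if candidate_len <= 0:
        # An empty candidate gets the maximum penalty.
        return 0.0
    if candidate_len > reference_len:
        # Candidates longer than the reference are not penalized.
        return 1.0
    return math.exp(1.0 - reference_len / candidate_len)


if __name__ == "__main__":
    # Toy binary labels, purely illustrative.
    print(confusion_matrix([1, 0, 1, 1], [1, 0, 0, 1], labels=[0, 1]))
    # A 5-token answer scored against a 10-token reference.
    print(brevity_penalty(5, 10))
```

The brevity penalty only matters if the model generates free-form answers. For a classifier that picks among fixed labels, the confusion matrix alone describes the errors.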