Alpaca233 committed on
Commit
52d0cfd
·
1 Parent(s): 73950be

Upload 18 files

README.md ADDED
@@ -0,0 +1,106 @@
+ ## CHATGPT-PAPER-READER📝
+ This repository provides a simple interface that uses the gpt-3.5-turbo model to read academic papers in PDF format locally. You can use it to summarize papers, create presentation slides, or simply fulfill tasks assigned by your supervisor.
+
+ ## How Does This Work
+ Using ChatGPT to read a complete academic paper runs into two issues:
+
+ - The gpt-3.5-turbo model has a context window of 4096 tokens, so it cannot process an entire paper in a single request.
+ - The model easily loses track of earlier context when dealing with long texts.
+
+ This repository attempts to work around these problems when using the OpenAI interface in the following ways:
+
+ - It splits a PDF paper into multiple parts and generates a summary of each part. While reading each part, it refers to the context of the previous parts, within the token limit.
+ - It combines the summaries of all parts to generate a summary of the entire paper. This partially alleviates the forgetting problem when reading with ChatGPT.
+ - Before reading the paper, you can put the questions you are interested in into the prompt. This helps ChatGPT focus on the relevant information while reading and summarizing, which improves the results.
+
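The three steps above can be sketched without any API calls. This is a minimal illustration under stated assumptions, not the repository's implementation: `split_text`, `read_in_parts`, and the `summarize_part` callback are hypothetical names, and the stand-in summarizer simply truncates text where the real code would call the chat model.

```python
from typing import Callable, List


def split_text(text: str, part_len: int = 3000) -> List[str]:
    """Cut text into consecutive parts of at most part_len characters."""
    return [text[i:i + part_len] for i in range(0, len(text), part_len)]


def read_in_parts(text: str,
                  summarize_part: Callable[[str, List[str]], str],
                  part_len: int = 3000,
                  keep_previous: int = 2) -> List[str]:
    """Summarize each part, passing along only the most recent summaries."""
    summaries: List[str] = []
    for part in split_text(text, part_len):
        # Only the last few summaries are offered as context for this part.
        context = summaries[-keep_previous:] if keep_previous > 0 else []
        summaries.append(summarize_part(part, context))
    return summaries


def fake_summarizer(part: str, context: List[str]) -> str:
    # Stand-in for a chat-model call (assumption): keep the first 20 characters.
    return part[:20]


paper_text = "A" * 3000 + "B" * 3000 + "C" * 500
part_summaries = read_in_parts(paper_text, fake_summarizer)
# The whole-paper summary would be requested from the joined per-part summaries.
combined_input = "\n".join(part_summaries)
print(len(part_summaries))
```

Keeping the per-part summaries in a list is what later allows both the whole-paper summary and follow-up questions to work from a much shorter text than the paper itself.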
+ By default, the initial prompt asks ChatGPT to focus on these points:
+ - Who are the authors?
+ - What is the process of the proposed method?
+ - What is the performance of the proposed method? Please note down its performance metrics.
+ - What are the baseline models and their performances? Please note down these baseline methods.
+ - What dataset did this paper use?
+
+ These questions are designed for research articles in the field of computer science.
+ After the paper has been read, you can ask questions through the `question()` interface, which answers based on the summaries of each part.
+
+ ## Example: Read AlexNet Paper
+
+ ### Summarize AlexNet
+ ```python
+ from gpt_reader.pdf_reader import PaperReader, BASE_POINTS
+
+ print('Key points to focus on while reading: {}'.format(BASE_POINTS))
+
+ api_key = 'Your key'
+ session = PaperReader(api_key, points_to_focus=BASE_POINTS) # You can set your own key points
+ summary = session.read_pdf_and_summarize('./alexnet.pdf')
+
+ print(summary)
+ ```
+
+ ```
+ # console print
+ reading pdf finished
+ page: 0, part: 0
+ page: 0, part: 1
+ page: 1, part: 0
+ page: 1, part: 1
+ page: 2, part: 0
+ page: 2, part: 1
+ page: 3, part: 0
+ page: 3, part: 1
+ page: 4, part: 0
+ page: 4, part: 1
+ page: 5, part: 0
+ page: 5, part: 1
+ page: 6, part: 0
+ page: 6, part: 1
+ page: 7, part: 0
+ page: 7, part: 1
+ page: 8, part: 0
+ page: 8, part: 1
+ ```
+
+ > reply: The paper presents a deep convolutional neural network architecture that was trained to classify 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes. The network achieved top-1 and top-5 error rates of 37.5% and 17.0%, respectively, which is considerably better than the previous state-of-the-art. The authors used non-saturating neurons and a very efficient GPU implementation of the convolution operation to make training faster. They also employed a recently-developed regularization method called "dropout" to reduce overfitting in the fully-connected layers. The paper discusses the importance of large datasets for object recognition and the need for models with a large learning capacity. The authors entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry. The paper also discusses the advantages of using convolutional neural networks (CNNs) for image recognition, including their ability to make strong and mostly correct assumptions about the nature of images, and the challenges of applying CNNs to high-resolution images. The authors describe the architecture of their CNN, which contains eight layers with weights, and the techniques they used to combat overfitting, including data augmentation and dropout. The paper provides qualitative evaluations of their CNN architecture and suggests future work, including using very large and deep convolutional nets on video sequences. The paper also lists references cited in the paper, including papers on image recognition challenges, object recognition, handwritten digit recognition, and convolutional networks and their applications in vision.
+
+
+ ### Ask some questions
+
+ ```python
+ session.question('Authors of this paper?')
+ ```
+
+ > reply: The authors of this paper are Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton from the University of Toronto.
+
+
+ ```python
+ session.question('Describe the proposed method in detail.')
+ ```
+
+ > reply: The proposed method is based on a large, deep convolutional neural network that is trained with stochastic gradient descent using techniques like weight decay, momentum, and dropout for regularization. The network consists of eight layers (five convolutional and three fully-connected) with the output layer consisting of a 1000-way softmax to produce a distribution over 1000 class labels. The authors used a ReLU activation function and local response normalization to improve training speed, and overlapping pooling to improve generalization. They also reduced overfitting by augmenting the training data with techniques like image translations, horizontal reflections, and altering the intensities of the RGB channels in training images using PCA. The proposed method was trained on subsets of the ImageNet dataset used in the ILSVRC-2010 and ILSVRC-2012 competitions, which contain roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images. By using a large, deep convolutional neural network with various optimization strategies, the proposed method achieved top-5 error rates of 16.4% and 15.3% on the ILSVRC-2012 validation and test sets, respectively, outperforming the previous state-of-the-art results.
+ >
+
+ ```python
+ session.question('What dataset does this paper use for its experiments?')
+ ```
+
+ > reply: The authors used subsets of the ImageNet dataset used in the ILSVRC-2010 and ILSVRC-2012 competitions for their experiments. These subsets contain approximately 1.2 million training images, 50,000 validation images, and 150,000 testing images.
+
+ ```python
+ session.question('What metrics does this paper use to evaluate the performance?')
+ ```
+
+ > reply: The paper evaluated the performance of their proposed method using Top-1 and Top-5 error rates on the validation and test sets of the ILSVRC-2010 and ILSVRC-2012 image recognition challenges. The Top-1 error rate measures the percentage of images for which the predicted class label is not the correct label, while the Top-5 error rate measures the percentage of images for which the correct label is not among the five most probable labels predicted by the model.
+
+ ```python
+ session.question('Summarize this paper and your answers above, then translate the result into Chinese')
+ ```
+ > reply: 这篇论文的作者是来自Toronto大学的Alex Krizhevsky、Ilya Sutskever和Geoffrey E. Hinton。该论文提出了一种基于卷积神经网络的深度学习方法,该方法在 ImageNet LSVRC-2010 和 ImageNet LSVRC-2012 图像识别挑战中达到了最佳结果。\n\n该方法使用了一系列技术来优化性能,包括使用ReLU激活函数和局部响应归一化等方法提高训练速度,使用重叠池化等方法提高泛化能力。为了减少过拟合,该方法还使用了数据增强技术,如图像平移、水平反射和使用PCA修改训练图像的RGB通道强度。作者在多个GPU上使用随机梯度下降进行训练,通过调整学习速率和使用权重衰减、动量和dropout等技术来优化性能。\n\n用于实验的数据集是 ImageNet LSVRC-2010 和 ImageNet LSVRC-2012 数据集的子集,其中包括大约120万个训练图像、5万个验证图像和15万个测试图像。该方法相比之前的最佳结果,达到了 更好的Top-1错误率和Top-5错误率。作者使用这两个错误率来评估性能,Top-1错误率表示预测的类别不是正确标签的百分率,而Top-5错误率表示真实标签不在模型预测的五个最可能标签中的百分率。
+
+
+ ## TODO
+
+ - This demo still needs work to support longer articles. Articles of more than 10 pages may exceed the token limit during processing.
+ - You may also exceed the token limit when asking questions.
+ - More prompt tuning is needed to make the output stable.
+ - Improve summary accuracy.
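One possible direction for the first two TODO items is to check a rough size budget before each request and drop the oldest follow-up turns once it is exceeded. The sketch below is hypothetical and not part of this commit: `estimate_tokens` and `trim_turns` are invented names, and the ratio of four characters per token is only a rule of thumb for English text, not a real tokenizer count.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(messages: List[Message], chars_per_token: int = 4) -> int:
    """Very rough token estimate from character counts (an assumption)."""
    total_chars = sum(len(m["content"]) for m in messages)
    return total_chars // chars_per_token + 1


def trim_turns(messages: List[Message], protected: int, budget: int) -> List[Message]:
    """Drop the oldest messages after the first `protected` ones until the
    estimate fits the budget. The protected prefix is never removed."""
    head, tail = messages[:protected], messages[protected:]
    while tail and estimate_tokens(head + tail) > budget:
        tail = tail[1:]
    return head + tail


history = [
    {"role": "system", "content": "You read summaries of a paper."},
    {"role": "user", "content": "summary of part 0 " * 10},
    {"role": "user", "content": "old question " * 50},
    {"role": "assistant", "content": "old answer " * 50},
    {"role": "user", "content": "new question"},
]
trimmed = trim_turns(history, protected=2, budget=100)
print([m["content"][:12] for m in trimmed])
```

Here the system prompt and the part summary are protected, so only earlier question and answer turns are discarded. If the protected prefix alone is over budget, the function returns it unchanged, and the request could still fail.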
app.py ADDED
@@ -0,0 +1,51 @@
+ import gradio as gr
+
+ from gpt_reader.pdf_reader import PaperReader
+ from gpt_reader.prompt import BASE_POINTS
+
+
+ class GUI:
+     def __init__(self):
+         self.api_key = ""
+         self.session = None  # set once a PDF has been analysed
+
+     def analyse(self, api_key, pdf_file):
+         self.session = PaperReader(api_key, points_to_focus=BASE_POINTS)  # fresh session per PDF
+         return self.session.read_pdf_and_summarize(pdf_file)
+
+     def ask_question(self, question):
+         if self.session is None:
+             return "Please upload a PDF file first!"
+         return self.session.question(question)
+
+
+ with gr.Blocks() as demo:
+     gr.Markdown(
+         """
+         # CHATGPT-PAPER-READER
+         """)
+
+     with gr.Tab("Upload PDF File"):
+         pdf_input = gr.File(label="PDF File")
+         api_input = gr.Textbox(label="OpenAI API Key")
+         result = gr.Textbox(label="PDF Summary")
+         upload_button = gr.Button("Start Analyse")
+     with gr.Tab("Ask question about your PDF"):
+         question_input = gr.Textbox(label="Your Question", placeholder="Authors of this paper?")
+         answer = gr.Textbox(label="Answer")
+         ask_button = gr.Button("Ask")
+     with gr.Accordion("About this project"):
+         gr.Markdown(
+             """## CHATGPT-PAPER-READER📝
+             This repository provides a simple interface that utilizes the gpt-3.5-turbo
+             model to read academic papers in PDF format locally. You can use it to help you summarize papers,
+             create presentation slides, or simply fulfill tasks assigned by your supervisor.\n
+             [Github](https://github.com/talkingwallace/ChatGPT-Paper-Reader)""")
+
+     app = GUI()
+     upload_button.click(fn=app.analyse, inputs=[api_input, pdf_input], outputs=result)
+     ask_button.click(app.ask_question, inputs=question_input, outputs=answer)
+
+ if __name__ == "__main__":
+     demo.title = "CHATGPT-PAPER-READER"
+     demo.launch(server_port=2333)  # add "share=True" to share the CHATGPT-PAPER-READER app on the Internet.
gpt_reader/__init__.py ADDED
File without changes
gpt_reader/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (148 Bytes)
gpt_reader/__pycache__/__init__.cpython-39.pyc ADDED
Binary file (148 Bytes)
gpt_reader/__pycache__/model_interface.cpython-38.pyc ADDED
Binary file (1.36 kB)
gpt_reader/__pycache__/model_interface.cpython-39.pyc ADDED
Binary file (1.36 kB)
gpt_reader/__pycache__/paper.cpython-38.pyc ADDED
Binary file (961 Bytes)
gpt_reader/__pycache__/paper.cpython-39.pyc ADDED
Binary file (961 Bytes)
gpt_reader/__pycache__/pdf_reader.cpython-38.pyc ADDED
Binary file (3.37 kB)
gpt_reader/__pycache__/pdf_reader.cpython-39.pyc ADDED
Binary file (3.37 kB)
gpt_reader/__pycache__/prompt.cpython-38.pyc ADDED
Binary file (1.28 kB)
gpt_reader/__pycache__/prompt.cpython-39.pyc ADDED
Binary file (1.28 kB)
 
gpt_reader/model_interface.py ADDED
@@ -0,0 +1,32 @@
+ from typing import List
+ import openai
+
+
+ class ModelInterface(object):
+
+     def __init__(self) -> None:
+         pass
+
+     def send_msg(self, *args):
+         pass
+
+
+ class OpenAIModel(ModelInterface):
+
+     def __init__(self, api_key, model='gpt-3.5-turbo', temperature=0.2) -> None:
+         openai.api_key = api_key
+         self.model = model
+         self.temperature = temperature
+
+     def send_msg(self, msg: List[dict], return_raw_text=True):
+
+         response = openai.ChatCompletion.create(
+             model=self.model,
+             messages=msg,
+             temperature=self.temperature
+         )
+
+         if return_raw_text:
+             return response["choices"][0]["message"]["content"]  # text of the first choice
+         else:
+             return response
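Since callers only depend on `send_msg`, a stand-in model can exercise the rest of the pipeline without network access or an API key. The following is a hypothetical sketch, not part of this commit: `EchoModel` is an invented class, and `ModelInterface` is re-declared in minimal form only so the block runs on its own.

```python
from typing import Dict, List


class ModelInterface:
    """Minimal copy of the base class so this sketch is self-contained."""

    def send_msg(self, *args):
        raise NotImplementedError


class EchoModel(ModelInterface):
    """Hypothetical offline stand-in: reports what it was sent."""

    def __init__(self) -> None:
        self.calls: List[List[Dict[str, str]]] = []

    def send_msg(self, msg: List[Dict[str, str]], return_raw_text: bool = True):
        self.calls.append(list(msg))  # record a copy of the conversation
        reply = "received {} messages, last role: {}".format(len(msg), msg[-1]["role"])
        if return_raw_text:
            return reply
        # Mimic the nested shape that OpenAIModel.send_msg reads the reply from.
        return {"choices": [{"message": {"role": "assistant", "content": reply}}]}


model = EchoModel()
conversation = [
    {"role": "system", "content": "You are a researcher helper bot."},
    {"role": "user", "content": "now I send you page 0, part 0: ..."},
]
print(model.send_msg(conversation))
```

Each message is a dict with a `role` and a `content` key, which is the shape `PaperReader` builds before calling `send_msg`.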
gpt_reader/paper.py ADDED
@@ -0,0 +1,20 @@
+ from PyPDF2 import PdfReader
+
+ class Paper(object):
+
+     def __init__(self, pdf_obj: PdfReader) -> None:
+         self._pdf_obj = pdf_obj
+         self._paper_meta = self._pdf_obj.metadata
+
+     def iter_pages(self, iter_text_len: int = 3000):
+         page_idx = 0
+         for page in self._pdf_obj.pages:
+             txt = page.extract_text()
+             for i in range((len(txt) // iter_text_len) + 1):  # number of slices for this page
+                 yield page_idx, i, txt[i * iter_text_len:(i + 1) * iter_text_len]
+             page_idx += 1
+
+
+ if __name__ == '__main__':
+     reader = PdfReader('../alexnet.pdf')
+     paper = Paper(reader)
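The slicing rule in `iter_pages` can be tried on plain strings, which also explains the `page: N, part: M` lines in the README console output. This sketch uses a hypothetical helper, `iter_parts`, that applies the same arithmetic to a list of page texts instead of a `PdfReader`.

```python
from typing import Iterator, List, Tuple


def iter_parts(pages: List[str], iter_text_len: int = 3000) -> Iterator[Tuple[int, int, str]]:
    """Same slicing rule as Paper.iter_pages, applied to plain strings."""
    for page_idx, txt in enumerate(pages):
        for i in range((len(txt) // iter_text_len) + 1):
            yield page_idx, i, txt[i * iter_text_len:(i + 1) * iter_text_len]


# Three made-up pages: 4500 characters, exactly 6000 characters, and empty.
pages = ["x" * 4500, "y" * 6000, ""]
for page_idx, part_idx, text in iter_parts(pages):
    print(page_idx, part_idx, len(text))
```

A page whose length is an exact multiple of `iter_text_len`, or an empty page, yields one trailing part with no text. With this rule such a part would still be sent to the model as a separate request.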
gpt_reader/pdf_reader.py ADDED
@@ -0,0 +1,121 @@
+ from PyPDF2 import PdfReader
+ import openai
+ from .prompt import BASE_POINTS, READING_PROMT_V2
+ from .paper import Paper
+ from .model_interface import OpenAIModel
+
+
+ # Reads a paper part by part and keeps the per-part summaries for later questions
+ class PaperReader:
+
+     """
+     A class for summarizing research papers using the OpenAI API.
+
+     Attributes:
+         openai_key (str): The API key to use the OpenAI API.
+         token_length (int): The length of text to send to the API at a time.
+         model (str): The GPT model to use for summarization.
+         points_to_focus (str): The key points to focus on while summarizing.
+         verbose (bool): A flag to enable/disable verbose logging.
+
+     """
+
+     def __init__(self, openai_key, token_length=4000, model="gpt-3.5-turbo",
+                  points_to_focus=BASE_POINTS, verbose=False):
+
+         # Setting the API key to use the OpenAI API
+         openai.api_key = openai_key
+
+         # Initializing prompts for the conversation
+         self.init_prompt = READING_PROMT_V2.format(points_to_focus)
+
+         self.summary_prompt = 'You are a researcher helper bot. Now you need to read the summaries of a research paper.'
+         self.messages = []  # Initializing the conversation messages
+         self.summary_msg = []  # Initializing the summary messages
+         self.token_len = token_length  # Setting the token length to use
+         self.keep_round = 2  # Rounds of previous dialogues to keep in conversation
+         # The model name is passed to the OpenAIModel wrapper created below
+         self.verbose = verbose  # Flag to enable/disable verbose logging
+         self.model = OpenAIModel(api_key=openai_key, model=model)
+
+     def drop_conversation(self, msg):
+         # This method is used to drop previous messages from the conversation and keep only recent ones
+         if len(msg) >= (self.keep_round + 1) * 2 + 1:
+             new_msg = [msg[0]]
+             for i in range(3, len(msg)):
+                 new_msg.append(msg[i])
+             return new_msg
+         else:
+             return msg
+
+     def send_msg(self, msg):
+         return self.model.send_msg(msg)
+
+     def _chat(self, message):
+         # This method is used to send a message and get a response from the OpenAI API
+
+         # Adding the user message to the conversation messages
+         self.messages.append({"role": "user", "content": message})
+         # Sending the messages to the API and getting the response
+         response = self.send_msg(self.messages)
+         # Adding the model's response to the conversation messages
+         self.messages.append({"role": "assistant", "content": response})
+         # Dropping previous conversation messages to keep the conversation history short
+         self.messages = self.drop_conversation(self.messages)
+         # Returning the model's response
+         return response
+
+     def summarize(self, paper: Paper):
+         # This method is used to summarize a given research paper
+
+         # Adding the initial prompt to the conversation messages
+         self.messages = [
+             {"role": "system", "content": self.init_prompt},
+         ]
+         # Adding the summary prompt to the summary messages
+         self.summary_msg = [{"role": "system", "content": self.summary_prompt}]
+
+         # Reading and summarizing each part of the research paper
+         for (page_idx, part_idx, text) in paper.iter_pages():
+             print('page: {}, part: {}'.format(page_idx, part_idx))
+             # Sending the text to the API and getting the response
+             summary = self._chat('now I send you page {}, part {}:{}'.format(page_idx, part_idx, text))
+             # Logging the summary if verbose logging is enabled
+             if self.verbose:
+                 print(summary)
+             # Adding the summary of the part to the summary messages
+             self.summary_msg.append({"role": "user", "content": '{}'.format(summary)})
+
+         # Adding a prompt for the user to summarize the whole paper to the summary messages
+         self.summary_msg.append({"role": "user", "content": 'Now please make a summary of the whole paper'})
+         # Sending the summary messages to the API and getting the response
+         result = self.send_msg(self.summary_msg)
+         # Returning the summary of the whole paper
+         return result
+
+     def read_pdf_and_summarize(self, pdf_path):
+         # This method is used to read a research paper from a PDF file and summarize it
+
+         # Creating a PdfReader object to read the PDF file
+         pdf_reader = PdfReader(pdf_path)
+         paper = Paper(pdf_reader)
+         # Summarizing the full text of the research paper and returning the summary
+         print('reading pdf finished')
+         summary = self.summarize(paper)
+         return summary
+
+     def get_summary_of_each_part(self):
+         # This method is used to get the summary of each part of the research paper
+         return self.summary_msg
+
+     def question(self, question):
+         # This method is used to ask a question after summarizing a paper
+
+         # Adding the question to the summary messages
+         self.summary_msg.append({"role": "user", "content": question})
+         # Sending the summary messages to the API and getting the response
+         response = self.send_msg(self.summary_msg)
+         # Adding the model's response to the summary messages
+         self.summary_msg.append({"role": "assistant", "content": response})
+         # Returning the model's response
+         return response
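The effect of `drop_conversation` is easiest to see by replaying a few rounds. The sketch below re-implements the same rule as a standalone function (same threshold, same slice) and feeds it placeholder messages. No API call is made, and the `"part N"` and `"summary N"` contents are made up for illustration.

```python
from typing import Dict, List

Message = Dict[str, str]


def drop_conversation(msg: List[Message], keep_round: int = 2) -> List[Message]:
    """Keep the system prompt and drop the oldest user/reply pair when full."""
    if len(msg) >= (keep_round + 1) * 2 + 1:
        return [msg[0]] + msg[3:]
    return msg


messages: List[Message] = [{"role": "system", "content": "prompt"}]
sizes = []
for part in range(5):
    messages.append({"role": "user", "content": "part {}".format(part)})
    messages.append({"role": "assistant", "content": "summary {}".format(part)})
    messages = drop_conversation(messages)
    sizes.append(len(messages))

print(sizes)
print([m["content"] for m in messages])
```

With `keep_round = 2` the history never holds more than the system prompt plus the two most recent rounds after trimming, so each request carries at most two earlier parts and their summaries as context.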
gpt_reader/prompt.py ADDED
@@ -0,0 +1,26 @@
+ BASE_POINTS = """
+ 1. Who are the authors?
+ 2. What is the process of the proposed method?
+ 3. What is the performance of the proposed method? Please note down its performance metrics.
+ 4. What are the baseline models and their performances? Please note down these baseline methods.
+ 5. What dataset did this paper use?
+ """
+
+ READING_PROMPT = """
+ You are a researcher helper bot. You can help the user with research paper reading and summarizing. \n
+ Now I am going to send you a paper. You need to read it and summarize it for me part by part. \n
+ When you are reading, you need to focus on these key points:{}
+ """
+
+ READING_PROMT_V2 = """
+ You are a researcher helper bot. You can help the user with research paper reading and summarizing. \n
+ Now I am going to send you a paper. You need to read it and summarize it for me part by part. \n
+ When you are reading, you need to focus on these key points:{},
+
+ You also need to generate a brief but informative title for this part.
+ Your return format:
+ - title: '...'
+ - summary: '...'
+ """
+
+ SUMMARY_PROMPT = "You are a researcher helper bot. Now you need to read the summaries of a research paper."
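The reading prompts are plain `str.format` templates with a single `{}` placeholder, so a custom list of questions can be substituted for `BASE_POINTS`. The sketch below assumes a shortened stand-in template (`READING_TEMPLATE`) and two invented questions. It only shows how the placeholder is filled.

```python
# Shortened stand-in for the reading prompt (assumption, not the real text).
READING_TEMPLATE = (
    "You are a researcher helper bot. "
    "When you are reading, you need to focus on these key points:{}"
)

# Hypothetical questions for a paper outside the default computer science focus.
custom_points = [
    "What problem does the paper address?",
    "What are the main limitations?",
]

# Number the questions the same way BASE_POINTS does.
points_text = "\n" + "\n".join(
    "{}. {}".format(i, point) for i, point in enumerate(custom_points, start=1)
) + "\n"

prompt = READING_TEMPLATE.format(points_text)
print(prompt)
```

A string built like `points_text` can be passed as the `points_to_focus` argument of `PaperReader` in place of `BASE_POINTS`.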
requirements.txt ADDED
@@ -0,0 +1,65 @@
+ aiofiles==23.1.0
+ aiohttp==3.8.4
+ aiosignal==1.3.1
+ altair==4.2.2
+ anyio==3.6.2
+ async-timeout==4.0.2
+ attrs==22.2.0
+ certifi==2022.12.7
+ charset-normalizer==3.1.0
+ click==8.1.3
+ contourpy==1.0.7
+ cycler==0.11.0
+ entrypoints==0.4
+ fastapi==0.94.0
+ ffmpy==0.3.0
+ fonttools==4.39.0
+ frozenlist==1.3.3
+ fsspec==2023.3.0
+ gradio==3.20.1
+ h11==0.14.0
+ httpcore==0.16.3
+ httpx==0.23.3
+ idna==3.4
+ importlib-resources==5.12.0
+ Jinja2==3.1.2
+ jsonschema==4.17.3
+ kiwisolver==1.4.4
+ linkify-it-py==2.0.0
+ markdown-it-py==2.2.0
+ MarkupSafe==2.1.2
+ matplotlib==3.7.1
+ mdit-py-plugins==0.3.3
+ mdurl==0.1.2
+ multidict==6.0.4
+ numpy==1.24.2
+ openai==0.27.1
+ orjson==3.8.7
+ packaging==23.0
+ pandas==1.5.3
+ Pillow==9.4.0
+ pkgutil_resolve_name==1.3.10
+ pycryptodome==3.17
+ pydantic==1.10.6
+ pydub==0.25.1
+ pyparsing==3.0.9
+ PyPDF2==3.0.1
+ pyrsistent==0.19.3
+ python-dateutil==2.8.2
+ python-multipart==0.0.6
+ pytz==2022.7.1
+ PyYAML==6.0
+ requests==2.28.2
+ rfc3986==1.5.0
+ six==1.16.0
+ sniffio==1.3.0
+ starlette==0.26.1
+ toolz==0.12.0
+ tqdm==4.65.0
+ typing_extensions==4.5.0
+ uc-micro-py==1.0.1
+ urllib3==1.26.15
+ uvicorn==0.21.0
+ websockets==10.4
+ yarl==1.8.2
+ zipp==3.15.0