lihuigu committed on
Commit
88253fe
·
1 Parent(s): e7f10cc
README.md CHANGED
@@ -10,5 +10,176 @@ pinned: false
10
  license: mit
11
  short_description: Quickly generating novel research ideas.
12
  ---
 
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ <center><h1> 💡SciPIP: An LLM-based Scientific Paper Idea Proposer </h1></center>
14
 
15
+ <div align="center">
16
+ <p>
17
+ <a href="https://github.com/cheerss/SciPIP/issues">
18
+ <img src="https://img.shields.io/github/issues/cheerss/SciPIP" alt="GitHub issues">
19
+ </a>
20
+ <a href="LICENSE">
21
+ <img src="https://img.shields.io/github/license/cheerss/SciPIP" alt="License">
22
+ </a>
23
+ <a href="https://arxiv.org/abs/2410.23166">
24
+ <img src="https://img.shields.io/badge/arXiv-2410.23166-b31b1b" alt="arXiv">
25
+ </a>
26
+ <img src="https://img.shields.io/github/stars/cheerss/SciPIP?color=green&style=social" alt="GitHub stars">
27
+ <img src="https://img.shields.io/badge/python->=3.10.3-blue" alt="Python version">
28
+ </p>
29
+ </div>
30
+
31
+ ![SciPIP](./assets/pic/logo.jpg)
32
+
33
+ ## Introduction
34
+
35
+ SciPIP is a scientific paper idea generation tool powered by a large language model (LLM) designed to **assist researchers in quickly generating novel research ideas**. Based on the background information provided by the user, SciPIP first conducts a literature review to identify relevant research, then generates fresh ideas for potential studies.
36
+ ![SciPIP](./assets/pic/demo.png)
37
+
38
+
39
+ 🤗 Try it on Hugging Face (coming soon; for now, you can deploy it on your own computer).
40
+
41
+ ## Updates
42
+
43
+ - [x] Idea generation in a GUI environment (web app).
44
+ - [x] Idea generation for the NLP and multimodal (partial) field.
45
+ - [ ] Idea generation for the CV field.
46
+ - [ ] Idea generation for other fields.
47
+ - [ ] Release the Huggingface demo.
48
+
49
+ ## Prerequisites
50
+
51
+ The following environment setup was tested on Ubuntu 22.04 with Python>=3.10.3.
52
+
53
+ 1. **Install essential packages.** Feel free to copy and paste the following commands into your terminal. Afterwards, you can visit your Neo4j database in a browser.
54
+
55
+ ```bash
56
+ ## Install git-lfs
57
+ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
58
+ sudo apt install git-lfs
59
+
60
+ ## Create new conda environment scipip
61
+ conda env create -f environment.yml
62
+ conda activate scipip
63
+
64
+ ## Install Neo4j database
65
+ sudo apt install -y openjdk-17-jre # Install Neo4j required JDK
66
+ # cd ~/Downloads # or /your/path/to/download/Neo4j
67
+ wget http://dist.neo4j.org/neo4j-community-5.20.0-unix.tar.gz
68
+ tar -xvf neo4j-community-5.20.0-unix.tar.gz
69
+
70
+ ## Start Neo4j
71
+ cd ./neo4j-community-5.20.0
72
+ # Uncomment server.default_listen_address=0.0.0.0 in conf/neo4j.conf to visit Neo4j through a browser
73
+ sed -i 's/# server.default_listen_address=0.0.0.0/server.default_listen_address=0.0.0.0/g' ./conf/neo4j.conf
74
+ ./bin/neo4j start
75
+
76
+ # Default URL for neo4j is "http://127.0.0.1:7474"
77
+ # Default URI for neo4j is "bolt://127.0.0.1:7687"
78
+ # Default username and password for neo4j database are both "neo4j"
79
+ # !![IMPORTANT] You must visit "http://127.0.0.1:7474" and change the default password before the next step, because Neo4j does not permit running with the default password.
80
+ ```
81
+ 2. **Clone this repository (SciPIP) and edit the configuration files.** Specifically, the LLM API token and the Neo4j username/password need to be configured; a template is provided.
82
+
83
+ ```bash
84
+ ## Clone our repository
85
+ git clone git@github.com:cheerss/SciPIP.git && cd SciPIP
86
+
87
+ ## Edit scripts/env.sh
88
+ # Must be corrected: NEO4J_USERNAME / NEO4J_PASSWD / MODEL_API_KEY / MODEL_URL
89
+ # Others are optional
90
+
91
+ ## source env
92
+ source scripts/env.sh
93
+ ```
94
+ 3. **Prepare the literature database**
95
+
96
+ 1. Download the literature data from [this link](https://drive.google.com/file/d/1NZTDpxKo7bmxwXPI03dgikEemKGLkwne/view?usp=sharing) and save it to `assets/data/scipip_neo4j_clean_backup.json`.
97
+ 2. Then, run the following command to load the literature into the Neo4j database (it may take 40-60 minutes):
98
+ ```
99
+ python src/utils/paper_client.py
100
+ ```
101
+
102
+ 4. **[Optional] Prepare the embedding model**. Our algorithm uses SentenceBERT and **will automatically download** it from Huggingface the first time the program is run. However, if you're concerned about potential download failures due to network issues, you can download it in advance and place it in the specified directory.
103
+ ```bash
104
+ cd /root/path/of/SciPIP && mkdir -p assets/model/sentence-transformers
105
+ git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 assets/model/sentence-transformers/all-MiniLM-L6-v2
106
+ ```
107
+
108
+ ## Run In a Browser (Recommended)
109
+
110
+ ```bash
111
+ streamlit run app.py
112
+ # OR
113
+ python -m streamlit run app.py
114
+ ```
115
+ Then, visit `http://localhost:8501` in your browser to use the interactive environment.
116
+
117
+ ## Run In a Terminal
118
+
119
+ **1. BackTracking of ACL 2024**
120
+
121
+ ```
122
+ python src/generator.py backtracking --brainstorm-mode mode_c --use-cue-words True --use-inspiration True --num 1
123
+ ```
124
+
125
+ Results are written to `assets/output_idea/output_backtracking_mode_c_cue_True_ins_True.json`.
126
+
127
+ **2. Generate new idea**
128
+
129
+ Enter your background and cue words in `assets/data/test_background.json`.
130
+
131
+ ```
132
+ python src/generator.py new-idea --brainstorm-mode mode_c --use-inspiration True --num 2
133
+ ```
134
+
135
+ Results are written to `assets/output_idea/output_new_idea_mode_c_ins_True.json`.
136
+
137
+ ## Others
138
+
139
+ ### Retrieve Eval
140
+
141
+ Generate the retrieval evaluation logs in `./log`.
142
+
143
+ ```
144
+ bash scripts/retriever_eval.sh
145
+ ```
146
+
147
+ ### Database Construction
148
+ SciPIP uses Neo4j as its database. You can directly import the provided data or add your own research papers.
149
+ ```
150
+ wget https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl
151
+ pip install en_core_web_sm-3.7.1-py3-none-any.whl
152
+ ```
153
+ The directory for storing papers can be modified in the `pdf_cached` field of `configs/datasets.yaml`.
154
+
155
+ **1. Generate json list**
156
+
157
+ ```
158
+ python src/paper_manager.py crawling --year all --venue-name nips
159
+ ```
160
+
161
+ JSON files are saved to `./assets/paper/<$venue-name>/<$year>`.
162
+
163
+ **2. Fetch Papers**
164
+
165
+ ```
166
+ python src/paper_manager.py update --year all --venue-name nips
167
+ ```
168
+
169
+ ## Cite Us
170
+
171
+ ```
172
+ @article{wang2024scipip,
173
+ title={SciPIP: An LLM-based Scientific Paper Idea Proposer},
174
+ author={Wenxiao Wang and Lihui Gu and Liye Zhang and Yunxiang Luo and Yi Dai and Chen Shen and Liang Xie and Binbin Lin and Xiaofei He and Jieping Ye},
175
+ journal={arXiv preprint arXiv:2410.23166},
176
+ url={https://arxiv.org/abs/2410.23166},
177
+ year={2024}
178
+ }
179
+ ```
180
+
181
+ ## Help Us To Improve
182
+
183
+ https://forms.gle/YpLUrhqs1ahyCAe99
184
+
185
+ Thank you for using SciPIP! We hope it helps you generate research ideas! 🎉
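Step 2 of the README's prerequisites lists four settings that must be filled in before anything runs: `NEO4J_USERNAME`, `NEO4J_PASSWD`, `MODEL_API_KEY` and `MODEL_URL`. A minimal pre-flight check could look like the sketch below. It assumes `scripts/env.sh` exports these names as environment variables, which the diff does not show, and `missing_settings` is a hypothetical helper that is not part of the repository.

```python
import os

# Names copied from the README comment about scripts/env.sh.
# Assumption: env.sh exports them as ordinary environment variables.
REQUIRED_SETTINGS = ("NEO4J_USERNAME", "NEO4J_PASSWD", "MODEL_API_KEY", "MODEL_URL")


def missing_settings(env=None):
    """Return the required names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Set these in scripts/env.sh, then source it again:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Running this after `source scripts/env.sh` would report a forgotten value before the Neo4j import or the Streamlit app is started.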
assets/pic/demo.png ADDED
assets/pic/figure_idea_proposal.svg ADDED
assets/pic/logo.jpg ADDED
assets/pic/logo.svg ADDED
assets/pic/sys.png ADDED
src/ai_scientist_idea.py CHANGED
@@ -89,9 +89,7 @@ def generate(config_path, ids_path, retriever_name, **kwargs):
89
  logger.debug("Original entities from background: {}".format(entities))
90
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
91
  retriever_name,
92
- config,
93
- use_cocite=config.RETRIEVE.use_cocite,
94
- use_cluster_to_filter=config.RETRIEVE.use_cluster_to_filter
95
  )
96
  result = rt.retrieve(bg, entities, need_evaluate=False, target_paper_id_list=[], top_k=5)
97
  related_paper = result["related_paper"]
 
89
  logger.debug("Original entities from background: {}".format(entities))
90
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
91
  retriever_name,
92
+ config
 
 
93
  )
94
  result = rt.retrieve(bg, entities, need_evaluate=False, target_paper_id_list=[], top_k=5)
95
  related_paper = result["related_paper"]
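The hunk above stops passing `use_cocite` and `use_cluster_to_filter` as keyword arguments and hands `create_retriever` only the config object. The diff does not show where the retriever now gets those options. One plausible arrangement, sketched here purely as an assumption, is that each retriever reads them from `config.RETRIEVE` itself. `SketchRetriever` and `SketchRetrieverFactory` are hypothetical stand-ins, not the repository's classes.

```python
from types import SimpleNamespace


class SketchRetriever:
    """Hypothetical retriever that takes its options from the config object."""

    def __init__(self, config):
        # Options formerly passed as keyword arguments are read from config here.
        self.use_cocite = config.RETRIEVE.use_cocite
        self.use_cluster_to_filter = config.RETRIEVE.use_cluster_to_filter


class SketchRetrieverFactory:
    """Hypothetical name-to-class registry, loosely mirroring create_retriever."""

    def __init__(self):
        self._registry = {}

    def register(self, name, cls):
        self._registry[name] = cls

    def create_retriever(self, name, config):
        if name not in self._registry:
            raise ValueError(f"unknown retriever: {name}")
        return self._registry[name](config)


# Stand-in config with the two fields named in the removed lines.
config = SimpleNamespace(
    RETRIEVE=SimpleNamespace(use_cocite=True, use_cluster_to_filter=False)
)
factory = SketchRetrieverFactory()
factory.register("sketch", SketchRetriever)
rt = factory.create_retriever("sketch", config)
```

Keeping the options in one place means call sites such as `generate` no longer need to know which flags a given retriever understands.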
src/app_pages/button_interface.py CHANGED
@@ -66,21 +66,16 @@ class Backend(object):
66
 
67
  def entities2literature_callback(self, background, entities, json_strs=None):
68
  if json_strs is not None:
69
- json_contents = json.loads(json_strs)
70
- res = ""
71
- for i, p in enumerate(json_contents["related_paper"]):
72
- res += "%d. " % (i + 1) + str(p)
73
- if i < len(json_contents["related_paper"]) - 1:
74
- res += "\n"
75
- return res, res
76
  else:
77
  result = self.retriever_factory.retrieve(background, entities, need_evaluate=False, target_paper_id_list=[])
78
- res = ""
79
  for i, p in enumerate(result["related_paper"]):
80
- res += "%d. " % (i + 1) + str(p["title"])
81
- if i < len(result["related_paper"]) - 1:
82
- res += "\n"
83
- return res, result["related_paper"]
84
 
85
  def literature2initial_ideas_callback(self, background, brainstorms, retrieved_literature, json_strs=None):
86
  if json_strs is not None:
 
66
 
67
  def entities2literature_callback(self, background, entities, json_strs=None):
68
  if json_strs is not None:
69
+ result = json.loads(json_strs)
70
+ res = []
71
+ for i, p in enumerate(result["related_paper"]):
72
+ res.append(str(p))
 
 
 
73
  else:
74
  result = self.retriever_factory.retrieve(background, entities, need_evaluate=False, target_paper_id_list=[])
75
+ res = []
76
  for i, p in enumerate(result["related_paper"]):
77
+ res.append(f'{p["title"]}. {p["venue_name"].upper()} {p["year"]}.')
78
+ return res, result["related_paper"]
 
 
79
 
80
  def literature2initial_ideas_callback(self, background, brainstorms, retrieved_literature, json_strs=None):
81
  if json_strs is not None:
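After this change, `entities2literature_callback` builds a list with one display string per paper instead of a single numbered text blob. The formatting rule from the added line can be isolated as below. `format_related_papers` is a hypothetical name and the sample records are made up. Only the keys `title`, `venue_name` and `year` come from the diff.

```python
def format_related_papers(papers):
    """Build one display line per paper: title, upper-cased venue, year."""
    return [f'{p["title"]}. {p["venue_name"].upper()} {p["year"]}.' for p in papers]


# Made-up records that carry only the three keys used in the diff.
sample = [
    {"title": "A Made-up Paper Title", "venue_name": "acl", "year": 2024},
    {"title": "Another Placeholder", "venue_name": "iclr", "year": 2023},
]
lines = format_related_papers(sample)
```

Returning a list rather than a newline-joined string lets the caller decide how to render the entries.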
src/app_pages/homepage.py CHANGED
@@ -4,19 +4,76 @@ from .locale import _
4
  from .sidebar_components import get_sidebar_header, get_sidebar_supported_fields, get_help_us_improve, get_language_select
5
 
6
  def generate_sidebar():
 
7
  get_sidebar_header()
8
- st.sidebar.markdown("Make AI research easy")
9
  get_sidebar_supported_fields()
10
  get_help_us_improve()
11
- get_language_select()
12
 
13
 
14
  def generate_mainpage():
15
- st.title("🏠️ 💡SciPIP: An LLM-based Scientific Paper Idea Proposer")
16
- # st.image("./assets/pic/logo.pdf")
17
- st.header("Introduction")
18
- st.markdown("SciPIP is a scientific paper idea generation tool powered by a large language model (LLM) designed to **assist researchers in quickly generating novel research ideas**. Based on the background information provided by the user, SciPIP first conducts a literature review to identify relevant research, then generates fresh ideas for potential studies.")
19
 
 
 
 
 
 
 
20
 
21
  def home_page():
22
  generate_sidebar()
 
4
  from .sidebar_components import get_sidebar_header, get_sidebar_supported_fields, get_help_us_improve, get_language_select
5
 
6
  def generate_sidebar():
7
+ get_language_select()
8
  get_sidebar_header()
9
+ st.sidebar.markdown(_("Make AI research easy"))
10
  get_sidebar_supported_fields()
11
  get_help_us_improve()
 
12
 
13
 
14
  def generate_mainpage():
15
+ if st.session_state.get("language", "en") == "en":
16
+ st.title("🏠️ 💡SciPIP: An LLM-based Scientific Paper Idea Proposer")
17
+ _, logo_col, _ = st.columns(3)
18
+ logo_col.image("./assets/pic/logo.svg", width=None)
19
+
20
+ st.header("Introduction", divider="blue")
21
+ st.markdown("SciPIP is a scientific paper idea generation tool powered by a large language model (LLM) designed to **assist researchers in quickly generating novel research ideas**. Based on the background information provided by the user, SciPIP first conducts a literature review to identify relevant research, then generates fresh ideas for potential studies.")
22
+
23
+ st.header("Pipeline", divider="blue")
24
+ _, idea_proposal_col, _ = st.columns([1, 5, 1])
25
+ idea_proposal_col.image("./assets/pic/figure_idea_proposal.svg", width=None)
26
+ st.markdown("""This demo uses SciPIP-C, as described in the [paper](https://arxiv.org/abs/2410.23166), as the default idea generation method. The generation process is mainly divided into six steps:
27
+
28
+ 1. **Input Background**: The user inputs the background of the research.
29
+ 2. **Brainstorming**: The large model, without retrieving any literature, generates solutions to the problems in the user-inputted background based solely on its own knowledge.
30
+ 3. **Extracting Entities**: Extract keywords from the user’s input background and the content generated during brainstorming.
31
+ 4. **Retrieving Related Works**: Search for relevant literature in the database based on the extracted keywords and the user’s input background.
32
+ 5. **Generating Initial Ideas**: Draw inspiration from the retrieved literature and, combined with the brainstorming content, propose initial ideas.
33
+ 6. **Generating Final Ideas**: Filter, refine, and process the initial ideas to produce the final ideas.
34
+ """)
35
+
36
+ st.header("One-click Generation vs. Step-by-step Generation", divider="blue")
37
+ # st.markdown("一键生成与逐步生成均使用相同的算法(SciPIP-C),对于一键生成而言,用户无需关心所有的中间输出,可以直接得到最终的Ideas。而逐步生成会按照Pipeline的步骤逐步生成,每步生成结束后,用户都可以修订此步骤生成的内容,从而影响后续生成结果。")
38
+ st.markdown("Both one-click generation and step-by-step generation use the same algorithm (SciPIP-C). For one-click generation, the user does not need to concern themselves with the intermediate outputs and can directly obtain the final ideas. In contrast, step-by-step generation follows the pipeline process, where the content is generated step by step. After each step, the user can revise the content generated in that step, which will influence the results of subsequent steps.")
39
+
40
+ st.header("Resources")
41
+ st.markdown("Our paper: [https://arxiv.org/abs/2410.23166](https://arxiv.org/abs/2410.23166)")
42
+ st.markdown("Our github repository: [https://github.com/cheerss/SciPIP](https://github.com/cheerss/SciPIP)")
43
+ st.markdown("Our Huggingface demo: Coming soon...")
44
+ # st.page_link("https://arxiv.org/abs/2410.23166", label="Our paper: https://arxiv.org/abs/2410.23166", icon=None)
45
+ # st.page_link("https://github.com/cheerss/SciPIP", label="Our github repository: https://github.com/cheerss/SciPIP", icon=None)
46
+
47
+ else:
48
+ st.title("🏠️ 💡SciPIP: 基于大语言模型的科学论文创意生成器")
49
+ _, logo_col, _ = st.columns(3)
50
+ logo_col.image("./assets/pic/logo.svg", width=None)
51
+
52
+ st.header("简介", divider="blue")
53
+ st.markdown("SciPIP 是一个由大语言模型(LLM)驱动的科学论文创意生成工具,旨在**帮助研究人员快速生成新颖的研究思路**。基于用户提供的背景信息,SciPIP首先进行文献回顾以识别相关研究,然后为潜在的研究方向生成新的创意。")
54
+
55
+ st.header("流程", divider="blue")
56
+ _, idea_proposal_col, _ = st.columns([1, 5, 1])
57
+ idea_proposal_col.image("./assets/pic/figure_idea_proposal.svg", width=None)
58
+ st.markdown("""本演示采用论文中所述的SciPIP-C作为默认的创意生成方法,生成流程主要分为六个步骤:
59
+
60
+ 1. **输入背景**:用户输入研究的背景信息。
61
+ 2. **头脑风暴**:大模型在不检索任何文献的情况下,仅凭自身知识为用户输入的背景中的问题生成解决方案。
62
+ 3. **提取实体**:从用户输入的背景和头脑风暴生成的内容中提取关键词。
63
+ 4. **检索相关文献**:根据提取的关键词和用户输入的背景信息,在数据库中检索相关文献。
64
+ 5. **生成初始创意**:从检索到的文献中汲取灵感,并结合头脑风暴的内容提出初步创意。
65
+ 6. **生成最终创意**:对初始创意进行筛选、精炼和加工,最终生成创意。
66
+ """)
67
+
68
+ st.header("一键生成 与 逐步生成", divider="blue")
69
+ st.markdown("一键生成与逐步生成均使用相同的算法(SciPIP-C),对于一键生成而言,用户无需关心所有的中间输出,可以直接得到最终的Ideas。而逐步生成会按照Pipeline的步骤逐步生成,每步生成结束后,用户都可以修订此步骤生成的内容,从而影响后续生成结果。")
70
 
71
+ st.header("相关资源")
72
+ st.markdown("论文: [https://arxiv.org/abs/2410.23166](https://arxiv.org/abs/2410.23166)")
73
+ st.markdown("Github仓库: [https://github.com/cheerss/SciPIP](https://github.com/cheerss/SciPIP)")
74
+ st.markdown("Huggingface演示: 敬请期待...")
75
+ # st.page_link("https://arxiv.org/abs/2410.23166", label="Our paper: https://arxiv.org/abs/2410.23166", icon=None)
76
+ # st.page_link("https://github.com/cheerss/SciPIP", label="Our github repository: https://github.com/cheerss/SciPIP", icon=None)
77
 
78
  def home_page():
79
  generate_sidebar()
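The six-step pipeline described on the new homepage can be read as a simple chain in which each step consumes the outputs of earlier ones. The sketch below shows only that data flow. Every step is a caller-supplied callable, and neither `run_pipeline` nor the toy lambdas correspond to functions in the repository.

```python
def run_pipeline(background, brainstorm, extract_entities, retrieve, propose, refine):
    """Chain the homepage's steps 2-6; step 1 is the `background` argument."""
    brainstorms = brainstorm(background)                              # step 2
    entities = extract_entities(background, brainstorms)              # step 3
    related_works = retrieve(background, entities)                    # step 4
    initial_ideas = propose(background, brainstorms, related_works)   # step 5
    return refine(initial_ideas)                                      # step 6


# Toy stand-ins that only show which inputs each step receives.
final = run_pipeline(
    "toy background",
    brainstorm=lambda bg: f"brainstorm({bg})",
    extract_entities=lambda bg, bs: [bg, bs],
    retrieve=lambda bg, ents: [f"paper about {e}" for e in ents],
    propose=lambda bg, bs, works: [f"idea from {w}" for w in works],
    refine=lambda ideas: ideas[:1],
)
```

In step-by-step mode the user may edit any intermediate value before it is handed to the next callable. One-click mode runs the same chain without pausing.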
src/app_pages/locale.json CHANGED
@@ -1,7 +1,11 @@
1
  {
 
 
 
 
2
  "SciPIP will generate ideas in one click. The generation pipeline is the same as step-by-step generation, but you are free from caring about intermediate outputs.": {
3
  "en": "-",
4
- "zh": "SciPIP将一键生成Ideas,用户无需关心中间输出,Ideas生成使用的算法与逐步生成相同。"
5
  },
6
  "1. Input Background": {
7
  "en": "-",
@@ -19,13 +23,13 @@
19
  "en": "-",
20
  "zh": "检索相关工作"
21
  },
22
- "5. Generate Initial Ideas": {
23
  "en": "-",
24
- "zh": "生成初始Ideas"
25
  },
26
- "6. Generate Final Ideas": {
27
  "en": "-",
28
- "zh": "生成最终Ideas"
29
  },
30
  "Pipeline": {
31
  "en": "-",
@@ -33,11 +37,11 @@
33
  },
34
  "Supported Fields": {
35
  "en": "-",
36
- "zh": "支持领域"
37
  },
38
  "The supported fields are temporarily limited because we only collect literature from ICML, ICLR, NeurIPS, ACL, and EMNLP. Support for other fields are in progress.": {
39
  "en": "-",
40
- "zh": "由于当前我们构建的文献库中仅包含过去10年来自ICML、ICLR、NeurIPS、ACL和EMNLP的论文,因此Ideas生成支持的领域暂时有限"
41
  },
42
  "Natural Language Processing (NLP)": {
43
  "en": "-",
@@ -61,7 +65,7 @@
61
  },
62
  "💧 One-click Generation": {
63
  "en": "-",
64
- "zh": "💧 一键生成Idea"
65
  },
66
  "Check Brainstorms": {
67
  "en": "-",
@@ -82,11 +86,11 @@
82
 
83
  "SciPIP will generate ideas step by step. The generation pipeline is the same as one-click generation, while you can improve each part manually after SciPIP providing the manuscript.": {
84
  "en": "-",
85
- "zh": "SciPIP将会逐步生成Ideas,生成使用的算法与一键生成相同,但是用户可以在SciPIP给出中间过程的初稿后修改其中内容。"
86
  },
87
  "💦 Step-by-step Generation": {
88
  "en": "-",
89
- "zh": "💦 逐步生成Idea"
90
  },
91
  "🐳 Background": {
92
  "en": "-",
@@ -96,6 +100,10 @@
96
  "en": "-",
97
  "zh": "提交"
98
  },
 
 
 
 
99
  "👻 Brainstorms": {
100
  "en": "-",
101
  "zh": "👻 头脑风暴"
@@ -110,11 +118,11 @@
110
  },
111
  "😼 Generated Initial Ideas": {
112
  "en": "-",
113
- "zh": "😼 生成初始Ideas"
114
  },
115
  "😸 Generated Final Ideas": {
116
  "en": "-",
117
- "zh": "😸 生成最终Ideas"
118
  },
119
  "Brainstorming...": {
120
  "en": "-",
@@ -130,11 +138,11 @@
130
  },
131
  "Generating initial ideas...": {
132
  "en": "-",
133
- "zh": "生成初步Ideas……"
134
  },
135
  "Generating final ideas...": {
136
  "en": "-",
137
- "zh": "生成最终Ideas……"
138
  },
139
  "Please input the brainstorms on the left.": {
140
  "en": "-",
@@ -150,11 +158,11 @@
150
  },
151
  "Please input the initial ideas on the left.": {
152
  "en": "-",
153
- "zh": "请在左侧修改初始Ideas"
154
  },
155
  "Please input the final ideas on the left.": {
156
  "en": "-",
157
- "zh": "请在左侧修改最终Ideas"
158
  },
159
  "🏠️ Homepage": {
160
  "en": "-",
 
1
  {
2
+ "Make AI research easy": {
3
+ "en": "-",
4
+ "zh": "让AI研究变得简单"
5
+ },
6
  "SciPIP will generate ideas in one click. The generation pipeline is the same as step-by-step generation, but you are free from caring about intermediate outputs.": {
7
  "en": "-",
8
+ "zh": "SciPIP将一键生成创意,用户无需关心中间输出,创意生成使用的算法与逐步生成相同。"
9
  },
10
  "1. Input Background": {
11
  "en": "-",
 
23
  "en": "-",
24
  "zh": "检索相关工作"
25
  },
26
+ "5. Generating Initial Ideas": {
27
  "en": "-",
28
+ "zh": "生成初始创意"
29
  },
30
+ "6. Generating Final Ideas": {
31
  "en": "-",
32
+ "zh": "生成最终创意"
33
  },
34
  "Pipeline": {
35
  "en": "-",
 
37
  },
38
  "Supported Fields": {
39
  "en": "-",
40
+ "zh": "支持的领域"
41
  },
42
  "The supported fields are temporarily limited because we only collect literature from ICML, ICLR, NeurIPS, ACL, and EMNLP. Support for other fields are in progress.": {
43
  "en": "-",
44
+ "zh": "由于当前我们构建的文献库中仅包含过去10年来自ICML、ICLR、NeurIPS、ACL和EMNLP的论文,因此创意生成支持的领域暂时有限"
45
  },
46
  "Natural Language Processing (NLP)": {
47
  "en": "-",
 
65
  },
66
  "💧 One-click Generation": {
67
  "en": "-",
68
+ "zh": "💧 一键生成创意"
69
  },
70
  "Check Brainstorms": {
71
  "en": "-",
 
86
 
87
  "SciPIP will generate ideas step by step. The generation pipeline is the same as one-click generation, while you can improve each part manually after SciPIP providing the manuscript.": {
88
  "en": "-",
89
+ "zh": "SciPIP将会逐步生成创意,生成使用的算法与一键生成相同,但是用户可以在SciPIP给出中间过程的初稿后修改其中内容。"
90
  },
91
  "💦 Step-by-step Generation": {
92
  "en": "-",
93
+ "zh": "💦 逐步生成创意"
94
  },
95
  "🐳 Background": {
96
  "en": "-",
 
100
  "en": "-",
101
  "zh": "提交"
102
  },
103
+ "Example": {
104
+ "en": "-",
105
+ "zh": "例"
106
+ },
107
  "👻 Brainstorms": {
108
  "en": "-",
109
  "zh": "👻 头脑风暴"
 
118
  },
119
  "😼 Generated Initial Ideas": {
120
  "en": "-",
121
+ "zh": "😼 生成初始创意"
122
  },
123
  "😸 Generated Final Ideas": {
124
  "en": "-",
125
+ "zh": "😸 生成最终创意"
126
  },
127
  "Brainstorming...": {
128
  "en": "-",
 
138
  },
139
  "Generating initial ideas...": {
140
  "en": "-",
141
+ "zh": "生成初步创意……"
142
  },
143
  "Generating final ideas...": {
144
  "en": "-",
145
+ "zh": "生成最终创意……"
146
  },
147
  "Please input the brainstorms on the left.": {
148
  "en": "-",
 
158
  },
159
  "Please input the initial ideas on the left.": {
160
  "en": "-",
161
+ "zh": "请在左侧修改初始创意"
162
  },
163
  "Please input the final ideas on the left.": {
164
  "en": "-",
165
+ "zh": "请在左侧修改最终创意"
166
  },
167
  "🏠️ Homepage": {
168
  "en": "-",
src/app_pages/locale.py CHANGED
@@ -4,7 +4,7 @@ import streamlit as st
4
  json_contents = json.loads(open("./src/app_pages/locale.json", "r").read())
5
 
6
  def _(content: str):
7
- if st.session_state["language"] == "en":
8
  return content
9
  a = json_contents.get(content, content)
10
  if isinstance(a, dict):
 
4
  json_contents = json.loads(open("./src/app_pages/locale.json", "r").read())
5
 
6
  def _(content: str):
7
+ if st.session_state.get("language", "en") == "en":
8
  return content
9
  a = json_contents.get(content, content)
10
  if isinstance(a, dict):
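The switch to `st.session_state.get("language", "en")` makes `_` safe to call before any language has been stored. The remainder of `_` is cut off in this hunk, so the sketch below assumes the dictionary branch returns the entry for the active language. A plain `dict` stands in for `st.session_state`, and `translate` and `TRANSLATIONS` are hypothetical names.

```python
# Sample table in the same shape as locale.json; the entries are illustrative.
TRANSLATIONS = {
    "Submit": {"en": "-", "zh": "提交"},
    "Pipeline": {"en": "-"},  # deliberately lacks "zh" to exercise the fallback
}


def translate(content, session_state, table=TRANSLATIONS):
    """Return the localized string, falling back to the English source text."""
    language = session_state.get("language", "en")  # unset language behaves as English
    if language == "en":
        return content
    entry = table.get(content, content)
    if isinstance(entry, dict):
        # Assumption: a missing language key falls back to the source text.
        return entry.get(language, content)
    return entry
```

For example, `translate("Submit", {"language": "zh"})` yields the Chinese label, while an empty session state returns the English key unchanged.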
src/app_pages/one_click_generation.py CHANGED
@@ -20,6 +20,7 @@ if "global_state_one_click" not in st.session_state:
20
  st.session_state["global_state_one_click"] = 1.0
21
 
22
  def generate_sidebar():
 
23
  get_sidebar_header()
24
  st.sidebar.markdown(
25
  _("SciPIP will generate ideas in one click. The generation pipeline is the same as "
@@ -27,14 +28,13 @@ def generate_sidebar():
27
  )
28
 
29
  pipeline_list = [_("1. Input Background"), _("2. Brainstorming"), _("3. Extracting Entities"), _("4. Retrieving Related Works"),
30
- _("5. Generate Initial Ideas"), _("6. Generate Final Ideas")]
31
  st.sidebar.header(_("Pipeline"), divider="red")
32
  for i in range(6):
33
  st.sidebar.markdown(f"<font color='black'>{pipeline_list[i]}</font>", unsafe_allow_html=True)
34
 
35
  get_sidebar_supported_fields()
36
  get_help_us_improve()
37
- get_language_select()
38
 
39
  def generate_mainpage(backend):
40
  st.title(_("💧 One-click Generation"))
@@ -67,10 +67,12 @@ def generate_mainpage(backend):
67
  st.session_state["demo_input"] = demo_input
68
 
69
  cols = st.columns([1, 1, 1, 1])
70
- cols[0].button(_("Example 1"), on_click=get_demo_n, args=(0,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
71
- cols[1].button(_("Example 2"), on_click=get_demo_n, args=(1,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
72
- cols[2].button(_("Example 3"), on_click=get_demo_n, args=(2,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
73
- cols[3].button(_("Example 4"), on_click=get_demo_n, args=(3,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
 
 
74
 
75
  def check_intermediate_outputs(id="brainstorms"):
76
  msg = st.session_state["intermediate_output"].get(id, None)
@@ -81,6 +83,7 @@ def generate_mainpage(backend):
81
 
82
  def reset():
83
  del(st.session_state["messages"])
 
84
  st.session_state["enable_submmit"] = True
85
  st.session_state["global_state_one_click"] = 1.0
86
  st.toast(f"The chat has been reset!")
 
20
  st.session_state["global_state_one_click"] = 1.0
21
 
22
  def generate_sidebar():
23
+ get_language_select()
24
  get_sidebar_header()
25
  st.sidebar.markdown(
26
  _("SciPIP will generate ideas in one click. The generation pipeline is the same as "
 
28
  )
29
 
30
  pipeline_list = [_("1. Input Background"), _("2. Brainstorming"), _("3. Extracting Entities"), _("4. Retrieving Related Works"),
31
+ _("5. Generating Initial Ideas"), _("6. Generating Final Ideas")]
32
  st.sidebar.header(_("Pipeline"), divider="red")
33
  for i in range(6):
34
  st.sidebar.markdown(f"<font color='black'>{pipeline_list[i]}</font>", unsafe_allow_html=True)
35
 
36
  get_sidebar_supported_fields()
37
  get_help_us_improve()
 
38
 
39
  def generate_mainpage(backend):
40
  st.title(_("💧 One-click Generation"))
 
67
  st.session_state["demo_input"] = demo_input
68
 
69
  cols = st.columns([1, 1, 1, 1])
70
+ for i in range(4):
71
+ cols[i].button(_("Example") + f" {i+1}", on_click=get_demo_n, args=(i,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
72
+ # cols[0].button(_("Example 1"), on_click=get_demo_n, args=(0,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
73
+ # cols[1].button(_("Example 2"), on_click=get_demo_n, args=(1,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
74
+ # cols[2].button(_("Example 3"), on_click=get_demo_n, args=(2,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
75
+ # cols[3].button(_("Example 4"), on_click=get_demo_n, args=(3,), use_container_width=True, disabled=not st.session_state.get("enable_submmit", True))
76
 
77
  def check_intermediate_outputs(id="brainstorms"):
78
  msg = st.session_state["intermediate_output"].get(id, None)
 
83
 
84
  def reset():
85
  del(st.session_state["messages"])
86
+ del(st.session_state["intermediate_output"])
87
  st.session_state["enable_submmit"] = True
88
  st.session_state["global_state_one_click"] = 1.0
89
  st.toast(f"The chat has been reset!")
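`reset` now deletes `intermediate_output` as well as `messages`. With `del`, a key that was never set raises `KeyError`. A tolerant variant could use `pop` with a default, as sketched below on a plain `dict` standing in for `st.session_state`. `reset_state` is a hypothetical helper, and the key names are copied from the diff, including the `enable_submmit` spelling.

```python
def reset_state(session_state):
    """Clear per-run keys without raising if one is absent, then restore defaults."""
    for key in ("messages", "intermediate_output"):
        session_state.pop(key, None)  # no KeyError when the key was never set
    session_state["enable_submmit"] = True  # spelled as in the diff
    session_state["global_state_one_click"] = 1.0
    return session_state
```

This matters when the reset button is pressed before the first generation has populated either key.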
src/app_pages/sidebar_components.py CHANGED
@@ -21,15 +21,17 @@ def get_help_us_improve():
21
  st.sidebar.markdown("https://forms.gle/YpLUrhqs1ahyCAe99", unsafe_allow_html=True)
22
 
23
  def get_language_select():
24
- st.sidebar.header("语言 / Language", divider="blue")
25
- language_option = st.sidebar.selectbox(
26
  "选择语言 / Select Language",
27
  options=["中文", "English"],
 
 
28
  )
29
  if language_option == "中文":
30
  language = "zh"
31
  elif language_option == "English":
32
  language = "en"
33
- if language != st.session_state["language"]:
34
  st.session_state["language"] = language
35
  st.rerun()
 
21
  st.sidebar.markdown("https://forms.gle/YpLUrhqs1ahyCAe99", unsafe_allow_html=True)
22
 
23
  def get_language_select():
24
+ language = st.session_state.get("language", "en")
25
+ language_option = st.sidebar.segmented_control(
26
  "选择语言 / Select Language",
27
  options=["中文", "English"],
28
+ selection_mode="single",
29
+ default=("中文" if language == "zh" else "English")
30
  )
31
  if language_option == "中文":
32
  language = "zh"
33
  elif language_option == "English":
34
  language = "en"
35
+ if language != st.session_state.get("language", "en"):
36
  st.session_state["language"] = language
37
  st.rerun()
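The selection logic in `get_language_select` maps a widget label to a language code and triggers a rerun only when the code actually changes. That decision can be isolated from Streamlit as below. A plain `dict` stands in for `st.session_state`, `resolve_language` and `needs_rerun` are hypothetical names, and keeping the current language when no option is selected is an assumption about the intended behaviour.

```python
# Labels and codes as they appear in the diff.
LANGUAGE_CODES = {"中文": "zh", "English": "en"}


def resolve_language(option, current="en"):
    """Map a widget label to a language code; keep `current` for unknown or empty input."""
    return LANGUAGE_CODES.get(option, current)


def needs_rerun(option, session_state):
    """Store the new language and report whether the page should be rerun."""
    current = session_state.get("language", "en")
    chosen = resolve_language(option, current)
    if chosen != current:
        session_state["language"] = chosen
        return True
    return False
```

Separating the mapping from the widget call makes it possible to test the behaviour without a running Streamlit session.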
src/app_pages/step_by_step_generation.py CHANGED
@@ -4,6 +4,7 @@ from .locale import _
4
  from .sidebar_components import get_sidebar_header, get_sidebar_supported_fields, get_help_us_improve, get_language_select
5
 
6
  def generate_sidebar():
 
7
  get_sidebar_header()
8
  st.sidebar.markdown(
9
  _("SciPIP will generate ideas step by step. The generation pipeline is the same as "
@@ -16,7 +17,7 @@ def generate_sidebar():
16
  INPROGRESS_COLOR = "black"
17
  color_list = []
18
  pipeline_list = [_("1. Input Background"), _("2. Brainstorming"), _("3. Extracting Entities"), _("4. Retrieving Related Works"),
19
- _("5. Generate Initial Ideas"), _("6. Generate Final Ideas")]
20
  for i in range(1, 8):
21
  if st.session_state["global_state_step"] < i:
22
  color_list.append(UNDONE_COLOR)
@@ -32,7 +33,6 @@ def generate_sidebar():
32
 
33
  get_sidebar_supported_fields()
34
  get_help_us_improve()
35
- get_language_select()
36
 
37
  def get_textarea_height(text_content):
38
  if text_content is None:
@@ -44,7 +44,6 @@ def get_textarea_height(text_content):
44
  return max(count * 23 + 20, 100) # 23 is a magic number
45
 
46
  def generate_mainpage(backend):
47
- # print("refresh mainpage")
48
  st.title(_("💦 Step-by-step Generation"))
49
  st.header(_("🐳 Background"))
50
  with st.form('background_form') as bg_form:
@@ -55,7 +54,7 @@ def generate_mainpage(backend):
55
  def click_demo_i(i):
56
  st.session_state["background"] = backend.get_demo_i(i)
57
  for i, col in enumerate(cols):
58
- col.form_submit_button(f"Example {i + 1}", use_container_width=True, on_click=click_demo_i, args=(i,))
59
 
60
  col1, col2 = st.columns([2, 20])
61
  submitted = col1.form_submit_button(_("Submit"), type="primary")
@@ -94,16 +93,6 @@ def generate_mainpage(backend):
94
  ## Entities
95
  st.header(_("🐱 Extracted Entities"))
96
  with st.expander("", expanded=st.session_state.get("entities_expand", False)):
97
- ## text area
98
- # col1, col2 = st.columns(2, )
99
- # entities_old = col1.text_area(label="entities", value=st.session_state.get("entities", "[]"), label_visibility="collapsed")
100
- # entities_old = ast.literal_eval(entities_old)
101
- # st.session_state["entities"] = entities_old
102
- # if entities_old:
103
- # col2.markdown(f"{entities_old}")
104
- # else:
105
- # col2.markdown(_("Please input the entities on the left."))
106
-
107
  ## pills
108
  def update_entities():
109
  return
@@ -112,36 +101,33 @@ def generate_mainpage(backend):
112
  entities_updated = st.pills(label="entities", options=ori_entities, selection_mode="multi",
113
  default=ori_entities, label_visibility="collapsed", on_change=update_entities)
114
  st.session_state["entities_updated"] = entities_updated
115
- print("=" * 10)
116
- print(entities_updated)
117
- print(st.session_state["entities_updated"])
118
- print("=" * 10)
119
 
120
  submitted = st.button(_("Submit"), key="entities_button", type="primary")
121
  if submitted:
122
  st.session_state["global_state_step"] = 4.0
123
  with st.spinner(text="Retrieving related works..."):
124
  st.session_state["related_works"], st.session_state["related_works_intact"] = backend.entities2literature_callback(background, entities_updated)
125
- # st.session_state["related_works"] = "related works"
126
  st.session_state["global_state_step"] = 4.5
127
  st.session_state["related_works_expand"] = True
128
 
129
  ## Retrieved related works
130
  st.header(_("📖 Retrieved Related Works"))
131
  with st.expander("", expanded=st.session_state.get("related_works_expand", False)):
132
- col1, col2 = st.columns(2, )
133
- widget_height = get_textarea_height(st.session_state.get("related_works", ""))
134
- related_works_title = col1.text_area(label="related_works", value=st.session_state.get("related_works", ""),
135
- label_visibility="collapsed", height=widget_height)
136
- if related_works_title:
137
- col2.markdown(f"{related_works_title}")
138
- else:
139
- col2.markdown(_("Please input the related works on the left."))
140
- submitted = col1.button(_("Submit"), key="related_works_button", type="primary")
141
  if submitted:
142
  st.session_state["global_state_step"] = 5.0
143
  with st.spinner(text="Generating initial ideas..."):
144
- res = backend.literature2initial_ideas_callback(background, brainstorms, st.session_state["related_works_intact"])
 
 
 
 
145
  st.session_state["initial_ideas"] = res[0]
146
  st.session_state["final_ideas"] = res[1]
147
  # st.session_state["initial_ideas"] = "initial ideas"
 
4
  from .sidebar_components import get_sidebar_header, get_sidebar_supported_fields, get_help_us_improve, get_language_select
5
 
6
  def generate_sidebar():
7
+ get_language_select()
8
  get_sidebar_header()
9
  st.sidebar.markdown(
10
  _("SciPIP will generate ideas step by step. The generation pipeline is the same as "
 
17
  INPROGRESS_COLOR = "black"
18
  color_list = []
19
  pipeline_list = [_("1. Input Background"), _("2. Brainstorming"), _("3. Extracting Entities"), _("4. Retrieving Related Works"),
20
+ _("5. Generating Initial Ideas"), _("6. Generating Final Ideas")]
21
  for i in range(1, 8):
22
  if st.session_state["global_state_step"] < i:
23
  color_list.append(UNDONE_COLOR)
 
33
 
34
  get_sidebar_supported_fields()
35
  get_help_us_improve()
 
36
 
37
  def get_textarea_height(text_content):
38
  if text_content is None:
 
44
  return max(count * 23 + 20, 100) # 23 is a magic number
45
 
46
  def generate_mainpage(backend):
 
47
  st.title(_("💦 Step-by-step Generation"))
48
  st.header(_("🐳 Background"))
49
  with st.form('background_form') as bg_form:
 
54
  def click_demo_i(i):
55
  st.session_state["background"] = backend.get_demo_i(i)
56
  for i, col in enumerate(cols):
57
+ col.form_submit_button(_("Example") + f" {i+1}", use_container_width=True, on_click=click_demo_i, args=(i,))
58
 
59
  col1, col2 = st.columns([2, 20])
60
  submitted = col1.form_submit_button(_("Submit"), type="primary")
 
93
  ## Entities
94
  st.header(_("🐱 Extracted Entities"))
95
  with st.expander("", expanded=st.session_state.get("entities_expand", False)):
 
 
 
 
 
 
 
 
 
 
96
  ## pills
97
  def update_entities():
98
  return
 
101
  entities_updated = st.pills(label="entities", options=ori_entities, selection_mode="multi",
102
  default=ori_entities, label_visibility="collapsed", on_change=update_entities)
103
  st.session_state["entities_updated"] = entities_updated
 
 
 
 
104
 
105
  submitted = st.button(_("Submit"), key="entities_button", type="primary")
106
  if submitted:
107
  st.session_state["global_state_step"] = 4.0
108
  with st.spinner(text="Retrieving related works..."):
109
  st.session_state["related_works"], st.session_state["related_works_intact"] = backend.entities2literature_callback(background, entities_updated)
110
+ st.session_state["related_works_use_state"] = [True] * len(st.session_state["related_works"])
111
  st.session_state["global_state_step"] = 4.5
112
  st.session_state["related_works_expand"] = True
113
 
114
  ## Retrieved related works
115
  st.header(_("📖 Retrieved Related Works"))
116
  with st.expander("", expanded=st.session_state.get("related_works_expand", False)):
117
+ related_works = st.session_state.get("related_works", [])
118
+ for i, rw in enumerate(related_works):
119
+ checked = st.checkbox(rw, value=st.session_state.get("related_works_use_state")[i])
120
+ st.session_state.get("related_works_use_state")[i] = checked
121
+
122
+ submitted = st.button(_("Submit"), key="related_works_button", type="primary")
 
 
 
123
  if submitted:
124
  st.session_state["global_state_step"] = 5.0
125
  with st.spinner(text="Generating initial ideas..."):
126
+ st.session_state["selected_related_works_intact"] = []
127
+ for s, p in zip(st.session_state.get("related_works_use_state"), st.session_state["related_works_intact"]):
128
+ if s:
129
+ st.session_state["selected_related_works_intact"].append(p)
130
+ res = backend.literature2initial_ideas_callback(background, brainstorms, st.session_state["selected_related_works_intact"])
131
  st.session_state["initial_ideas"] = res[0]
132
  st.session_state["final_ideas"] = res[1]
133
  # st.session_state["initial_ideas"] = "initial ideas"
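Reviewer note: the new side of this hunk pairs each checkbox state in `related_works_use_state` with the matching record in `related_works_intact` and passes only the checked records to `literature2initial_ideas_callback`. Below is a minimal sketch of that filtering step outside Streamlit. It assumes plain lists stand in for the `st.session_state` entries; the helper name `select_checked` and the sample records are hypothetical, not part of the commit.

```python
def select_checked(use_state, papers):
    """Return the papers whose matching checkbox flag is truthy."""
    # zip() stops at the shorter input, so a length mismatch between
    # the flags and the records silently drops trailing items.
    return [paper for flag, paper in zip(use_state, papers) if flag]


# Hypothetical stand-ins for the session-state lists.
use_state = [True, False, True]
papers = [{"title": "A"}, {"title": "B"}, {"title": "C"}]
selected = select_checked(use_state, papers)
```

Because the flags list is sized from `related_works` and zipped against `related_works_intact`, the two lists need to stay the same length and order for the selection to match what the user ticked.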
src/generator.py CHANGED
@@ -26,7 +26,7 @@ def extract_problem(problem, background):
26
 
27
  class IdeaGenerator:
28
  def __init__(
29
- self, config, paper_list: list[dict], cue_words: list = None, brainstorm: str = None
30
  ) -> None:
31
  self.api_helper = APIHelper(config)
32
  self.paper_list = paper_list
@@ -405,9 +405,7 @@ def backtracking(config_path, ids_path, retriever_name, brainstorm_mode, use_cue
405
  # 3. Retrieve related papers
406
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
407
  retriever_name,
408
- config,
409
- use_cocite=True,
410
- use_cluster_to_filter=True
411
  )
412
  result = rt.retrieve(
413
  bg, entities_all, need_evaluate=False, target_paper_id_list=[]
@@ -577,9 +575,7 @@ def new_idea(config_path, ids_path, retriever_name, brainstorm_mode, use_inspira
577
  # 2. 检索相关论文
578
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
579
  retriever_name,
580
- config,
581
- use_cocite=config.RETRIEVE.use_cocite,
582
- use_cluster_to_filter=config.RETRIEVE.use_cluster_to_filter,
583
  )
584
  result = rt.retrieve(bg, entities_all, need_evaluate=False, target_paper_id_list=[])
585
  related_paper = result["related_paper"]
 
26
 
27
  class IdeaGenerator:
28
  def __init__(
29
+ self, config, paper_list: list[dict] = [], cue_words: list = None, brainstorm: str = None
30
  ) -> None:
31
  self.api_helper = APIHelper(config)
32
  self.paper_list = paper_list
 
405
  # 3. Retrieve related papers
406
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
407
  retriever_name,
408
+ config
 
 
409
  )
410
  result = rt.retrieve(
411
  bg, entities_all, need_evaluate=False, target_paper_id_list=[]
 
575
  # 2. Retrieve related papers
576
  rt = RetrieverFactory.get_retriever_factory().create_retriever(
577
  retriever_name,
578
+ config
 
 
579
  )
580
  result = rt.retrieve(bg, entities_all, need_evaluate=False, target_paper_id_list=[])
581
  related_paper = result["related_paper"]
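Reviewer note: the new `IdeaGenerator.__init__` signature gives `paper_list` a default of `[]`. Python evaluates a default value once, when the function is defined, so every call that omits the argument receives the same list object. Whether this matters here depends on whether `self.paper_list` is ever mutated in place, which this diff does not show. The sketch below uses two hypothetical classes (not the repository's `IdeaGenerator`) to contrast the shared default with the usual `None` sentinel.

```python
class SharedDefault:
    """Hypothetical class using a mutable default, as in the new signature."""

    def __init__(self, paper_list: list = []):
        # Every instance built without an argument stores the SAME list.
        self.paper_list = paper_list


class NoneSentinel:
    """Hypothetical class using None as the default instead."""

    def __init__(self, paper_list=None):
        # A fresh list is created per instance when nothing is passed.
        self.paper_list = [] if paper_list is None else paper_list


a = SharedDefault()
a.paper_list.append("paper-1")
b = SharedDefault()  # sees the item appended through `a`

c = NoneSentinel()
c.paper_list.append("paper-1")
d = NoneSentinel()  # starts empty
```

The same consideration applies to any other parameter in the diff that defaults to `[]`, if the callee appends to it.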
src/paper_manager.py CHANGED
@@ -163,7 +163,7 @@ class PaperManager:
163
  self.venue_name = venue_name
164
  self.year = year
165
  self.data_type = "train"
166
- self.paper_client = PaperClient(config)
167
  self.paper_crawling = PaperCrawling(config, data_type=self.data_type)
168
  self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
169
  self.embedding_model = get_embedding_model(config)
 
163
  self.venue_name = venue_name
164
  self.year = year
165
  self.data_type = "train"
166
+ self.paper_client = PaperClient()
167
  self.paper_crawling = PaperCrawling(config, data_type=self.data_type)
168
  self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
169
  self.embedding_model = get_embedding_model(config)
src/utils/paper_retriever.py CHANGED
@@ -605,7 +605,7 @@ class KGRetriever(Retriever):
605
  }
606
  return result
607
 
608
- def retrieve(self, bg, entities, need_evaluate=True, target_paper_id_list=[]):
609
  """
610
  Args:
611
  context: string
 
605
  }
606
  return result
607
 
608
+ def retrieve(self, bg, entities, need_evaluate=False, target_paper_id_list=[]):
609
  """
610
  Args:
611
  context: string