huggingface init
README.md
# kth-qa

## Env file
Create a file called ``.env`` in the main folder (``kth-qa``) and add your OpenAI API key.
The content should look like this:
```
OPENAI_API_KEY=YOUR_KEY
OPENAI_CHAT_MODEL=gpt-3.5-turbo
```
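
To illustrate how these two values might be consumed, here is a minimal standard-library sketch. It is not taken from the project, which may load ``.env`` differently (for example through a dotenv package); the helper names ``load_env_file`` and ``get_openai_settings`` are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_openai_settings(path=".env"):
    """Return (api_key, chat_model); real environment variables win over the file."""
    file_values = load_env_file(path)
    api_key = os.environ.get("OPENAI_API_KEY", file_values.get("OPENAI_API_KEY"))
    chat_model = os.environ.get("OPENAI_CHAT_MODEL", file_values.get("OPENAI_CHAT_MODEL"))
    # Assumption: the placeholder from the README is treated as "not configured".
    if not api_key or api_key == "YOUR_KEY":
        raise RuntimeError("Set OPENAI_API_KEY in .env to a real key")
    return api_key, chat_model
```

Calling ``get_openai_settings()`` from the ``kth-qa`` folder fails early with a clear message when the key is missing or still set to the placeholder.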

## How to run
1. Install Poetry (https://python-poetry.org/)
    - For M1 Macs: https://github.com/rybodiddly/Poetry-Pyenv-Homebrew-Numpy-TensorFlow-on-Apple-Silicon-M1
2. cd into ``kth-qa``
3. Run ``poetry install``
4. Make sure the Poetry venv (``.venv``) is activated. If not, run ``poetry shell``
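
If you are unsure whether the venv from step 4 is active, a small check like the following can help. This is a generic Python sketch, not part of the project; the function name ``in_virtualenv`` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; try `poetry shell`")
```

When the venv is active, the printed path should end in ``.venv`` if Poetry is configured to create the environment inside the project folder.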

### Scrape files
1. Run ``make courses``

    This runs
    1. ``python webscraping/scrape_info.py``, which scrapes the list of KTH courses
    2. ``python webscraping/scrape_course.py``, which scrapes the course pages

You can limit the number of courses scraped in both files by changing the variable ``limit``.
You can also select what languages to scrape in

### Ingest files
1. Put ``.txt`` files in ``kth_qa/files/en`` or ``kth_qa/files/sv``
2. From the repository root, run ``make ingest``
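
Before running the ingest step, it can be useful to confirm that the input folders actually contain files. The sketch below only assumes the two folder paths named above; the function ``count_txt_files`` is a hypothetical helper and not part of the project.

```python
from pathlib import Path


def count_txt_files(files_root="kth_qa/files", languages=("en", "sv")):
    """Count .txt files per language folder; missing folders count as 0."""
    counts = {}
    for lang in languages:
        folder = Path(files_root) / lang
        counts[lang] = len(list(folder.glob("*.txt"))) if folder.is_dir() else 0
    return counts


if __name__ == "__main__":
    counts = count_txt_files()
    for lang, n in counts.items():
        print(f"{lang}: {n} .txt file(s)")
    if not any(counts.values()):
        print("No .txt files found; add some before running `make ingest`")
```

Run it from the repository root so that the relative path ``kth_qa/files`` resolves correctly.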

### Run site
Run either
- ``python kth_qa/main.py`` or
- ``make start``

## How to search
- If you just want to test the search functionality, you can search for "test"
- You can also define your own test searches in ``test_queries.json``