Commit History

ECS not allowing me to save files so increasing container privileges in Dockerfile
9c0a094

Sean-Case committed on

Now loads in embedding model locally in Dockerfile
3034296

Sean-Case committed on

Improved code for cleaning and outputting files. Added Dockerfile
4ee3470

Sean-Case committed on

Improved xlsx output formatting. Deals better with cleaning data then analysing in same session.
352c02a

Sean-Case committed on

Added highlight search term functionality to keyword search output
36a404e

seanpedrickcase committed on

Updated to Gradio 4.16.0. Now works correctly with BGE embeddings
2bcd818

seanpedrickcase committed on

Upgraded to Gradio 4.16.0. Added Spacy fuzzy search functionality.
4ce2224

Sean-Case committed on

Changed intro text
8c115b3

Sean-Case committed on

Cut out semantic search temporarily while issues with the Jina gated model are resolved. Improved error/progress tracking and messaging. Placeholder for Spacy fuzzy search.
739b386

seanpedrickcase committed on

Better error checking. Doesn't load in embeddings file twice now.
63049fe

Sean-Case committed on

Fixed data input for semantic search. Allowed for docs to be loaded in directly for semantic search. 0.2.1
3df8e40

Sean-Case committed on

Minor changes to file path for outputs, documentation, location of pyinstaller build dependencies
200480d

seanpedrickcase committed on

Many changes to code organisation. More efficient searches from using intermediate outputs. Version 0.1
99d6fba

seanpedrickcase committed on

Now works correctly with npz. Minor formatting improvements
d3b1ac5

seanpedrickcase committed on

Faster embedding with GPU, fast document split, writes to chromadb file correctly. No longer needs FAISS or langchain
2cb9977

seanpedrickcase committed on

Now outputs correct dataframe for semantic search. Can join on extra details
2a8aba8

Sean-Case committed on

Added basic semantic search functionality
78d71d4

Sean-Case committed on

Added nltk punkt load
ba838fc

Sean-Case committed on

Added stopwords and wordnet nltk dependencies
c1da670

Sean-Case committed on

Added nltk download for names for HF use
7e5fca9

Sean-Case committed on

Updates to readme file. Changed app file name to work with HF.
5c04910

Sean-Case committed on