data_text_search/search_funcs/helper_functions.py

Commit History

Improvements with embeddings load and file save
650da6e

seanpedrickcase committed on

Minor bug fix to connections parameter function
b1c3d49

seanpedrickcase committed on

Cognito authorisation option added to app, some other minor changes.
759001a

seanpedrickcase committed on

When running on cloud now checks for relevant header details on load
f9e3451

seanpedrickcase committed on

Now accepts .zip file as inputs. Moved semantic search option bar. Minor API mode changes.
7f029b5

seanpedrickcase committed on

Changed embedding model to MiniLM-L6 as it is faster. Compressed embeddings are now int8. General improvements to API mode
ea0dd40

seanpedrickcase committed on

General code improvements and refinements.
a95ef9f

seanpedrickcase committed on

Set bm25 in functions explicitly. Some API updates. Now can get connection params on startup.
2393537

seanpedrickcase committed on

Some package updates and minor changes
2754a2b

seanpedrickcase committed on

Changed all intermediate file outputs to save to output folder
fea085c

seanpedrickcase committed on

Allowed for custom output folder, returned Dockerfile to work under user account and port 7860
d3ff2e2

seanpedrickcase committed on

Now checks for output folder before saving. Minor code cleaning
2089141

seanpedrickcase committed on

Gradio 4.21. Limitations on file size and creating embeddings. Added AWS integration
e0fe055

seanpedrickcase committed on

Improved code for cleaning and outputting files. Added Dockerfile
4ee3470

Sean-Case committed on

Improved xlsx output formatting. Deals better with cleaning data then analysing in same session.
352c02a

Sean-Case committed on

Added highlight search term functionality to keyword search output
36a404e

seanpedrickcase committed on

Updated to Gradio 4.16.0. Now works correctly with BGE embeddings
2bcd818

seanpedrickcase committed on

Upgraded to Gradio 4.16.0. Added Spacy fuzzy search functionality.
4ce2224

Sean-Case committed on

Cut out semantic search temporarily while issues with Jina gated model resolved. Improved error/progress tracking and messaging. Placeholder for Spacy fuzzy search.
739b386

seanpedrickcase committed on

Better error checking. Doesn't load in embeddings file twice now.
63049fe

Sean-Case committed on

Fixed data input for semantic search. Allowed for docs to be loaded in directly for semantic search. 0.2.1
3df8e40

Sean-Case committed on

Minor changes to file path for outputs, documentation, location of pyinstaller build dependencies
200480d

seanpedrickcase committed on

Many changes to code organisation. More efficient searches from using intermediate outputs. Version 0.1
99d6fba

seanpedrickcase committed on