data_text_search/search_funcs/semantic_ingest_functions.py

Commit History

Changed embedding model to MiniLM-L6 as it is faster. Compressed embeddings are now int8. General improvements to API mode

ea0dd40

seanpedrickcase committed on Jul 3

Changed all intermediate file outputs to save to output folder

fea085c

seanpedrickcase committed on Jun 5

Allowed for custom output folder, returned Dockerfile to work under user account and port 7860

d3ff2e2

seanpedrickcase committed on Jun 4

Now checks for output folder before saving. Minor code cleaning

2089141

seanpedrickcase committed on May 20

Fixed cleaning for semantic search. Handles text containing backslashes (if cleaned). Updated packages. Added a requirements file for keyword-only search.

8466e45

seanpedrickcase committed on May 20

Improved code for cleaning and outputting files. Added Dockerfile

4ee3470

Sean-Case committed on Feb 27

Improved xlsx output formatting. Deals better with cleaning data and then analysing it in the same session.

352c02a

Sean-Case committed on Feb 16

Updated to Gradio 4.16.0. Now works correctly with BGE embeddings

2bcd818

seanpedrickcase committed on Feb 5

Cut out semantic search temporarily while issues with the Jina gated model are resolved. Improved error/progress tracking and messaging. Placeholder for Spacy fuzzy search.

739b386

seanpedrickcase committed on Feb 2

Better error checking. No longer loads the embeddings file twice.

63049fe

Sean-Case committed on Jan 31

Fixed data input for semantic search. Allowed for docs to be loaded in directly for semantic search. 0.2.1

3df8e40

Sean-Case committed on Jan 30

Minor changes to file path for outputs, documentation, location of pyinstaller build dependencies

200480d

seanpedrickcase committed on Jan 11

Many changes to code organisation. More efficient searches from using intermediate outputs. Version 0.1

99d6fba

seanpedrickcase committed on Jan 10