Spaces: seanpedrickcase/topic_modelling
topic_modelling/funcs
3 contributors
History: 30 commits
Latest commit e1c1f68 by Sonnyjim, about 1 year ago:
Reduce outliers now more efficient and relabels with correct vectoriser. Default topic labels now tidier. Hierarchical topic outputs more useful for joining to df afterwards. Switched low resource reduction algorithm to UMAP, as the default is not good.
| File | Scan status | Size | Last commit message | Last updated |
| --- | --- | --- | --- | --- |
| __init__.py | Safe | 0 Bytes | first commit | about 1 year ago |
| anonymiser.py | Safe | 10.2 kB | Added clean data options, improved re-representation options and visualisation. General format changes | about 1 year ago |
| bertopic_vis_documents.py | Safe | 47.2 kB | Reduce outliers now more efficient and relabels with correct vectoriser. Default topic labels now tidier. Hierarchical topic outputs more useful for joining to df afterwards. Switched low resource reduction algorithm to UMAP, as the default is not good. | about 1 year ago |
| clean_funcs.py | Safe | 5.03 kB | Reduce outliers now more efficient and relabels with correct vectoriser. Default topic labels now tidier. Hierarchical topic outputs more useful for joining to df afterwards. Switched low resource reduction algorithm to UMAP, as the default is not good. | about 1 year ago |
| embeddings.py | Safe | 2.54 kB | Hopefully now LLM download from hub should work | about 1 year ago |
| helper_functions.py | Safe | 9.91 kB | Should now parse custom regex correctly. Will now wipe previously created embeddings if 'low resource mode' option switched. | about 1 year ago |
| presidio_analyzer_custom.py | Safe | 4.18 kB | Added clean data options, improved re-representation options and visualisation. General format changes | about 1 year ago |
| prompts.py | Safe | 4.86 kB | Model export changed to safetensors. Improved representational model function. Got zero shot topic modelling working | about 1 year ago |
| representation_model.py | Safe | 6.74 kB | Hopefully now LLM download from hub should work | about 1 year ago |
| topic_core_funcs.py | Safe | 25.1 kB | Reduce outliers now more efficient and relabels with correct vectoriser. Default topic labels now tidier. Hierarchical topic outputs more useful for joining to df afterwards. Switched low resource reduction algorithm to UMAP, as the default is not good. | about 1 year ago |