Reduce outliers is now more efficient and relabels with the correct vectoriser. Default topic labels are now tidier. Hierarchical topic outputs are now more useful for joining to a df afterwards. Switched the low resource reduction algorithm to UMAP, as the default is not good. e1c1f68 Sonnyjim committed on Feb 7, 2024
Should now parse custom regex correctly. Will now wipe previously created embeddings if the 'low resource mode' option is switched. 0a543a0 Sean-Case committed on Feb 7, 2024
Allowed for uploading custom regex for cleaning. Fixed calculate all probabilities and reduce outliers. Added text tree for hierarchical modelling. 381f959 Sonnyjim committed on Feb 6, 2024
Added clean data options, improved re-representation options and visualisation. General format changes. 4effac0 Sonnyjim committed on Feb 2, 2024