BERTopic model card bias topic model

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("davanstrien/BERTopic_model_card_bias")

topic_model.get_topic_info()
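
Beyond listing the topics, a loaded model can assign topics to new text. The following is a minimal sketch, assuming the loaded model can embed new documents (that is, its embedding model is available locally). The helper name `describe_predictions` and its `top_n` parameter are illustrative and not part of BERTopic; only `transform` and `get_topic` are BERTopic methods.

```python
def describe_predictions(topic_model, docs, top_n=5):
    """Assign each document to a topic and list that topic's top keywords.

    `topic_model` is expected to behave like a fitted BERTopic model:
    `transform(docs)` returns (topics, probabilities) and
    `get_topic(topic_id)` returns a list of (word, score) pairs,
    or False when the topic does not exist.
    """
    topics, _ = topic_model.transform(docs)
    rows = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        words = topic_model.get_topic(topic_id) or []
        keywords = [word for word, _score in words[:top_n]]
        rows.append((doc, topic_id, keywords))
    return rows


# Usage with the model from this card (needs `bertopic` and network access):
#   from bertopic import BERTopic
#   topic_model = BERTopic.load("davanstrien/BERTopic_model_card_bias")
#   docs = ["The model was trained on unfiltered internet data."]
#   for doc, topic_id, keywords in describe_predictions(topic_model, docs):
#       print(topic_id, keywords)
```

A topic ID of -1 means the document was treated as an outlier rather than assigned to one of the numbered topics.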

Topic overview

  • Number of topics: 11
  • Number of training documents: 1271
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | evaluation - claim - reasoning - parameters - university | 13 | -1_evaluation_claim_reasoning_parameters |
| 0 | checkpoint - fairly - characterized - even - sectionhttpshuggingfacecobertbaseuncased | 13 | 0_checkpoint_fairly_characterized_even |
| 1 | generative - research - uses - processes - artistic | 137 | 1_generative_research_uses_processes |
| 2 | checkpoint - try - snippet - sectionhttpshuggingfacecobertbaseuncased - limitation | 48 | 2_checkpoint_try_snippet_sectionhttpshuggingfacecobertbaseuncased |
| 3 | meant - technical - sociotechnical - convey - needed | 32 | 3_meant_technical_sociotechnical_convey |
| 4 | gpt2 - team - their - cardhttpsgithubcomopenaigpt2blobmastermodelcardmd - worked | 32 | 4_gpt2_team_their_cardhttpsgithubcomopenaigpt2blobmastermodelcardmd |
| 5 | datasets - internet - unfiltered - therefore - lot | 27 | 5_datasets_internet_unfiltered_therefore |
| 6 | dacy - danish - pipelines - transformer - bert | 25 | 6_dacy_danish_pipelines_transformer |
| 7 | your - pythia - branch - checkpoints - provide | 20 | 7_your_pythia_branch_checkpoints |
| 8 | opt - trained - large - software - code | 15 | 8_opt_trained_large_software |
| 9 | al - et - identity - occupational - groups | 15 | 9_al_et_identity_occupational |
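
Each label in the overview is the topic ID followed by the topic's first four keywords, joined with underscores. A small sketch of splitting a label back into those parts is shown here; `parse_topic_label` is a hypothetical helper, and it assumes that no keyword itself contains an underscore.

```python
def parse_topic_label(label):
    """Split a label such as '3_meant_technical_sociotechnical_convey'
    into its integer topic ID and its list of keywords."""
    topic_id, _, rest = label.partition("_")
    keywords = rest.split("_") if rest else []
    return int(topic_id), keywords


topic_id, keywords = parse_topic_label("-1_evaluation_claim_reasoning_parameters")
print(topic_id, keywords)
```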

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
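
The values above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch, a model with the same settings could be configured as follows. The names `HYPERPARAMETERS` and `build_topic_model` are illustrative. The embedding model, the UMAP and HDBSCAN settings, and the training documents are not listed in this section, so this would not by itself reproduce the published topics.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model(**overrides):
    """Create an unfitted BERTopic model with the card's hyperparameters.

    Keyword arguments override individual values, for example
    build_topic_model(verbose=True).
    """
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    params = {**HYPERPARAMETERS, **overrides}
    return BERTopic(**params)


# Fitting needs the training documents, which this card does not include:
#   topic_model = build_topic_model()
#   topics, probs = topic_model.fit_transform(docs)
```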

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.29
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.29.0
  • Numba: 0.56.4
  • Plotly: 5.13.1
  • Python: 3.10.11
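
Loading a saved model in an environment with different library versions can sometimes cause problems, so it may help to compare the installed packages against the list above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are assumptions about how these packages were installed, and `compare_versions` is an illustrative helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.0",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def compare_versions(expected=None):
    """Return {package: (expected, installed, matches)} for each package.

    `installed` is None when the package is not installed.
    """
    expected = CARD_VERSIONS if expected is None else expected
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```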