cnn_dailymail_108_3000_1500_train
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/cnn_dailymail_108_3000_1500_train")
topic_model.get_topic_info()
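
Beyond listing the topics, a loaded model can assign topics to new text. The sketch below is an illustration, not part of the original card: `top_keywords_for` is a hypothetical helper name, and it relies only on BERTopic's `transform` and `get_topic` methods. It assumes the loaded model has an embedding model available; if the repository does not store one, pass `embedding_model=` to `BERTopic.load`.

```python
def top_keywords_for(docs, topic_model, n_words=5):
    """Return (topic_id, top keywords) for each document.

    `topic_model` is expected to be a loaded BERTopic model; the helper
    itself is hypothetical and only wraps `transform` and `get_topic`.
    """
    # transform() returns the predicted topic per document plus probabilities.
    topics, _probs = topic_model.transform(docs)
    results = []
    for topic_id in topics:
        # get_topic() returns a list of (word, score) pairs,
        # or a falsy value when the topic id is unknown.
        words = topic_model.get_topic(int(topic_id)) or []
        results.append((int(topic_id), [word for word, _score in words[:n_words]]))
    return results


# Usage (requires network access and `pip install -U bertopic`):
#   from bertopic import BERTopic
#   topic_model = BERTopic.load("KingKazma/cnn_dailymail_108_3000_1500_train")
#   print(top_keywords_for(["Federer beat Murray in straight sets."], topic_model))
```

A topic ID of -1 in the result means the document was treated as an outlier rather than assigned to one of the 50 regular topics.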
Topic overview
- Number of topics: 51 (topics 0 to 49 plus the outlier topic -1)
- Number of training documents: 3000
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | said - one - people - year - would | 10 | -1_said_one_people_year |
0 | league - player - cup - club - game | 954 | 0_league_player_cup_club |
1 | police - said - court - told - murder | 308 | 1_police_said_court_told |
2 | dog - animal - cat - elephant - zoo | 290 | 2_dog_animal_cat_elephant |
3 | mr - minister - labour - cameron - prime | 113 | 3_mr_minister_labour_cameron |
4 | obama - clinton - president - republican - campaign | 104 | 4_obama_clinton_president_republican |
5 | school - teacher - student - nfl - said | 84 | 5_school_teacher_student_nfl |
6 | food - milk - drink - wine - bottle | 72 | 6_food_milk_drink_wine |
7 | flight - plane - passenger - pilot - aircraft | 49 | 7_flight_plane_passenger_pilot |
8 | user - facebook - google - ipad - device | 48 | 8_user_facebook_google_ipad |
9 | olympic - gold - race - games - medal | 46 | 9_olympic_gold_race_games |
10 | doll - dress - fashion - look - style | 44 | 10_doll_dress_fashion_look |
11 | afghan - afghanistan - taliban - military - pakistan | 43 | 11_afghan_afghanistan_taliban_military |
12 | transplant - patient - heart - hospital - cancer | 42 | 12_transplant_patient_heart_hospital |
13 | iran - syrian - said - president - egypt | 42 | 13_iran_syrian_said_president |
14 | show - film - million - like - movie | 39 | 14_show_film_million_like |
15 | property - house - price - home - apartment | 38 | 15_property_house_price_home |
16 | earth - asteroid - moon - volcano - planet | 34 | 16_earth_asteroid_moon_volcano |
17 | federer - djokovic - match - murray - seed | 33 | 17_federer_djokovic_match_murray |
18 | jackson - jacksons - album - song - music | 31 | 18_jackson_jacksons_album_song |
19 | ship - boat - coast - said - vessel | 30 | 19_ship_boat_coast_said |
20 | russia - russian - putin - ukraine - moscow | 30 | 20_russia_russian_putin_ukraine |
21 | snow - weather - temperature - climate - water | 29 | 21_snow_weather_temperature_climate |
22 | police - station - mr - man - gang | 28 | 22_police_station_mr_man |
23 | ebola - disease - vaccine - virus - health | 28 | 23_ebola_disease_vaccine_virus |
24 | weight - fat - diet - burn - exercise | 28 | 24_weight_fat_diet_burn |
25 | syria - isis - islamic - muslims - alqudsi | 23 | 25_syria_isis_islamic_muslims |
26 | boko - haram - nigeria - nigerian - turkana | 23 | 26_boko_haram_nigeria_nigerian |
27 | korea - north - korean - kim - pyongyang | 22 | 27_korea_north_korean_kim |
28 | driver - driving - road - car - speed | 22 | 28_driver_driving_road_car |
29 | school - child - education - internet - english | 21 | 29_school_child_education_internet |
30 | mcilroy - woods - pga - tournament - round | 20 | 30_mcilroy_woods_pga_tournament |
31 | race - car - driver - team - f1 | 19 | 31_race_car_driver_team |
32 | princess - prince - diana - royal - palace | 18 | 32_princess_prince_diana_royal |
33 | climbing - climb - mountain - everest - ang | 18 | 33_climbing_climb_mountain_everest |
34 | wedding - bieber - couple - together - love | 18 | 34_wedding_bieber_couple_together |
35 | nhs - care - patient - hospital - health | 17 | 35_nhs_care_patient_hospital |
36 | iraq - iraqi - isis - baghdad - kurdish | 16 | 36_iraq_iraqi_isis_baghdad |
37 | cartel - drug - mexican - mexico - crack | 15 | 37_cartel_drug_mexican_mexico |
38 | painting - picasso - art - artist - gogh | 15 | 38_painting_picasso_art_artist |
39 | castro - zelaya - fidel - micheletti - president | 14 | 39_castro_zelaya_fidel_micheletti |
40 | french - ford - traveller - southampton - taxi | 14 | 40_french_ford_traveller_southampton |
41 | fire - florissant - bell - firefighter - burned | 14 | 41_fire_florissant_bell_firefighter |
42 | fight - ali - heavyweight - pacquiao - title | 13 | 42_fight_ali_heavyweight_pacquiao |
43 | fish - sea - jellyfish - manta - swell | 13 | 43_fish_sea_jellyfish_manta |
44 | pope - francis - vatican - falkland - islands | 12 | 44_pope_francis_vatican_falkland |
45 | gay - samesex - lgbt - marriage - state | 12 | 45_gay_samesex_lgbt_marriage |
46 | castle - tower - building - brent - lego | 12 | 46_castle_tower_building_brent |
47 | chinese - china - xinhua - chinas - communist | 12 | 47_chinese_china_xinhua_chinas |
48 | delivery - customer - market - vacuum - coin | 10 | 48_delivery_customer_market_vacuum |
49 | water - rain - storm - flooding - methane | 10 | 49_water_rain_storm_flooding |
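
The frequencies in the table are heavily skewed toward a few topics. As a small worked example, the snippet below computes each topic's share of the 3000 training documents, using counts copied from the table for the outlier topic and the three largest topics. The names `N_DOCS`, `COUNTS`, and `share` are illustrative only.

```python
# Counts copied from the topic table above (topic id -> topic frequency).
N_DOCS = 3000
COUNTS = {-1: 10, 0: 954, 1: 308, 2: 290}


def share(count, total=N_DOCS):
    """Percentage of training documents that fall into a topic."""
    return 100.0 * count / total


for topic_id, count in COUNTS.items():
    print(f"topic {topic_id}: {share(count):.1f}% of documents")

# Combined share of the three largest topics (0, 1 and 2).
top_three = share(COUNTS[0] + COUNTS[1] + COUNTS[2])
print(f"topics 0-2 combined: {top_three:.1f}%")
```

Topic 0 (football) alone covers about 31.8% of the documents, topics 0 to 2 together cover just over half, and only 10 documents (about 0.3%) were left as outliers.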
Training hyperparameters
- calculate_probabilities: True
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: None
- top_n_words: 10
- verbose: False
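
The values listed above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how they could be passed when training a comparable model. It is not the author's training script: the card does not state the embedding model, the UMAP or HDBSCAN settings, or the preprocessing, so this configuration alone is not expected to reproduce the exact topics. `HYPERPARAMS` and `build_topic_model` are hypothetical names.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model(**overrides):
    """Create an untrained BERTopic model with the card's hyperparameters.

    Assumption: sub-models (embedding, UMAP, HDBSCAN) are left at BERTopic's
    defaults because the card does not list them.
    """
    # Imported lazily so the configuration can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMS, **overrides})


# Usage: topic_model = build_topic_model(verbose=True)
#        topics, probs = topic_model.fit_transform(list_of_documents)
```

Note that `min_topic_size: 10` matches the table, where the smallest topics (48 and 49) contain exactly 10 documents.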
Framework versions
- Numpy: 1.22.4
- HDBSCAN: 0.8.33
- UMAP: 0.5.3
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.31.0
- Numba: 0.56.4
- Plotly: 5.13.1
- Python: 3.10.6
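
Loading a saved topic model can be sensitive to library versions, so it may help to compare the local environment against the versions listed above. The following sketch uses the standard-library `importlib.metadata` for that check. The mapping from the names in the list to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption made here, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI distribution name.
PINNED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def version_mismatches(pinned=PINNED):
    """Return {distribution: (wanted, installed)} for every differing package.

    `installed` is None when the distribution is not installed at all.
    """
    report = {}
    for dist, wanted in pinned.items():
        try:
            installed = version(dist)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            report[dist] = (wanted, installed)
    return report


# Usage: print(version_mismatches()) lists every package that differs from the card.
```

A mismatch does not necessarily mean the model will fail to load; the report is only a starting point if loading or inference behaves unexpectedly.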