---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# cnn_dailymail_6789_3000_1500_train

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/cnn_dailymail_6789_3000_1500_train")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 54
* Number of training documents: 3000


Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - people - one - police - year | 10 | -1_said_people_one_police |
| 0 | player - league - cup - club - game | 1072 | 0_player_league_cup_club |
| 1 | police - said - death - murder - found | 291 | 1_police_said_death_murder |
| 2 | obama - president - republicans - house - republican | 152 | 2_obama_president_republicans_house |
| 3 | labour - mr - cameron - minister - prime | 98 | 3_labour_mr_cameron_minister |
| 4 | hospital - baby - surgery - heart - doctor | 77 | 4_hospital_baby_surgery_heart |
| 5 | iphone - apple - user - device - phone | 74 | 5_iphone_apple_user_device |
| 6 | doll - fashion - look - collection - like | 69 | 6_doll_fashion_look_collection |
| 7 | syria - isis - syrian - iraq - iraqi | 46 | 7_syria_isis_syrian_iraq |
| 8 | pakistan - taliban - al - drone - afghanistan | 45 | 8_pakistan_taliban_al_drone |
| 9 | food - restaurant - menu - burger - coffee | 43 | 9_food_restaurant_menu_burger |
| 10 | car - driver - vehicle - crash - driving | 41 | 10_car_driver_vehicle_crash |
| 11 | space - tower - car - airport - nasa | 40 | 11_space_tower_car_airport |
| 12 | property - house - home - apartment - room | 40 | 12_property_house_home_apartment |
| 13 | school - rape - sexual - student - sex | 36 | 13_school_rape_sexual_student |
| 14 | nfl - rice - quarterback - said - coach | 36 | 14_nfl_rice_quarterback_said |
| 15 | music - album - song - miley - cnn | 33 | 15_music_album_song_miley |
| 16 | olympic - gold - olympics - athlete - world | 33 | 16_olympic_gold_olympics_athlete |
| 17 | zoo - bear - tian - elephant - ivory | 33 | 17_zoo_bear_tian_elephant |
| 18 | flight - plane - aircraft - pilot - airport | 32 | 18_flight_plane_aircraft_pilot |
| 19 | flu - bacteria - vaccine - health - disease | 31 | 19_flu_bacteria_vaccine_health |
| 20 | dog - animal - pet - cat - dogs | 30 | 20_dog_animal_pet_cat |
| 21 | school - education - exam - child - degree | 30 | 21_school_education_exam_child |
| 22 | kenya - kenyan - mall - said - nairobi | 28 | 22_kenya_kenyan_mall_said |
| 23 | cent - per - price - cadbury - christmas | 27 | 23_cent_per_price_cadbury |
| 24 | french - france - sarkozy - hollande - minister | 26 | 24_french_france_sarkozy_hollande |
| 25 | russian - ukraine - russia - putin - ukrainian | 25 | 25_russian_ukraine_russia_putin |
| 26 | iran - nuclear - iranian - israel - irans | 24 | 26_iran_nuclear_iranian_israel |
| 27 | film - bond - novel - the - cnn | 24 | 27_film_bond_novel_the |
| 28 | lava - fire - snow - pahoa - volcano | 24 | 28_lava_fire_snow_pahoa |
| 29 | drug - mexican - chavez - cartel - said | 23 | 29_drug_mexican_chavez_cartel |
| 30 | ship - vessel - captain - crew - coast | 23 | 30_ship_vessel_captain_crew |
| 31 | snowden - us - intelligence - information - gebregeorgis | 23 | 31_snowden_us_intelligence_information |
| 32 | match - wimbledon - federer - final - open | 22 | 32_match_wimbledon_federer_final |
| 33 | chinese - china - beijing - hong - protester | 21 | 33_chinese_china_beijing_hong |
| 34 | jury - white - ferguson - police - said | 21 | 34_jury_white_ferguson_police |
| 35 | weather - temperature - rain - warm - park | 21 | 35_weather_temperature_rain_warm |
| 36 | prince - royal - william - princess - queen | 20 | 36_prince_royal_william_princess |
| 37 | weight - fat - diet - gym - size | 19 | 37_weight_fat_diet_gym |
| 38 | golf - mcilroy - round - pga - championship | 19 | 38_golf_mcilroy_round_pga |
| 39 | hamilton - race - rosberg - prix - button | 19 | 39_hamilton_race_rosberg_prix |
| 40 | north - kim - korean - korea - koreas | 18 | 40_north_kim_korean_korea |
| 41 | human - found - fossil - ancient - fish | 18 | 41_human_found_fossil_ancient |
| 42 | climate - change - global - energy - wind | 17 | 42_climate_change_global_energy |
| 43 | school - teacher - pupil - schools - ofsted | 17 | 43_school_teacher_pupil_schools |
| 44 | ebola - virus - health - outbreak - liberia | 17 | 44_ebola_virus_health_outbreak |
| 45 | whale - nyad - shark - swim - beach | 17 | 45_whale_nyad_shark_swim |
| 46 | money - kallakis - foster - court - wines | 15 | 46_money_kallakis_foster_court |
| 47 | painting - art - portrait - auction - artist | 14 | 47_painting_art_portrait_auction |
| 48 | solar - planet - sun - bubble - earth | 14 | 48_solar_planet_sun_bubble |
| 49 | tsarnaev - oswald - boston - marathon - kennedy | 14 | 49_tsarnaev_oswald_boston_marathon |
| 50 | patient - care - va - hospital - patients | 14 | 50_patient_care_va_hospital |
| 51 | love - woman - im - relationship - men | 13 | 51_love_woman_im_relationship |
| 52 | marijuana - alcohol - drug - hangover - liver | 11 | 52_marijuana_alcohol_drug_hangover |

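
Each label in the table is the topic ID followed by the first four topic keywords, joined with underscores. The snippet below is a minimal sketch of splitting such a label back into its parts. `parse_label` is a hypothetical helper, not part of BERTopic, and it assumes that no keyword itself contains an underscore.

```python
def parse_label(label: str) -> tuple[int, list[str]]:
    """Split a label such as '0_player_league_cup_club' into (topic_id, keywords).

    Hypothetical helper for illustration; assumes keywords contain no underscores.
    """
    topic_id, *keywords = label.split("_")
    return int(topic_id), keywords


# Labels copied from the table above.
print(parse_label("0_player_league_cup_club"))   # (0, ['player', 'league', 'cup', 'club'])
print(parse_label("-1_said_people_one_police"))  # (-1, ['said', 'people', 'one', 'police'])
```

Topic ID -1 is the group BERTopic uses for outlier documents that were not assigned to any cluster.
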

## Training hyperparameters

* calculate_probabilities: True
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False

## Framework versions

* Numpy: 1.22.4
* HDBSCAN: 0.8.33
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.31.0
* Numba: 0.56.4
* Plotly: 5.13.1
* Python: 3.10.6
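
The entries under "Training hyperparameters" have the same names as keyword arguments of the `BERTopic` constructor. The sketch below collects them in a dictionary and passes them to a fresh, unfitted model. `TRAINING_HYPERPARAMETERS` and `build_untrained_model` are hypothetical names introduced here for illustration. This card does not list the embedding model or the UMAP and HDBSCAN settings, so refitting with these values alone is not guaranteed to reproduce the published topics.

```python
# Values copied from the "Training hyperparameters" list in this card.
TRAINING_HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_untrained_model():
    """Hypothetical helper: create an unfitted BERTopic model with the settings above."""
    # Imported lazily so the dictionary can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**TRAINING_HYPERPARAMETERS)
```

Calling `build_untrained_model().fit_transform(docs)` on a list of document strings would then train a new model with these settings.
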