WGNEWS_APR20

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

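from bertopic import BERTopic

# Download the saved model from the Hugging Face Hub and load it
topic_model = BERTopic.load("tyrealqian/WGNEWS_APR20")

# Return a table of all topics with their sizes and labels
topic_model.get_topic_info()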
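
Once the model is loaded, a common next step is to look at the largest topics while ignoring the outlier topic, which BERTopic numbers -1. The following is a minimal sketch, not part of the original card. It assumes the rows come from `topic_model.get_topic_info().to_dict("records")`, which yields dictionaries with `Topic`, `Count`, and `Name` keys; the helper name `largest_topics` is hypothetical, and the sample rows are copied from the topic overview so the snippet runs without downloading the model.

```python
def largest_topics(records, n=5):
    """Return the n largest topics, skipping the outlier topic (-1)."""
    real_topics = [row for row in records if row["Topic"] != -1]
    return sorted(real_topics, key=lambda row: row["Count"], reverse=True)[:n]


# Stand-in rows copied from the topic overview in this card. With the loaded
# model you would use: records = topic_model.get_topic_info().to_dict("records")
records = [
    {"Topic": -1, "Count": 20, "Name": "-1_media_athletes_ceremony_opening"},
    {"Topic": 0, "Count": 2071, "Name": "0_covid_cases_pandemic_omicron"},
    {"Topic": 1, "Count": 438, "Name": "1_teamcanada_curling_canada_hockey"},
    {"Topic": 2, "Count": 268, "Name": "2_kamila_valieva_kamila valieva_russian figure"},
]

for row in largest_topics(records, n=2):
    print(row["Topic"], row["Count"], row["Name"])
```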

Topic overview

  • Number of topics: 57
  • Number of training documents: 6094
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | media - athletes - ceremony - opening - opening ceremony | 20 | -1_media_athletes_ceremony_opening
0 | covid - cases - pandemic - omicron - covid cases | 2071 | 0_covid_cases_pandemic_omicron
1 | teamcanada - curling - canada - hockey - canadas | 438 | 1_teamcanada_curling_canada_hockey
2 | kamila - valieva - kamila valieva - russian figure - skater kamila | 268 | 2_kamila_valieva_kamila valieva_russian figure
3 | winter sports - sports - ice snow - globalink - passion | 199 | 3_winter sports_sports_ice snow_globalink
4 | spokesperson - globalink - ambassador - turkish - said globalink | 197 | 4_spokesperson_globalink_ambassador_turkish
5 | dwen - mascot - bing dwen - dwen dwen - bing | 182 | 5_dwen_mascot_bing dwen_dwen dwen
6 | speed - short track - speed skating - track speed - track | 173 | 6_speed_short track_speed skating_track speed
7 | gu - eileen - womens freeski - eileen gu - gu ailing | 145 | 7_gu_eileen_womens freeski_eileen gu
8 | torch - flame - torch relay - relay - olympic flame | 139 | 8_torch_flame_torch relay_relay
9 | australia - diplomatic boycott - diplomatic - boycott - join diplomatic | 127 | 9_australia_diplomatic boycott_diplomatic_boycott
10 | traditional - chinese new - new year - papercutting - culture | 107 | 10_traditional_chinese new_new year_papercutting
11 | carbon - green - pollution - air quality - climate | 105 | 11_carbon_green_pollution_air quality
12 | biden - diplomatic boycott - diplomatic - white house - joe | 79 | 12_biden_diplomatic boycott_diplomatic_white house
13 | oval - skating oval - national speed - venue - ice | 73 | 13_oval_skating oval_national speed_venue
14 | mikaela - shiffrin - mikaela shiffrin - slalom - race | 71 | 14_mikaela_shiffrin_mikaela shiffrin_slalom
15 | socalled - boycott beijing - boycott - politicians - opposes | 69 | 15_socalled_boycott beijing_boycott_politicians
16 | robot - robots - food - serving - served | 68 | 16_robot_robots_food_serving
17 | putin - vladimir - vladimir putin - russian president - president vladimir | 66 | 17_putin_vladimir_vladimir putin_russian president
18 | kiara reddingius - kiara - reddingius - australian - australias | 65 | 18_kiara reddingius_kiara_reddingius_australian
19 | xi jinping - jinping - president xi - xi - chinese president | 65 | 19_xi jinping_jinping_president xi_xi
20 | human rights - rights - human - chinas human - rights watch | 61 | 20_human rights_rights_human_chinas human
21 | yanqing - paralympic - villages - winter paralympic - completed | 60 | 21_yanqing_paralympic_villages_winter paralympic
22 | nest - birds nest - birds - rehearsal - fireworks | 59 | 22_nest_birds nest_birds_rehearsal
23 | sadowskisynnott - new zealands - zealands - womens slopestyle - jakara | 59 | 23_sadowskisynnott_new zealands_zealands_womens slopestyle
24 | shaun - shaun white - white - halfpipe - hirano | 58 | 24_shaun_shaun white_white_halfpipe
25 | su - yiming - su yiming - mens snowboard - snowboarder su | 57 | 25_su_yiming_su yiming_mens snowboard
26 | heres watch - opening ceremony - opening - cbc - watch opening | 51 | 26_heres watch_opening ceremony_opening_cbc
27 | nhl - players - nhl players - send players - olympics nhl | 49 | 27_nhl_players_nhl players_send players
28 | medals grabs - gold delegations - heres breakdown - breakdown - count stands | 47 | 28_medals grabs_gold delegations_heres breakdown_breakdown
29 | president thomas - bach - thomas bach - thomas - ioc president | 47 | 29_president thomas_bach_thomas bach_thomas
30 | nathan chen - chen - nathan - skater nathan - elton | 45 | 30_nathan chen_chen_nathan_skater nathan
31 | snowfall - heavy - heavy snowfall - weather - snow | 43 | 31_snowfall_heavy_heavy snowfall_weather
32 | jamaicas - jamaica - benjamin alexander - benjamin - alexander | 43 | 32_jamaicas_jamaica_benjamin alexander_benjamin
33 | countdown - countdown beijing - day countdown - days - days beijing | 42 | 33_countdown_countdown beijing_day countdown_days
34 | reuterspictures - pictures - day pictures - pictures beijing - olympics day | 40 | 34_reuterspictures_pictures_day pictures_pictures beijing
35 | bank - commemorative - yuan - digital - coins | 40 | 35_bank_commemorative_yuan_digital
36 | yuzuruhanyu - hanyu - yuzuru - japanese - yuzuru hanyu | 39 | 36_yuzuruhanyu_hanyu_yuzuru_japanese
37 | xi - president xi - xi jinping - jinping - chinese president | 37 | 37_xi_president xi_xi jinping_jinping
38 | aerials - xu mengtao - mengtao - xu - womens aerials | 34 | 38_aerials_xu mengtao_mengtao_xu
39 | tickets - sell - sell tickets - sold - spectators | 33 | 39_tickets_sell_sell tickets_sold
40 | technologies - aerospace - technology - technologies used - scitech | 33 | 40_technologies_aerospace_technology_technologies used
41 | sui - cong - wenjing - sui wenjing - han cong | 32 | 41_sui_cong_wenjing_sui wenjing
42 | erin - jackson - erin jackson - woman - black | 32 | 42_erin_jackson_erin jackson_woman
43 | burner - personal - phones - apps smartphonelike - smartphonelike device | 31 | 43_burner_personal_phones_apps smartphonelike
44 | leduc - nonbinary - openly - timothy leduc - timothy | 30 | 44_leduc_nonbinary_openly_timothy leduc
45 | summer winter - host summer - city host - summer - city | 27 | 45_summer winter_host summer_city host_summer
46 | ukraine - invasion - ukraines - ukraine beijing - invasion ukraine | 26 | 46_ukraine_invasion_ukraines_ukraine beijing
47 | ralph lauren - lauren - uniforms - ralph - lauren unveiled | 26 | 47_ralph lauren_lauren_uniforms_ralph
48 | argentine - fernandez - president alberto - argentine president - alberto fernandez | 25 | 48_argentine_fernandez_president alberto_argentine president
49 | truce - olympic truce - antonio guterres - secretarygeneral antonio - guterres | 25 | 49_truce_olympic truce_antonio guterres_secretarygeneral antonio
50 | peng - shuai - peng shuai - tennis - chinese tennis | 25 | 50_peng_shuai_peng shuai_tennis
51 | zhangjiakou - chongli district - chongli - hebei - cohost | 23 | 51_zhangjiakou_chongli district_chongli_hebei
52 | athletes watch - watch beijing - names - biggest names - look biggest | 23 | 52_athletes watch_watch beijing_names_biggest names
53 | railway - highspeed - train - highspeed railway - beijingzhangjiakou highspeed | 23 | 53_railway_highspeed_train_highspeed railway
54 | peel - laura peel - laura - kerry - aerial | 22 | 54_peel_laura peel_laura_kerry
55 | shougang - shougang park - industrial - park - big air | 20 | 55_shougang_shougang park_industrial_park
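
Judging from the rows above, each label appears to be the topic ID followed by the topic's first four keywords, joined by underscores. The sketch below is an illustration rather than part of the original card. It splits a label back into its parts and computes a topic's share of the 6094 training documents; the helper names `parse_label` and `topic_share` are hypothetical. Keywords may contain spaces but not underscores, so splitting on `_` is safe for the labels listed here.

```python
TRAINING_DOCS = 6094  # "Number of training documents" from this card


def parse_label(label):
    """Split a label such as '0_covid_cases_pandemic_omicron' into (id, keywords)."""
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


def topic_share(frequency, total=TRAINING_DOCS):
    """Return the fraction of training documents assigned to a topic."""
    return frequency / total


topic_id, keywords = parse_label("2_kamila_valieva_kamila valieva_russian figure")
print(topic_id, keywords)
print(f"Topic 0 share of documents: {topic_share(2071):.1%}")
```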

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
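
For readers who want to start a new model from the same settings, the hyperparameters listed above map onto keyword arguments of the `BERTopic` constructor. This is a rough sketch and not the author's training script. The card does not name the embedding model or the UMAP, HDBSCAN, and vectorizer settings, so this alone would not reproduce the published topics. It also assumes an installed BERTopic version that accepts the `zeroshot_*` arguments. The function name `build_untrained_model` is hypothetical.

```python
# Values copied from the "Training hyperparameters" list in this card.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model():
    """Create a fresh, unfitted BERTopic instance with the settings above."""
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)
```

Calling `build_untrained_model().fit_transform(docs)` on your own list of documents would then train a new model; the resulting topics will differ from the ones in this card.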

Framework versions

  • Numpy: 2.0.2
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.2
  • Scikit-Learn: 1.6.1
  • Sentence-transformers: 3.4.1
  • Transformers: 4.51.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.11.12
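
Loading a saved BERTopic model can be sensitive to library versions, so it may help to compare your environment against the list above before loading. The sketch below is a suggestion, not part of the original card. The PyPI distribution names (for example `umap-learn` for UMAP) are my mapping from the display names and are not stated in the card, and the helper names are hypothetical. The Python version itself is not checked.

```python
from importlib import metadata

# Versions from the "Framework versions" list. The distribution names on the
# left are an assumed mapping to PyPI package names.
EXPECTED = {
    "numpy": "2.0.2",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.2",
    "scikit-learn": "1.6.1",
    "sentence-transformers": "3.4.1",
    "transformers": "4.51.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{package}: card lists {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean loading will fail; it only flags where your environment differs from the one recorded here.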