BERTopic_andattakstruk_2

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("GiganticLemon/BERTopic_andattakstruk_2")

topic_model.get_topic_info()
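
Beyond listing all topics, a loaded model can show the keywords of a single topic and assign topics to unseen text. The following is a minimal sketch, assuming the repository is reachable from your machine and that an embedding model was saved with the topic model (if it was not, `BERTopic.load` may need an `embedding_model` argument). The helper names `load_model`, `describe_topic`, and `assign_topics` and the sample sentence are illustrative and not part of this model card.

```python
MODEL_ID = "GiganticLemon/BERTopic_andattakstruk_2"


def load_model(model_id=MODEL_ID):
    """Load the topic model from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the helpers below can be used without bertopic installed.
    from bertopic import BERTopic
    return BERTopic.load(model_id)


def describe_topic(topic_model, topic_id):
    """Return the (keyword, weight) pairs that BERTopic stores for one topic."""
    return topic_model.get_topic(topic_id)


def assign_topics(topic_model, docs):
    """Predict a topic ID for each document; -1 marks an outlier."""
    topics, _probabilities = topic_model.transform(docs)
    return list(topics)


# Example usage (commented out because it downloads the model):
# model = load_model()
# print(describe_topic(model, 13))
# print(assign_topics(model, ["Asterix and Obelix travel to Rome."]))
```

Since `calculate_probabilities` is False for this model, `transform` is not expected to return a full document-by-topic probability matrix, which is why the sketch keeps only the predicted topic IDs.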

Topic overview

  • Number of topics: 62
  • Number of training documents: 16559
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | the - and - to - of - in | 21 | -1_the_and_to_of
0 | the - to - of - and - is | 8983 | 0_the_to_of_and
1 | the - to - that - he - and | 1232 | 1_the_to_that_he
2 | her - she - to - and - is | 605 | 2_her_she_to_and
3 | and - the - of - to - in | 506 | 3_and_the_of_to
4 | the - of - earth - to - and | 473 | 4_the_of_earth_to
5 | the - and - to - he - his | 459 | 5_the_and_to_he
6 | the - and - to - of - ship | 416 | 6_the_and_to_of
7 | the - to - of - and - his | 370 | 7_the_to_of_and
8 | de - his - he - to - the | 306 | 8_de_his_he_to
9 | her - she - to - and - is | 192 | 9_her_she_to_and
10 | chinese - the - and - of - to | 160 | 10_chinese_the_and_of
11 | the - president - soviet - of - us | 150 | 11_the_president_soviet_of
12 | russian - the - his - to - of | 145 | 12_russian_the_his_to
13 | asterix - roman - obelix - the - rome | 141 | 13_asterix_roman_obelix_the
14 | doctor - tardis - the - ace - to | 140 | 14_doctor_tardis_the_ace
15 | of - that - the - in - or | 138 | 15_of_that_the_in
16 | socrates - theseus - the - of - and | 130 | 16_socrates_theseus_the_of
17 | vampire - vampires - darren - sookie - to | 111 | 17_vampire_vampires_darren_sookie
18 | kirk - enterprise - spock - federation - klingon | 111 | 18_kirk_enterprise_spock_federation
19 | reacher - hardy - frank - boys - hardys | 101 | 19_reacher_hardy_frank_boys
20 | cadfael - his - the - to - of | 99 | 20_cadfael_his_the_to
21 | jedi - vong - luke - leia - han | 87 | 21_jedi_vong_luke_leia
22 | german - szpilman - hitler - was - the | 78 | 22_german_szpilman_hitler_was
23 | jesus - judah - god - of - the | 78 | 23_jesus_judah_god_of
24 | animorphs - jake - visser - ax - cassie | 67 | 24_animorphs_jake_visser_ax
25 | spirou - fantasio - champignac - count - marsupilami | 66 | 25_spirou_fantasio_champignac_count
26 | henson - white - black - the - slaves | 57 | 26_henson_white_black_the
27 | novel - of - his - in - book | 56 | 27_novel_of_his_in
28 | dawkins - of - that - science - religion | 55 | 28_dawkins_of_that_science
29 | obiwan - jedi - quigon - kenobi - anakin | 52 | 29_obiwan_jedi_quigon_kenobi
30 | cats - clan - thunderclan - kits - firestar | 48 | 30_cats_clan_thunderclan_kits
31 | redwall - abbey - the - and - vermin | 48 | 31_redwall_abbey_the_and
32 | virus - the - to - is - of | 47 | 32_virus_the_to_is
33 | buffy - sunnydale - willow - slayer - giles | 46 | 33_buffy_sunnydale_willow_slayer
34 | time - machine - traveller - in - the | 44 | 34_time_machine_traveller_in
35 | confederate - lee - scarlett - rhett - the | 38 | 35_confederate_lee_scarlett_rhett
36 | bond - bonds - to - leiter - by | 37 | 36_bond_bonds_to_leiter
37 | baseball - hobbs - game - team - belichick | 37 | 37_baseball_hobbs_game_team
38 | sharpe - scene - french - sharpes - harper | 36 | 38_sharpe_scene_french_sharpes
39 | nancy - bess - nancys - george - mystery | 33 | 39_nancy_bess_nancys_george
40 | women - of - ellador - men - in | 33 | 40_women_of_ellador_men
41 | manticore - sten - haven - fleet - honor | 32 | 41_manticore_sten_haven_fleet
42 | billy - john - horse - ranch - harold | 31 | 42_billy_john_horse_ranch
43 | global - warming - climate - energy - carbon | 30 | 43_global_warming_climate_energy
44 | christmas - claus - santa - roger - mimi | 30 | 44_christmas_claus_santa_roger
45 | holmes - sherlock - watson - douglas - that | 29 | 45_holmes_sherlock_watson_douglas
46 | tarzan - ape - lion - tarzans - opar | 28 | 46_tarzan_ape_lion_tarzans
47 | conan - conans - dake - aquilonia - raseri | 28 | 47_conan_conans_dake_aquilonia
48 | angel - angels - quillon - archangel - alleluia | 27 | 48_angel_angels_quillon_archangel
49 | lone - wolf - kai - magnamund - darklords | 27 | 49_lone_wolf_kai_magnamund
50 | helm - matt - helms - mac - agency | 27 | 50_helm_matt_helms_mac
51 | dorothy - oz - elphaba - wizard - ozma | 27 | 51_dorothy_oz_elphaba_wizard
52 | max - fang - flock - roland - victor | 26 | 52_max_fang_flock_roland
53 | tom - swift - mr - airship - toms | 25 | 53_tom_swift_mr_airship
54 | tintin - haddock - calculus - snowy - the | 25 | 54_tintin_haddock_calculus_snowy
55 | robot - robots - derec - ariel - city | 23 | 55_robot_robots_derec_ariel
56 | bertie - jeeves - emsworth - gally - freddie | 23 | 56_bertie_jeeves_emsworth_gally
57 | alex - sarov - alexs - mi6 - to | 23 | 57_alex_sarov_alexs_mi6
58 | carson - rayford - tribulation - carpathia - buck | 22 | 58_carson_rayford_tribulation_carpathia
59 | dresden - harry - thomas - murphy - dresdens | 22 | 59_dresden_harry_thomas_murphy
60 | brigitta - major - life - novel - of | 22 | 60_brigitta_major_life_novel
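
In the rows above, each label appears to be the topic ID followed by the first four keywords, joined with underscores. A minimal sketch of that pattern, inferred from the table rather than taken from BERTopic's source, might look like this (`make_label` and its `n_words` parameter are hypothetical names):

```python
def make_label(topic_id, keywords, n_words=4):
    """Join the topic ID and its first n_words keywords with underscores.

    This mirrors the pattern visible in the table; it is an inference from
    the listed rows, not BERTopic's own implementation.
    """
    return "_".join([str(topic_id), *keywords[:n_words]])


# Rows copied from the topic overview.
outlier_label = make_label(-1, ["the", "and", "to", "of", "in"])
trek_label = make_label(18, ["kirk", "enterprise", "spock", "federation", "klingon"])
```

Topic -1 is the label BERTopic gives to outlier documents that were not assigned to any cluster.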

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
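
The values listed above map directly onto keyword arguments of the `BERTopic` constructor. As a rough sketch, a comparably configured (untrained) model could be created as follows. This card does not state the embedding model, the UMAP and HDBSCAN settings, or the training corpus, so this would not reproduce the published topics; `HYPERPARAMS` and `build_model` are illustrative names.

```python
# Hyperparameters copied from the list in this model card.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(params=HYPERPARAMS):
    """Create an unfitted BERTopic instance with the listed hyperparameters."""
    # Imported lazily so the dictionary can be inspected without bertopic installed.
    from bertopic import BERTopic
    return BERTopic(**params)


# Example usage, assuming `docs` is your own list of strings:
# topic_model = build_model()
# topics, _ = topic_model.fit_transform(docs)
```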

Framework versions

  • Numpy: 2.0.2
  • HDBSCAN: 0.8.40
  • UMAP: 0.5.7
  • Pandas: 2.2.2
  • Scikit-Learn: 1.6.1
  • Sentence-transformers: 3.4.1
  • Transformers: 4.51.3
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.11.12