us-only-mar11

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Thang203/us-only-mar11")

topic_model.get_topic_info()
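Beyond inspecting the topic table, a loaded model can assign topics to unseen documents. The following is a minimal sketch, assuming the saved model can resolve its embedding model at load time (if it cannot, `BERTopic.load` accepts an `embedding_model` argument). The helper name `assign_topics` and the example document are illustrative and not part of the original card.

```python
from typing import List, Sequence, Tuple


def assign_topics(topic_model, docs: Sequence[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by a fitted BERTopic model.

    `assign_topics` is a hypothetical helper; it only relies on
    `topic_model.transform(docs)` returning `(topics, probabilities)`.
    """
    topics, _ = topic_model.transform(list(docs))
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


# Example usage (downloads the model from the Hugging Face Hub):
#
#   from bertopic import BERTopic
#   topic_model = BERTopic.load("Thang203/us-only-mar11")
#   docs = ["Large language models for automated program repair."]  # illustrative
#   for doc, topic_id in assign_topics(topic_model, docs):
#       print(topic_id, topic_model.get_topic(topic_id)[:5])
```

A predicted ID of -1 means the document was treated as an outlier rather than assigned to one of the named topics.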

Topic overview

  • Number of topics: 20 (including the outlier topic -1)
  • Number of training documents: 1908
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | models - language - model - language models - llms | 10 | -1_models_language_model_language models |
| 0 | models - language - reasoning - language models - large | 616 | 0_models_language_reasoning_language models |
| 1 | code - llms - language - models - programming | 467 | 1_code_llms_language_models |
| 2 | learning - reinforcement - reinforcement learning - planning - rl | 139 | 2_learning_reinforcement_reinforcement learning_planning |
| 3 | clinical - medical - models - language - data | 92 | 3_clinical_medical_models_language |
| 4 | language - models - language models - llms - scaling | 86 | 4_language_models_language models_llms |
| 5 | summarization - event - generation - events - text | 75 | 5_summarization_event_generation_events |
| 6 | dialogue - dialog - systems - conversational - conversations | 59 | 6_dialogue_dialog_systems_conversational |
| 7 | text - adversarial - attacks - detection - models | 58 | 7_text_adversarial_attacks_detection |
| 8 | bias - biases - social - gender - models | 52 | 8_bias_biases_social_gender |
| 9 | ai - chatgpt - ethical - artificial intelligence - intelligence | 49 | 9_ai_chatgpt_ethical_artificial intelligence |
| 10 | education - students - programming - educational - questions | 49 | 10_education_students_programming_educational |
| 11 | privacy - private - federated - attacks - models | 37 | 11_privacy_private_federated_attacks |
| 12 | speech - audio - asr - speech recognition - recognition | 21 | 12_speech_audio_asr_speech recognition |
| 13 | materials - chemistry - chemical - molecular - model | 20 | 13_materials_chemistry_chemical_molecular |
| 14 | recommendation - user - item - reviews - news | 20 | 14_recommendation_user_item_reviews |
| 15 | financial - sentiment - stock - data - market | 17 | 15_financial_sentiment_stock_data |
| 16 | game - games - state - generate - state information | 15 | 16_game_games_state_generate |
| 17 | legal - law - argumentative - court - standards | 14 | 17_legal_law_argumentative_court |
| 18 | metadata - language - keyphrase - large - user intents | 12 | 18_metadata_language_keyphrase_large |
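The frequencies in the topic overview can be checked against the reported number of training documents, and they show how concentrated the corpus is. The sketch below copies the counts from the overview by hand; the names `topic_frequency` and `topic_share` are illustrative.

```python
# Topic frequencies copied from the topic overview (topic ID -> document count).
topic_frequency = {
    -1: 10, 0: 616, 1: 467, 2: 139, 3: 92, 4: 86, 5: 75, 6: 59, 7: 58,
    8: 52, 9: 49, 10: 49, 11: 37, 12: 21, 13: 20, 14: 20, 15: 17, 16: 15,
    17: 14, 18: 12,
}

total_docs = sum(topic_frequency.values())


def topic_share(topic_id: int) -> float:
    """Fraction of all training documents assigned to the given topic."""
    return topic_frequency[topic_id] / total_docs


top_two = topic_share(0) + topic_share(1)

print(total_docs)                # 1908, matching the reported training set size
print(f"{top_two:.1%}")          # 56.8% of documents fall in topics 0 and 1
print(f"{topic_share(-1):.1%}")  # 0.5% of documents are outliers
```

More than half of the documents fall in the two largest topics, so the long tail of small topics (12 to 18) rests on roughly 20 documents or fewer each.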

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 20
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
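For reference, the settings above map directly onto keyword arguments of the `BERTopic` constructor. This is a sketch rather than the author's training script: it assumes a BERTopic release recent enough to accept the `zeroshot_*` arguments, and because the card does not list the embedding, UMAP, or HDBSCAN sub-models, library defaults are used and the result is not guaranteed to reproduce the published topics. The helper name `build_topic_model` is hypothetical.

```python
# Hyperparameters exactly as listed on this card.
hyperparameters = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model():
    """Create an unfitted BERTopic instance with the listed settings.

    Hypothetical helper: sub-models (embeddings, UMAP, HDBSCAN) fall back
    to library defaults because the card does not specify them.
    """
    # Imported lazily so the dictionary above is usable without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**hyperparameters)


# Example usage (requires `pip install -U bertopic` and your own documents):
#
#   topic_model = build_topic_model()
#   topics, probs = topic_model.fit_transform(docs)
```

With `nr_topics: 20`, BERTopic reduces the initially discovered clusters down to 20 topics after fitting, which matches the topic count reported in the overview.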

Framework versions

  • Numpy: 1.25.2
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.5.1
  • Transformers: 4.38.2
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
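Loading a saved topic model under different library versions can fail or behave differently, so it can help to compare the local environment with the versions listed above. The check below is a minimal sketch using the standard library; the mapping from the card's labels to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption, and `find_version_mismatches` is an illustrative name.

```python
from importlib import metadata

# Versions listed on this card, keyed by (assumed) PyPI distribution name.
pinned_versions = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.5.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def find_version_mismatches(pinned):
    """Return {name: (wanted, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_version_mismatches(pinned_versions).items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch is not necessarily a problem; the report is only a starting point if loading or inference misbehaves.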