us-only-mar11
This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework for generating easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("Thang203/us-only-mar11")
topic_model.get_topic_info()
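Beyond listing topic info, a loaded model can be inspected per topic and applied to new text. The following is a minimal sketch, not part of the original card: the helper names `top_keywords` and `assign_topics` are hypothetical, and `assign_topics` assumes the loaded model has access to an embedding model so that `transform` can embed new documents.

```python
def top_keywords(topic_model, n_words=5):
    """Map each topic ID to its n_words highest-scoring keywords.

    Relies on BERTopic.get_topics(), which returns
    {topic_id: [(word, score), ...]} sorted by descending score.
    """
    return {
        topic_id: [word for word, _score in words[:n_words]]
        for topic_id, words in topic_model.get_topics().items()
    }


def assign_topics(topic_model, docs):
    """Pair each new document with the topic ID predicted for it.

    Relies on BERTopic.transform(), which returns (topics, probabilities).
    A topic ID of -1 marks a document treated as an outlier.
    """
    topics, _probs = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


# Example usage (requires `bertopic` and network access to the Hub):
# from bertopic import BERTopic
# topic_model = BERTopic.load("Thang203/us-only-mar11")
# print(top_keywords(topic_model)[1])
# print(assign_topics(topic_model, ["Large language models for code repair"]))
```

Both helpers take the model as an argument, so they work the same way with any fitted or loaded BERTopic instance.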
Topic overview
- Number of topics: 20
- Number of training documents: 1908
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | models - language - model - language models - llms | 10 | -1_models_language_model_language models |
0 | models - language - reasoning - language models - large | 616 | 0_models_language_reasoning_language models |
1 | code - llms - language - models - programming | 467 | 1_code_llms_language_models |
2 | learning - reinforcement - reinforcement learning - planning - rl | 139 | 2_learning_reinforcement_reinforcement learning_planning |
3 | clinical - medical - models - language - data | 92 | 3_clinical_medical_models_language |
4 | language - models - language models - llms - scaling | 86 | 4_language_models_language models_llms |
5 | summarization - event - generation - events - text | 75 | 5_summarization_event_generation_events |
6 | dialogue - dialog - systems - conversational - conversations | 59 | 6_dialogue_dialog_systems_conversational |
7 | text - adversarial - attacks - detection - models | 58 | 7_text_adversarial_attacks_detection |
8 | bias - biases - social - gender - models | 52 | 8_bias_biases_social_gender |
9 | ai - chatgpt - ethical - artificial intelligence - intelligence | 49 | 9_ai_chatgpt_ethical_artificial intelligence |
10 | education - students - programming - educational - questions | 49 | 10_education_students_programming_educational |
11 | privacy - private - federated - attacks - models | 37 | 11_privacy_private_federated_attacks |
12 | speech - audio - asr - speech recognition - recognition | 21 | 12_speech_audio_asr_speech recognition |
13 | materials - chemistry - chemical - molecular - model | 20 | 13_materials_chemistry_chemical_molecular |
14 | recommendation - user - item - reviews - news | 20 | 14_recommendation_user_item_reviews |
15 | financial - sentiment - stock - data - market | 17 | 15_financial_sentiment_stock_data |
16 | game - games - state - generate - state information | 15 | 16_game_games_state_generate |
17 | legal - law - argumentative - court - standards | 14 | 17_legal_law_argumentative_court |
18 | metadata - language - keyphrase - large - user intents | 12 | 18_metadata_language_keyphrase_large |
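To put the frequencies in the table above in proportion, the short sketch below recomputes each topic's share of the training documents. The counts are copied from the table; the names `TOPIC_COUNTS` and `topic_share` are illustrative only.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
TOPIC_COUNTS = {
    -1: 10, 0: 616, 1: 467, 2: 139, 3: 92, 4: 86, 5: 75, 6: 59, 7: 58,
    8: 52, 9: 49, 10: 49, 11: 37, 12: 21, 13: 20, 14: 20, 15: 17, 16: 15,
    17: 14, 18: 12,
}

TOTAL_DOCS = sum(TOPIC_COUNTS.values())  # matches the 1908 training documents


def topic_share(topic_id):
    """Fraction of training documents assigned to the given topic."""
    return TOPIC_COUNTS[topic_id] / TOTAL_DOCS


# Topics 0 and 1 together cover a bit more than half of the corpus.
top_two_share = topic_share(0) + topic_share(1)
print(f"Topics 0 and 1: {top_two_share:.1%} of {TOTAL_DOCS} documents")
print(f"Outliers (-1): {topic_share(-1):.1%}")
```

The counts sum to the reported number of training documents, and the 20 rows (including the outlier topic -1) match the reported number of topics.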
Training hyperparameters
- calculate_probabilities: False
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: 20
- seed_topic_list: None
- top_n_words: 10
- verbose: True
- zeroshot_min_similarity: 0.7
- zeroshot_topic_list: None
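The settings above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch of how a model with the same configuration might be instantiated, consider the snippet below. The names `HYPERPARAMS` and `build_model` are hypothetical, the training documents are not distributed with this card, and the embedding, UMAP, and HDBSCAN sub-models are left at library defaults, which is an assumption, so refitting is not expected to reproduce these exact topics.

```python
# Values copied from the "Training hyperparameters" list above.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model():
    """Create an unfitted BERTopic instance with the listed settings."""
    # Imported lazily so this module loads even without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)


# Example usage with your own corpus (a list of strings):
# topic_model = build_model()
# topics, probs = topic_model.fit_transform(docs)
```

Note that `nr_topics: 20` asks BERTopic to reduce the discovered topics down to 20 after clustering, which is consistent with the 20 topics reported in the overview.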
Framework versions
- Numpy: 1.25.2
- HDBSCAN: 0.8.33
- UMAP: 0.5.5
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.5.1
- Transformers: 4.38.2
- Numba: 0.58.1
- Plotly: 5.15.0
- Python: 3.10.12
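Loading a saved model can be sensitive to library versions, so it may help to compare the local environment with the versions listed above. The sketch below uses only the standard library. The distribution names in `PINNED` (for example `umap-learn` for UMAP) are the usual PyPI names and are an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Framework versions" list above,
# keyed by the assumed PyPI distribution name.
PINNED = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.5.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def version_mismatches(pinned=PINNED):
    """Return {package: (expected, installed)} for every differing package.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


for name, (expected, installed) in version_mismatches().items():
    print(f"{name}: card lists {expected}, environment has {installed}")
```

A mismatch does not necessarily mean the model will fail to load; the check only highlights where the environment differs from the one used for training.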