Swahili Hate Speech Classification Model

This is a fine-tuned BERT model for multi-class text classification in Swahili. It predicts whether a given text is:

  • Non-hate speech
  • Political hate speech
  • Offensive language

🧠 Model Details

  • Architecture: BERT (base)
  • Languages: Swahili
  • Classes: 3
  • Model size: 178M parameters
  • Framework: PyTorch
  • Training data: A custom labeled dataset of Swahili social media posts and online comments (not publicly available)

🏷️ Labels

Label ID   Class Name
LABEL_0    Non-hate speech
LABEL_1    Political hate speech
LABEL_2    Offensive language

🚀 Usage

You can load and test the model using the transformers library:

from transformers import pipeline

classifier = pipeline("text-classification", model="sandbox338/hatespeech")

result = classifier("Hii ni ujumbe wa kawaida bila matusi.")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': 0.98}]
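
The pipeline returns generic label IDs (LABEL_0, LABEL_1, LABEL_2) rather than class names. A minimal sketch of translating them with the mapping from the Labels section follows. The names LABEL_NAMES and to_readable are hypothetical helpers introduced here for illustration, not part of the model or the transformers library, and the sample prediction is hand-written in the same shape as the pipeline output rather than produced by the model.

```python
# Mapping copied from the Labels section of this model card.
LABEL_NAMES = {
    "LABEL_0": "Non-hate speech",
    "LABEL_1": "Political hate speech",
    "LABEL_2": "Offensive language",
}


def to_readable(predictions):
    """Return copies of pipeline predictions with a 'class_name' key added.

    Unknown label IDs fall back to the raw label string.
    """
    return [
        {**p, "class_name": LABEL_NAMES.get(p["label"], p["label"])}
        for p in predictions
    ]


# Hand-written sample in the same shape as the pipeline output (not a real model result).
sample = [{"label": "LABEL_0", "score": 0.98}]
print(to_readable(sample))
```

In practice, the list returned by classifier(...) can be passed to to_readable directly in place of the hand-written sample.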