BERT-Based Emotion Detection on ma2za/many_emotions

This repository hosts a fine-tuned emotion detection model built on BERT-base-cased. The model is trained on the ma2za/many_emotions dataset to classify text into one of seven emotion categories: anger, fear, joy, love, sadness, surprise, and neutral. The model is available in both PyTorch and ONNX formats for efficient deployment.

Model Details

Model Description

Developed by: iimran
Model Type: Sequence Classification (Emotion Detection)
Base Model: bert-base-cased
Dataset: ma2za/many_emotions
Export Format: ONNX (for deployment)
License: Apache-2.0
Tags: onnx, emotion-detection, BERT, sequence-classification

This model was fine-tuned on the ma2za/many_emotions dataset to assign each input text a single emotion label. A subset of the training data was used for quick experiments; the published model was trained on the complete dataset.

Training Details

Dataset Details

Dataset ID: ma2za/many_emotions
Text Column: text
Label Column: label

Training Hyperparameters

Epochs: 1 (for quick test; adjust to your needs)
Per Device Batch Size: 96
Learning Rate: 1e-5
Weight Decay: 0.01
Optimizer: AdamW
Training Duration: The full training run on the complete dataset (approximately 2.44 million training examples) was completed in about 3 hours and 40 minutes.
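
As a rough sanity check on these figures, the sketch below derives the number of optimizer steps and the average throughput from the values listed above. It assumes a single device with no gradient accumulation (so the effective batch size equals the per-device batch size of 96) and treats 2.44 million as an exact count; neither assumption comes from the training logs.

```python
import math

# Values taken from the hyperparameter list; the exact example count,
# the single-device setup, and the absence of gradient accumulation
# are assumptions made for this estimate.
NUM_EXAMPLES = 2_440_000
BATCH_SIZE = 96
EPOCHS = 1
DURATION_SECONDS = 3 * 3600 + 40 * 60  # about 3 hours 40 minutes

# The last batch may be partial, so round up.
steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
examples_per_second = NUM_EXAMPLES * EPOCHS / DURATION_SECONDS

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
print(f"examples per second: {examples_per_second:.0f}")
```

Under these assumptions the run works out to 25417 steps and roughly 185 examples per second, which is useful when estimating how long additional epochs would take on similar hardware.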

ONNX Export

The model was exported to the ONNX format using opset version 14, which the PyTorch ONNX exporter needs in order to convert operations such as scaled_dot_product_attention. The exported model can be deployed on different platforms with ONNX Runtime.

How to Load the Model

Instead of loading the model from a local directory, you can load it directly from the Hugging Face Hub using the repository name iimran/EmotionDetection.

Loading with ONNX Runtime

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer, AutoConfig
from huggingface_hub import hf_hub_download

# Specify the repository details.
repo_id = "iimran/EmotionDetection"
filename = "model.onnx"

# Download the ONNX model file from the Hub.
onnx_model_path = hf_hub_download(repo_id=repo_id, filename=filename)
print("Model downloaded to:", onnx_model_path)

# Load the tokenizer and configuration from the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
config = AutoConfig.from_pretrained(repo_id)

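# Use the config's id2label mapping unless it only holds the generic
# "LABEL_n" placeholders that Transformers fills in by default.
config_labels = getattr(config, "id2label", None) or {}
if config_labels and not all(str(v).startswith("LABEL_") for v in config_labels.values()):
    id2label = config_labels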
else:
    # Default mapping for ma2za/many_emotions if not present in the config.
    id2label = {
        0: "anger",
        1: "fear",
        2: "joy",
        3: "love",
        4: "sadness",
        5: "surprise",
        6: "neutral"
    }
print("id2label mapping:", id2label)

# Create an ONNX Runtime inference session using the local model file.
session = ort.InferenceSession(onnx_model_path)

def onnx_infer(text):
    """
    Perform inference on the input text using the exported ONNX model.
    Returns the predicted emotion label.
    """
    # Tokenize the input text with a fixed maximum sequence length matching the model export.
    inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        padding="max_length",
        max_length=256
    )
    
    # Prepare the model inputs.
    ort_inputs = {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"]
    }
    
    # Run the model.
    outputs = session.run(None, ort_inputs)
    logits = outputs[0]
    
    # Get the predicted class id.
    predicted_class_id = int(np.argmax(logits, axis=-1)[0])
    
    # Map the predicted class id to its emotion label.
    predicted_label = id2label.get(str(predicted_class_id), id2label.get(predicted_class_id, str(predicted_class_id)))
    
    print("Predicted Emotion ID:", predicted_class_id)
    print("Predicted Emotion:", predicted_label)
    return predicted_label

# Test the inference function.
onnx_infer("That rude customer made me furious.")
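
The onnx_infer function returns only the top label. When a confidence score or the runner-up emotions are useful, the logits can be passed through a softmax. The sketch below is a minimal illustration and not part of the published repository: softmax and top_k_emotions are hypothetical helper names, the logits are made-up values standing in for outputs[0], and id2label is assumed to use integer keys.

```python
import numpy as np

def softmax(logits):
    """Convert logits of shape (batch, num_labels) into probabilities."""
    # Subtract the row maximum first for numerical stability.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def top_k_emotions(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for the first input."""
    probs = softmax(np.asarray(logits, dtype=np.float64))[0]
    order = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(id2label[int(i)], float(probs[i])) for i in order]

# Label order for ma2za/many_emotions as listed in this model card.
labels = {0: "anger", 1: "fear", 2: "joy", 3: "love",
          4: "sadness", 5: "surprise", 6: "neutral"}

# Made-up logits for illustration; in practice use outputs[0] from session.run.
example_logits = np.array([[4.0, 0.5, -1.0, -2.0, 1.0, 0.0, 0.2]])

for label, prob in top_k_emotions(example_logits, labels):
    print(f"{label}: {prob:.3f}")
```

Returning probabilities alongside the label makes it possible to apply a confidence threshold before acting on a prediction.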

Evaluation

The model is primarily evaluated using the accuracy metric during training. For deployment, further evaluation on unseen data is recommended to ensure robustness in production settings.
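
A held-out evaluation of this kind can be sketched as follows. The helper evaluate_accuracy, the sample sentences, and the keyword-based stand-in predictor are all hypothetical and exist only to keep the example self-contained; in practice the predictor would be a function such as onnx_infer, and the examples would come from data the model has not seen.

```python
from collections import Counter

def evaluate_accuracy(predict, examples):
    """Compute overall and per-label accuracy.

    predict:  callable mapping a text to a predicted label string
    examples: iterable of (text, gold_label) pairs
    """
    correct = Counter()
    total = Counter()
    for text, gold in examples:
        total[gold] += 1
        if predict(text) == gold:
            correct[gold] += 1
    n = sum(total.values())
    overall = sum(correct.values()) / n if n else 0.0
    per_label = {label: correct[label] / total[label] for label in total}
    return overall, per_label

# Hypothetical held-out examples, for illustration only.
held_out = [
    ("That rude customer made me furious.", "anger"),
    ("I am so happy today.", "joy"),
    ("The meeting is at noon.", "neutral"),
    ("I can't stop crying.", "sadness"),
]

def keyword_predict(text):
    """Stand-in predictor so the sketch runs without the model."""
    lowered = text.lower()
    if "furious" in lowered:
        return "anger"
    if "happy" in lowered:
        return "joy"
    return "neutral"

overall, per_label = evaluate_accuracy(keyword_predict, held_out)
print("overall accuracy:", overall)
print("per-label accuracy:", per_label)
```

Per-label accuracy is reported in addition to the overall figure because a single aggregate number can hide weak performance on less frequent emotions.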
