Sentiment Analysis Model v2.0

This is an improved version of the sentiment analysis model, fine-tuned with additional challenging examples to better handle difficult cases such as negation, sarcasm, and subtle expressions.

Model Details

  • Model Type: DistilBERT (fine-tuned)
  • Task: Binary Sentiment Classification (Positive/Negative)
  • Training Data: IMDB Movie Reviews Dataset
  • Language: English
  • License: MIT
  • Version: 2.0

Performance

Metric      Value
Accuracy    86.50%
F1 Score    0.8672
Precision   84.21%
Recall      89.47%
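
For reference, the four metrics above follow the standard binary-classification definitions. The sketch below computes them from scratch on a small made-up set of labels. The `binary_metrics` helper and the `y_true` / `y_pred` lists are illustrative assumptions, not the evaluation code or data used for this model.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only (not the model's test set)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On this toy data, recall is higher than precision, the same pattern as in the table: more of the true positives are found than the positive predictions are correct.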

Training Details

The model was trained on the IMDB dataset augmented with challenging examples specifically designed to improve performance on difficult sentiment analysis cases.

Training Hyperparameters

  • Learning Rate: 2e-5
  • Batch Size: 16 (effective batch size: 32 with gradient accumulation)
  • Epochs: 3
  • Optimizer: AdamW with weight decay
  • Mixed Precision: FP16
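
As a rough sketch, the hyperparameters above could be written as keyword arguments for `transformers.TrainingArguments`. The actual training script is not shown here, so several details are assumptions: that training ran on a single device (so 2 accumulation steps give the effective batch size of 32), and the `weight_decay` value, which the list does not specify.

```python
# Keyword arguments mirroring the hyperparameters listed above.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    # Assumption: single device, so 16 * 2 = 32 effective batch size
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 3,
    # Assumption: placeholder value; the card only says "with weight decay"
    "weight_decay": 0.01,
    # FP16 mixed precision; requires a CUDA-capable GPU at training time
    "fp16": True,
}

effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(f"Effective batch size: {effective_batch_size}")

# With transformers installed and a GPU available, these could be passed as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="./results", **training_kwargs)
# ("./results" is a hypothetical output path.)
```

AdamW is the default optimizer used by the `Trainer`, so it needs no explicit argument in this sketch.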

Usage

Direct Use with Pipeline

from transformers import pipeline

# Load the model
sentiment = pipeline("sentiment-analysis", model="shane-reaume/imdb-sentiment-analysis-v2")

# Analyze text
result = sentiment("I really enjoyed this movie!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]

# Batch processing
texts = [
    "This movie was absolutely amazing, I loved every minute of it!",
    "The acting was terrible and the plot made no sense at all."
]
results = sentiment(texts)
for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Sentiment: {result['label']}, Score: {result['score']:.4f}")

Loading Model Directly

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "shane-reaume/imdb-sentiment-analysis-v2"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    
# Process outputs
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1).item()
confidence = probabilities[0][prediction].item()

# Map prediction to label (0: negative, 1: positive)
sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
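
To make the post-processing step above concrete, here is a dependency-free sketch of what the softmax and argmax do to a pair of logits. The logit values are made up for illustration, and the `softmax` helper is a plain-Python stand-in for `torch.nn.functional.softmax`, not part of the model's code.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits in the order [negative, positive]
logits = [-1.2, 2.3]

probabilities = softmax(logits)
# argmax: index of the highest probability
prediction = max(range(len(probabilities)), key=probabilities.__getitem__)
confidence = probabilities[prediction]

# Same mapping as above (0: negative, 1: positive)
sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
```

The reported confidence is simply the probability assigned to the winning class, so it is never below 0.5 for a two-class model.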

Limitations

  • The model is trained primarily on movie reviews and may not perform as well on other domains.
  • The model may struggle with certain types of text:
    • Sarcasm and irony
    • Mixed sentiment expressions
    • Subtle negative expressions
    • Complex negations
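
One possible way to guard against these cases in practice is to treat low-confidence predictions as undecided rather than forcing a label. The sketch below assumes results in the pipeline's output format (a list of dicts with `label` and `score`). The `flag_uncertain` helper, the `UNCERTAIN` label, and the 0.75 threshold are hypothetical choices for illustration and have not been tuned for this model.

```python
def flag_uncertain(results, threshold=0.75):
    """Return one decision per result, marking low-score predictions as UNCERTAIN."""
    decisions = []
    for r in results:
        # Keep the model's label only when its score clears the threshold
        label = r["label"] if r["score"] >= threshold else "UNCERTAIN"
        decisions.append(label)
    return decisions


# Made-up outputs in the pipeline's result format (not real model predictions)
sample_results = [
    {"label": "POSITIVE", "score": 0.9998},
    {"label": "NEGATIVE", "score": 0.58},
]

decisions = flag_uncertain(sample_results)
print(decisions)  # ['POSITIVE', 'UNCERTAIN']
```

A high score does not guarantee a correct label, particularly for sarcasm, so a filter like this only reduces the risk and does not remove it.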

Citation

If you use this model in your research, please cite:

@misc{sentiment-analysis-model,
  author = {Your Name},
  title = {Sentiment Analysis Model based on DistilBERT},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/shane-reaume/imdb-sentiment-analysis-v2}}
}