Sentiment Analysis Model v2.0
This is an improved version of the sentiment analysis model, fine-tuned with additional challenging examples to handle difficult cases like negation, sarcasm, and subtle expressions.
Model Details
- Model Type: DistilBERT (fine-tuned)
- Task: Binary Sentiment Classification (Positive/Negative)
- Training Data: IMDB Movie Reviews Dataset
- Language: English
- License: MIT
- Version: 2.0
Performance
| Metric | Value |
|---|---|
| Accuracy | 86.50% |
| F1 Score | 0.8672 |
| Precision | 84.21% |
| Recall | 89.47% |
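For readers who want to see how the four metrics relate, here is a minimal sketch that derives them from confusion-matrix counts. The counts are hypothetical and are not the model's actual evaluation results; the function name `classification_metrics` is also just an illustration.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute binary classification metrics from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    so that precision and recall are well defined.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total          # share of all predictions that are correct
    precision = tp / (tp + fp)            # share of predicted positives that are correct
    recall = tp / (tp + fn)               # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=85, fp=15, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Here "positive" refers to the POSITIVE sentiment class, so recall measures how many genuinely positive reviews the classifier recovers.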
Training Details
The model was trained on the IMDB dataset augmented with challenging examples specifically designed to improve performance on difficult sentiment analysis cases.
Training Hyperparameters
- Learning Rate: 2e-5
- Batch Size: 16 (effective batch size: 32 with gradient accumulation)
- Epochs: 3
- Optimizer: AdamW with weight decay
- Mixed Precision: FP16
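The batch-size entry implies two gradient accumulation steps (16 × 2 = 32). As a rough sketch of what that means for the number of optimizer updates, the snippet below works through the arithmetic. It assumes the standard 25,000-example IMDB training split; the augmented examples would add to that figure, and their count is not stated here. The helper names are hypothetical, and a given training framework may round partial batches differently.

```python
import math


def effective_batch_size(per_step_batch: int, accumulation_steps: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_step_batch * accumulation_steps


def optimizer_steps(num_examples: int, per_step_batch: int,
                    accumulation_steps: int, epochs: int) -> int:
    """Approximate optimizer updates, counting partial batches as full steps."""
    micro_batches = math.ceil(num_examples / per_step_batch)
    steps_per_epoch = math.ceil(micro_batches / accumulation_steps)
    return steps_per_epoch * epochs


# Assumption: 25,000 training examples (standard IMDB train split, no augmentation).
batch = effective_batch_size(per_step_batch=16, accumulation_steps=2)
steps = optimizer_steps(num_examples=25_000, per_step_batch=16,
                        accumulation_steps=2, epochs=3)
print(f"Effective batch size: {batch}")
print(f"Approximate optimizer steps over 3 epochs: {steps}")
```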
Usage
Direct Use with Pipeline
from transformers import pipeline
# Load the model
sentiment = pipeline("sentiment-analysis", model="shane-reaume/imdb-sentiment-analysis-v2")
# Analyze a single text
result = sentiment("I really enjoyed this movie!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]
# Batch processing: pass a list of texts and get one result per text
texts = [
    "This movie was absolutely amazing, I loved every minute of it!",
    "The acting was terrible and the plot made no sense at all.",
]
results = sentiment(texts)
for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Sentiment: {result['label']}, Score: {result['score']:.4f}")
Loading Model Directly
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "shane-reaume/imdb-sentiment-analysis-v2"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Prepare text
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
# Get prediction (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
# Convert logits to probabilities and pick the most likely class
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1).item()
confidence = probabilities[0][prediction].item()
# Map prediction to label (0: negative, 1: positive)
sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
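To make the post-processing in the direct-loading example easier to follow, the sketch below reproduces the same three steps (softmax, argmax, label mapping) in plain Python, without PyTorch. The logits are hypothetical values chosen for illustration, and the function names are not part of any library.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_label(logits: List[float]) -> Tuple[str, float]:
    """Return the predicted label and its probability (0: negative, 1: positive)."""
    probs = softmax(logits)
    prediction = max(range(len(probs)), key=probs.__getitem__)  # argmax
    label = "POSITIVE" if prediction == 1 else "NEGATIVE"
    return label, probs[prediction]


# Hypothetical logits for one input: [negative, positive].
label, confidence = logits_to_label([-2.0, 3.0])
print(f"Sentiment: {label}, Confidence: {confidence:.4f}")
```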
Limitations
- The model is trained primarily on movie reviews and may not perform as well on other domains.
- Despite the additional training examples, the model may still struggle with certain types of text:
  - Sarcasm and irony
  - Mixed sentiment expressions
  - Subtle negative expressions
  - Complex negations
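One way to check how these limitations affect a particular use case is to probe the classifier with a few hand-written examples per category. The sketch below is only an illustration: the probe sentences, their expected labels, and the names `PROBES`, `probe_accuracy`, and `keyword_stub` are all hypothetical, and the stub classifier stands in for the real model so the snippet runs without downloading anything.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical probe sentences with expected labels, one group per difficult category.
PROBES: Dict[str, List[Tuple[str, str]]] = {
    "sarcasm": [("Oh great, another two hours of my life I will never get back.", "NEGATIVE")],
    "mixed": [("The visuals were stunning, but the story bored me to tears.", "NEGATIVE")],
    "subtle_negative": [("I have seen better.", "NEGATIVE")],
    "negation": [("This was not a bad film at all.", "POSITIVE")],
}


def probe_accuracy(classify: Callable[[str], str],
                   probes: Dict[str, List[Tuple[str, str]]] = PROBES) -> Dict[str, float]:
    """Return the fraction of probes classified as expected, per category."""
    scores = {}
    for category, cases in probes.items():
        correct = sum(1 for text, expected in cases if classify(text) == expected)
        scores[category] = correct / len(cases)
    return scores


def keyword_stub(text: str) -> str:
    """Stand-in classifier: flags a text as negative if it contains a negative keyword."""
    lowered = text.lower()
    return "NEGATIVE" if "bad" in lowered or "bored" in lowered else "POSITIVE"


scores = probe_accuracy(keyword_stub)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

To probe the actual model, replace `keyword_stub` with a function that wraps the pipeline from the Usage section, for example `lambda text: sentiment(text)[0]["label"]`.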
Citation
If you use this model in your research, please cite:
@misc{sentiment-analysis-model,
author = {Your Name},
title = {Sentiment Analysis Model based on DistilBERT},
year = {2023},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/shane-reaume/imdb-sentiment-analysis-v2}}
}
Evaluation results (self-reported)
- Accuracy on IMDB test set: 86.50%
- F1 Score on IMDB test set: 0.867