Sentiment Analysis Model (Vibescribe)

Vibescribe is a sentiment analysis model built with Hugging Face Transformers and fine-tuned on IMDB reviews.

Setup

  1. Clone the repository:
git clone https://github.com/your-username/sentiment-analysis
cd sentiment-analysis
  2. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Log in to Hugging Face:
huggingface-cli login

Project Structure

sentiment-analysis/
├── requirements.txt
├── train.py
├── inference.py
├── utils.py
└── README.md

Files to Create

requirements.txt

transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
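
After installing, it can help to confirm that the environment actually matches the pins above. The following is a minimal sketch and not part of the repository: parse_pins and check_pins are hypothetical helper names, and the check relies only on the standard library's importlib.metadata.

```python
from importlib import metadata

# The pins from requirements.txt, copied here so the sketch is self-contained.
REQUIREMENTS = """\
transformers==4.37.2
datasets==2.16.1
torch==2.1.2
scikit-learn==1.4.0
"""

def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            # Drop local build tags such as '+cu121' before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches

if __name__ == "__main__":
    for name, (expected, installed) in check_pins(parse_pins(REQUIREMENTS)).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Running it inside the activated virtual environment prints one line per package whose installed version differs from the pin, and nothing if everything matches.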

utils.py

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

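def compute_metrics(pred):
    """Return accuracy, F1, precision and recall for a Trainer evaluation step."""
    labels = pred.label_ids
    # Pick the highest-scoring class for each example
    preds = pred.predictions.argmax(-1)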
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average='binary')
    return {
        'accuracy': accuracy_score(labels, preds),
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
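
To make the four numbers returned by compute_metrics concrete, here is a dependency-free sketch that derives them from confusion counts on a tiny made-up batch. It assumes label 1 is the positive class, which is the scikit-learn default for average='binary'. binary_metrics is a hypothetical name used only for illustration.

```python
def binary_metrics(labels, preds):
    """Compute accuracy, precision, recall and F1 for 0/1 labels (1 = positive)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(labels),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

# Made-up example: 8 reviews, 4 positive and 4 negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = binary_metrics(labels, preds)
print(metrics)
```

With three true positives, one false positive and one false negative, all four metrics come out to 0.75 for this batch.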

inference.py

from transformers import pipeline

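def load_model(model_path):
    """Load a sentiment-analysis pipeline from a local path or Hub model ID."""
    return pipeline("sentiment-analysis", model=model_path)

def predict(classifier, text):
    """Classify text; returns a list of dicts with 'label' and 'score' keys."""
    return classifier(text)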

if __name__ == "__main__":
    model_path = "your-username/sentiment-analysis-model"
    classifier = load_model(model_path)
    
    # Example prediction
    text = "This movie was really great!"
    result = predict(classifier, text)
    print(f"Text: {text}\nSentiment: {result}")

Training

  1. Update model configuration in train.py:
training_args = TrainingArguments(
    output_dir="sentiment-analysis-model",
    hub_model_id="your-username/sentiment-analysis-model",  # Change this
    ...
)
  2. Start training:
python train.py
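
This README does not list the contents of train.py, so the following is only a sketch of what such a script might look like with the pinned library versions. Everything beyond output_dir and hub_model_id is an assumption: the distilbert-base-uncased checkpoint, the hyperparameters, and the helper name build_training_kwargs are illustrative, not the settings used for the published model.

```python
# Hypothetical train.py sketch; the hyperparameters below are illustrative
# assumptions, not the values used to train the published model.

def build_training_kwargs(hub_model_id):
    """Keyword arguments for TrainingArguments (assumed values)."""
    return {
        "output_dir": "sentiment-analysis-model",
        "hub_model_id": hub_model_id,
        "push_to_hub": True,
        "evaluation_strategy": "epoch",
        "save_strategy": "epoch",
        "num_train_epochs": 3,
        "per_device_train_batch_size": 16,
        "per_device_eval_batch_size": 16,
        "learning_rate": 2e-5,
        "weight_decay": 0.01,
    }

def main(hub_model_id="your-username/sentiment-analysis-model"):
    # Heavy imports stay inside main() so the module can be imported cheaply.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )
    from utils import compute_metrics

    base = "distilbert-base-uncased"  # assumed DistilBERT checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

    # IMDB provides 'text' and 'label' columns; tokenize the text in batches.
    dataset = load_dataset("imdb")
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True),
        batched=True,
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(**build_training_kwargs(hub_model_id)),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
        tokenizer=tokenizer,  # lets the default collator pad each batch
        compute_metrics=compute_metrics,
    )
    trainer.train()
    trainer.push_to_hub()
```

Calling main() at the bottom of the file, for example under an if __name__ == "__main__": guard, would let python train.py start the run.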

Making Predictions

from inference import load_model, predict

classifier = load_model("your-username/sentiment-analysis-model")
result = predict(classifier, "Your text here")
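
The pipeline returns a list with one dict per input, each holding a label and a score. If the uploaded model's config does not define readable label names, the labels come back as LABEL_0 and LABEL_1. The sketch below maps them to words, assuming the IMDB convention of 0 = negative and 1 = positive. describe and LABEL_NAMES are hypothetical names, and the sample result is made up rather than real model output.

```python
# Assumed mapping from default label IDs to the IMDB classes.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}

def describe(result):
    """Format pipeline output such as [{'label': 'LABEL_1', 'score': 0.98}]."""
    lines = []
    for item in result:
        # Fall back to the raw label if the model already uses readable names.
        name = LABEL_NAMES.get(item["label"], item["label"].lower())
        lines.append(f"{name} ({item['score']:.1%})")
    return lines

# Made-up result in the shape the pipeline returns.
sample = [{"label": "LABEL_1", "score": 0.9812}]
print(describe(sample))
```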

Model Details

  • Base model: DistilBERT
  • Dataset: IMDB Reviews
  • Task: Binary sentiment classification (positive/negative)
  • Training time: ~2-3 hours on GPU
  • Model size: ~260MB
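
The model size follows from the parameter count. As a rough check, the sketch below assumes about 67 million parameters for DistilBERT with a classification head and 4 bytes per float32 weight. Both figures are approximations, not measurements of this checkpoint.

```python
params = 67_000_000        # approximate parameter count (assumption)
bytes_per_param = 4        # float32 weights

size_bytes = params * bytes_per_param
size_mb = size_bytes / 1_000_000   # decimal megabytes
size_mib = size_bytes / 2**20      # binary mebibytes

print(f"{size_mb:.0f} MB, {size_mib:.0f} MiB")
```

Under these assumptions the weights alone come to roughly 256 to 268 units depending on whether megabytes or mebibytes are meant, which is consistent with the ~260MB listed above.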

Performance Metrics

  • Accuracy: ~91-93%
  • F1 Score: ~91-92%
  • Precision: ~90-91%
  • Recall: ~91-92%

Contributing

  1. Fork the repository
  2. Create feature branch
  3. Commit changes
  4. Push to branch
  5. Open pull request

License

MIT License
