# BERT-Base-Uncased Quantized Model for Social Media Sentiment Analysis

This repository hosts a quantized version of the **bert-base-uncased** model, fine-tuned for social media sentiment analysis tasks. The model has been optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments.

## Model Details

- **Model Architecture:** BERT Base Uncased
- **Task:** Social Media Sentiment Analysis
- **Dataset:** Social Media Sentiments Analysis Dataset (Kaggle)
- **Quantization:** Float16
- **Fine-tuning Framework:** Hugging Face Transformers

## Usage

### Installation

```sh
pip install transformers torch
```

### Loading the Model

```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the quantized model and tokenizer
model_name = "AventIQ-AI/bert-social-media-sentiment-analysis"
model = BertForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = BertTokenizer.from_pretrained(model_name)

# Predict the sentiment label for a single piece of text
def predict_sentiment(text):
    # Tokenize input text
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

    # Move tensors to GPU if available
    inputs = {key: val.to(device) for key, val in inputs.items()}

    # Get model prediction
    with torch.no_grad():
        outputs = model(**inputs)

    # Get predicted class
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=1).item()

    # Map back to sentiment labels
    sentiment_labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
    return sentiment_labels[predicted_class]

# Define a test sentence
test_sentence = "Spending time with family always brings me so much joy."
print(f"Predicted Sentiment: {predict_sentiment(test_sentence)}")
```

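As a quicker alternative, the same checkpoint can be used through the Transformers `pipeline` API. This is a minimal sketch, assuming the checkpoint's config maps class ids to label names (otherwise the outputs appear as `LABEL_0`, `LABEL_1`, `LABEL_2`):

```python
from transformers import pipeline
import torch

# Text-classification pipeline over the same checkpoint; device=0 selects
# the first GPU, -1 falls back to CPU.
classifier = pipeline(
    "text-classification",
    model="AventIQ-AI/bert-social-media-sentiment-analysis",
    device=0 if torch.cuda.is_available() else -1,
)

# Batch prediction over several posts at once
posts = [
    "Spending time with family always brings me so much joy.",
    "The service was slow and nobody answered my messages.",
]
for post, result in zip(posts, classifier(posts)):
    print(f"{post!r} -> {result['label']} ({result['score']:.2f})")
```
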
## Performance Metrics

- **Accuracy:** 0.82
- **Precision:** 0.68
- **Recall:** 0.82
- **F1 Score:** 0.73

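The evaluation script is not part of this repository; the snippet below is a minimal sketch of how metrics like these could be reproduced with scikit-learn (`pip install scikit-learn`). A labeled evaluation set is assumed, and weighted averaging is an assumption (though the reported recall matching the accuracy is consistent with it):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Placeholder evaluation set with labels 0=Negative, 1=Neutral, 2=Positive
texts = ["I love this!", "It was okay, I guess.", "Terrible experience."]
labels = [2, 1, 0]

# Reuse predict_sentiment from the usage example, mapping names back to ids
label_to_id = {"Negative": 0, "Neutral": 1, "Positive": 2}
preds = [label_to_id[predict_sentiment(t)] for t in texts]

print("Accuracy: ", accuracy_score(labels, preds))
print("Precision:", precision_score(labels, preds, average="weighted"))
print("Recall:   ", recall_score(labels, preds, average="weighted"))
print("F1 Score: ", f1_score(labels, preds, average="weighted"))
```
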
## Fine-Tuning Details

### Dataset

The model was fine-tuned on the Social Media Sentiments Analysis dataset from Kaggle.

### Training

- Number of epochs: 6
- Batch size: 8
- Evaluation strategy: epoch
- Learning rate: 3e-5

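The original training script is not included here; the sketch below shows one way these hyperparameters could be wired into the Hugging Face `Trainer` API. `output_dir`, `train_dataset`, and `eval_dataset` are placeholders, not files from this repo, and newer Transformers versions rename `evaluation_strategy` to `eval_strategy`:

```python
from transformers import Trainer, TrainingArguments

# Hypothetical wiring of the listed hyperparameters into TrainingArguments
training_args = TrainingArguments(
    output_dir="./results",          # placeholder output directory
    num_train_epochs=6,              # Number of epochs: 6
    per_device_train_batch_size=8,   # Batch size: 8
    evaluation_strategy="epoch",     # Evaluation strategy: epoch
    learning_rate=3e-5,              # Learning rate: 3e-5
)

trainer = Trainer(
    model=model,                  # the BertForSequenceClassification model from above
    args=training_args,
    train_dataset=train_dataset,  # placeholder: tokenized training split
    eval_dataset=eval_dataset,    # placeholder: tokenized validation split
)
trainer.train()
```
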
### Quantization

Post-training quantization to Float16 was applied using PyTorch's built-in half-precision support to reduce the model size and improve inference efficiency.

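A minimal sketch of Float16 post-training quantization, assuming the full-precision fine-tuned model is already in memory (the exact procedure used to produce this checkpoint is not documented here):

```python
import torch

# Cast all floating-point parameters and buffers to Float16.
# This roughly halves the model size; minor accuracy loss is
# possible, as noted under Limitations.
model = model.half()

# Persist the quantized weights in Hugging Face format (assumption:
# the hosted files were produced with save_pretrained)
model.save_pretrained("./bert-sentiment-fp16")
tokenizer.save_pretrained("./bert-sentiment-fp16")
```
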
## Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned model weights
└── README.md            # Model documentation
```

## Limitations

- The model may not generalize well to domains outside the fine-tuning dataset.
- Quantization may result in minor accuracy degradation compared to full-precision models.

## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.