Sentiment Analysis Using LSTM and CNN
This project implements a hybrid deep learning model combining Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) for sentiment analysis. The architecture leverages the strengths of both LSTM and CNN layers to process textual data and classify sentiment effectively.
Model Architecture
The architecture consists of two parallel branches that process the input text sequences and merge their outputs for final classification:
Branch 1: CNN-Based Processing
- Embedding Layer: Converts input sequences into dense vector representations.
- Conv1D + Activation: Extracts local features from the text using convolutional filters.
- MaxPooling1D: Reduces the spatial dimensions while retaining the most important features.
- BatchNormalization: Normalizes the activations to stabilize and accelerate training.
- Conv1D + MaxPooling1D + BatchNormalization: Repeats the convolution and pooling process to extract deeper features.
- Flatten: Converts the 2D feature maps into a 1D vector.
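The CNN branch above can be sketched with the Keras functional API. This is a minimal illustration, not the project's exact configuration: the vocabulary size, sequence length, embedding dimension, and filter counts are placeholder values.

```python
# Hypothetical sketch of the CNN branch; all hyperparameters are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, SEQ_LEN, EMBED_DIM = 1000, 50, 32  # assumed values

inputs = layers.Input(shape=(SEQ_LEN,))
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)          # dense token vectors
x = layers.Conv1D(64, kernel_size=3, activation="relu")(x)   # local n-gram features
x = layers.MaxPooling1D(pool_size=2)(x)                      # keep strongest activations
x = layers.BatchNormalization()(x)                           # stabilize training
x = layers.Conv1D(32, kernel_size=3, activation="relu")(x)   # deeper features
x = layers.MaxPooling1D(pool_size=2)(x)
x = layers.BatchNormalization()(x)
cnn_out = layers.Flatten()(x)                                # 1-D feature vector

cnn_branch = tf.keras.Model(inputs, cnn_out)
```

With these placeholder sizes the branch maps a batch of token-id sequences of shape `(batch, 50)` to a flat feature vector.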
Branch 2: LSTM-Based Processing
- Embedding Layer: Similar to the CNN branch, converts input sequences into dense vector representations.
- Bidirectional LSTM: Captures long-term dependencies in the text by processing it in both forward and backward directions.
- LayerNormalization: Normalizes the outputs of the LSTM layer.
- Bidirectional GRU: Further processes the sequence with Gated Recurrent Units for efficiency.
- LayerNormalization: Normalizes the GRU outputs.
- Flatten: Converts the sequence outputs into a 1D vector.
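The recurrent branch follows the same pattern. Again a hedged sketch: the unit counts (32 LSTM units, 16 GRU units) and sequence settings are assumptions, not the project's actual values. `return_sequences=True` keeps per-timestep outputs so the stacked GRU and the final Flatten have a full sequence to work with.

```python
# Hypothetical sketch of the LSTM/GRU branch; unit counts are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, SEQ_LEN, EMBED_DIM = 1000, 50, 32  # assumed values

inputs = layers.Input(shape=(SEQ_LEN,))
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)  # forward + backward context
x = layers.LayerNormalization()(x)
x = layers.Bidirectional(layers.GRU(16, return_sequences=True))(x)   # lighter recurrent pass
x = layers.LayerNormalization()(x)
rnn_out = layers.Flatten()(x)                                        # 1-D feature vector

rnn_branch = tf.keras.Model(inputs, rnn_out)
```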
Merging and Classification
- Concatenate: Combines the outputs of the CNN and LSTM branches.
- Dense Layers with Dropout: Fully connected layers with ReLU activation and dropout for regularization.
- Output Layer: A dense layer with a softmax activation function to classify the sentiment into three categories: Positive, Neutral, and Negative.
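Putting the pieces together, the merge step concatenates the two branches' flattened outputs and feeds them through dense layers to a three-way softmax. The sketch below uses deliberately shrunken branches so it stays self-contained; layer sizes and the dropout rate are assumptions.

```python
# Hypothetical end-to-end sketch: two parallel branches merged for 3-class output.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, SEQ_LEN = 1000, 50  # assumed values

inputs = layers.Input(shape=(SEQ_LEN,))

# Minimal CNN branch
emb_cnn = layers.Embedding(VOCAB_SIZE, 32)(inputs)
cnn = layers.Conv1D(32, kernel_size=3, activation="relu")(emb_cnn)
cnn = layers.MaxPooling1D(pool_size=2)(cnn)
cnn = layers.Flatten()(cnn)

# Minimal recurrent branch
emb_rnn = layers.Embedding(VOCAB_SIZE, 32)(inputs)
rnn = layers.Bidirectional(layers.LSTM(16, return_sequences=True))(emb_rnn)
rnn = layers.Flatten()(rnn)

merged = layers.Concatenate()([cnn, rnn])            # join both feature vectors
x = layers.Dense(64, activation="relu")(merged)
x = layers.Dropout(0.5)(x)                           # regularization
outputs = layers.Dense(3, activation="softmax")(x)   # Positive / Neutral / Negative

model = tf.keras.Model(inputs, outputs)
```

Each row of the softmax output is a probability distribution over the three sentiment classes.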
Why LSTM + CNN for Sentiment Analysis?
LSTM Strengths
- LSTMs are well-suited for capturing long-term dependencies in sequential data, such as text.
- They excel at understanding the context and relationships between words in a sentence.
CNN Strengths
- CNNs are effective at extracting local patterns and features, such as n-grams, from text data.
- They are computationally efficient and can process data in parallel.
Hybrid Approach
By combining LSTM and CNN, the model benefits from:
- Contextual Understanding: LSTM captures the sequential nature of text.
- Feature Extraction: CNN identifies important local patterns.
- Robustness: The merged architecture ensures better generalization and performance on sentiment classification tasks.
Applications
This model can be used for:
- Social media sentiment analysis (e.g., Twitter, Reddit).
- Customer feedback classification.
- Opinion mining in reviews and surveys.
Training and Evaluation
The model is trained on labeled datasets with text and sentiment labels. It uses:
- Sparse Categorical Crossentropy as the loss function.
- AdamW Optimizer for efficient training.
- Early Stopping and Model Checkpoints to prevent overfitting and save the best model.
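A training setup along these lines might look as follows. This is a sketch, assuming TensorFlow 2.11+ (where `tf.keras.optimizers.AdamW` is available); the stand-in model, learning rate, patience, checkpoint path, and the random data are all placeholders.

```python
# Hypothetical training loop wiring: AdamW, sparse categorical crossentropy,
# early stopping, and model checkpointing. All hyperparameters are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Tiny stand-in model; in practice the hybrid CNN + LSTM model would be used.
model = tf.keras.Sequential([
    layers.Input(shape=(50,)),
    layers.Embedding(1000, 16),
    layers.Flatten(),
    layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",   # integer labels: 0, 1, 2
    metrics=["accuracy"],
)

cbs = [
    callbacks.EarlyStopping(monitor="val_loss", patience=3,
                            restore_best_weights=True),
    callbacks.ModelCheckpoint("best_model.keras", monitor="val_loss",
                              save_best_only=True),
]

# Random data purely for illustration; a real run uses tokenized text + labels.
X = np.random.randint(0, 1000, size=(64, 50))
y = np.random.randint(0, 3, size=(64,))
history = model.fit(X, y, validation_split=0.25, epochs=2,
                    callbacks=cbs, verbose=0)
```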
The performance is evaluated using metrics like accuracy, confusion matrix, and classification report.
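For reference, a confusion matrix and accuracy can be computed with plain NumPy (in practice one would typically reach for scikit-learn's `confusion_matrix` and `classification_report`). The labels below are invented for illustration.

```python
# Minimal NumPy sketch of a confusion matrix and accuracy for 3-class sentiment.
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=3):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Illustrative labels: 0 = Negative, 1 = Neutral, 2 = Positive.
y_true = np.array([0, 1, 2, 2, 1, 0, 2])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2])

cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()   # correct predictions lie on the diagonal
print(cm)
print(f"accuracy = {accuracy:.3f}")
```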
Conclusion
The hybrid LSTM + CNN architecture provides a powerful framework for sentiment analysis, combining the strengths of sequential modeling and feature extraction. This approach is versatile and can be adapted to various text classification tasks.
License
MIT License