---
library_name: transformers
language:
- en
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: text-classification
tags:
- phishing-detection
- email-security
- transformers
- distilbert
---

# 📧 Model Card for [aamoshdahal/email-phishing-distilbert-finetuned](https://huggingface.co/aamoshdahal/email-phishing-distilbert-finetuned)

This model is a fine-tuned version of **DistilBERT (distilbert-base-uncased)** trained specifically for **phishing email detection**. It classifies email content into two categories: **phishing** and **legitimate**.

The model was trained using a [`Phishing Email Dataset`](https://www.kaggle.com/datasets/naserabdullahalam/phishing-email-dataset?select=phishing_email.csv) and evaluated against the [`cybersectony/PhishingEmailDetectionv2.0`](https://huggingface.co/datasets/cybersectony/PhishingEmailDetectionv2.0) dataset.

It is optimized for:

- **High recall** to catch most phishing attempts
- **High precision** to reduce false positives
- **Fast inference** via the lightweight DistilBERT architecture
- **Interpretability**, with support for token-level explanations using [`transformers-interpret`](https://github.com/cdpierse/transformers-interpret)

This model is suited to security tools, email scanning systems, awareness training platforms, and research on adversarial phishing attacks.

## Model Details

### Model Description

This is a fine-tuned DistilBERT model trained to classify email content as either **phishing** or **legitimate**. It was developed as part of a cybersecurity research project to detect phishing attempts in email messages using a fine-tuned transformer model.

- **Developed by:** [@aamoshdahal](https://huggingface.co/aamoshdahal)
- **Model type:** DistilBERT (Transformer-based sequence classifier)
- **Language(s):** English
- **Finetuned from model:** distilbert-base-uncased

### Intended Uses & Users

This model is intended to be used as a lightweight and reliable phishing email detector.
It can be integrated into:

- **Email clients or gateways** to filter phishing emails in real time
- **Security software or firewalls** as an additional phishing classifier
- **Educational tools** for training users to recognize phishing attempts
- **Research environments** to study adversarial or evolving phishing tactics

#### Foreseeable Users:

- Cybersecurity professionals
- Software developers integrating NLP into email platforms
- Researchers working on phishing detection

#### Foreseeable Impact:

- Improved early detection of phishing attacks
- Reduced exposure to credential theft and fraud
- Increased public understanding of phishing strategies

## 🚀 How to Get Started with the Model

You can use the code snippet below to load the fine-tuned model, make predictions on any email content, and visualize token-level attributions:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers_interpret import SequenceClassificationExplainer

# Load the model and tokenizer from Hugging Face Hub
model_id = "aamoshdahal/email-phishing-distilbert-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Set device (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example email for prediction
email = """Dear user,

We detected suspicious activity on your account. Please verify your identity immediately by clicking the link below to avoid suspension.

[Phishing Link Here]

Thank you,
Security Team"""

# Tokenize and prepare the input
encoded_input = tokenizer(email, return_tensors='pt', truncation=True, padding=True).to(device)

# Make prediction (no gradients needed for inference)
with torch.no_grad():
    outputs = model(**encoded_input)
    probs = torch.nn.functional.softmax(outputs.logits, dim=1)

# Output prediction (index 0 = legitimate, index 1 = phishing)
labels = ["legitimate", "phishing"]
pred_label = labels[probs.argmax().item()]
confidence = probs.max().item()
print(f"Prediction: {pred_label} ({confidence:.2%} confidence)")

# Token-level attributions with respect to class index 0 ("LABEL_0")
explainer = SequenceClassificationExplainer(model=model, tokenizer=tokenizer)
word_attributions = explainer(email, class_name="LABEL_0")
explainer.visualize()
```

## 🏋️‍♂️ Training Details

### 📦 Training Data

The model was fine-tuned on a **balanced phishing email dataset** compiled from multiple public sources, including:

- Enron Email Dataset
- CEAS 2008 Phishing Corpus
- Ling-Spam Dataset
- SpamAssassin
- Nazario Phishing Emails
- Nigerian Fraud Email Dataset

These were aggregated and preprocessed via the [Phishing Email Dataset on Kaggle](https://www.kaggle.com/datasets/Alam97/phishing-email-dataset). Each entry includes a `text_combined` field, which concatenates the subject line, body text, sender address, and timestamp to provide full context for classification.

---

### ⚙️ Training Procedure

This model was fine-tuned using the Hugging Face 🤗 `Trainer` API with the following configuration:

- **Base model**: `distilbert-base-uncased`
- **Architecture**: Transformer-based sequence classifier (`DistilBertForSequenceClassification`)
- **Epochs**: 3
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Weight decay**: 0.01
- **Evaluation strategy**: Per epoch
- **Monitoring**: All metrics logged via Weights & Biases (W&B)

The model was trained using an NVIDIA A100 GPU (40 GB VRAM) on Google Colab Pro.
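
As a rough illustration of how the configuration listed above could map onto `TrainingArguments`, here is a minimal sketch. It is not the author's training script: the output directory, the choice to apply the batch size of 16 per device for both training and evaluation, and the `report_to="wandb"` setting are assumptions. The evaluation-strategy keyword was renamed across `transformers` releases, so the sketch checks which name is available.

```python
import inspect

# Hyperparameters copied from the configuration list above.
# Applying the batch size per device to both training and evaluation is an assumption.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
}


def build_training_args(output_dir="./phishing-distilbert"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is a hypothetical path)."""
    # Imported lazily so the hyperparameter dict is usable without transformers installed.
    from transformers import TrainingArguments

    # Newer releases use `eval_strategy`; older ones use `evaluation_strategy`.
    params = inspect.signature(TrainingArguments.__init__).parameters
    eval_key = "eval_strategy" if "eval_strategy" in params else "evaluation_strategy"

    return TrainingArguments(
        output_dir=output_dir,
        report_to="wandb",  # assumption: W&B logging enabled through the Trainer integration
        **{eval_key: "epoch"},  # evaluate once per epoch
        **HYPERPARAMS,
    )
```

The returned object would then be passed to `Trainer` together with the model, the tokenized train and validation splits, and a metrics function.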
#### Preprocessing

- Duplicate and null record removal
- Lowercasing and whitespace cleanup
- Tokenization using `DistilBertTokenizer`
- Label encoding (0 = legitimate, 1 = phishing)
- Random undersampling to ensure class balance

## 📊 Evaluation Results

For updated results and runs, see the public W&B project: [Full Report](https://wandb.ai/dahalaamosh-harrisburg-university/Phishing_Detection_DistilBERT_Uncased)

The fine-tuned DistilBERT model was evaluated on a test dataset containing both phishing and legitimate emails. Below is a summary of its performance compared to baseline models (raw DistilBERT and raw BERT):

### 📈 Fine-Tuned DistilBERT (Best Performing)

| Epoch | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 Score | ROC AUC |
|-------|---------------|-----------------|----------|-----------|--------|----------|---------|
| 1 | 0.0323 | 0.0243 | 0.9936 | 0.9916 | 0.9961 | 0.9939 | 0.9996 |
| 2 | 0.0083 | 0.0297 | 0.9938 | 0.9968 | 0.9912 | 0.9940 | 0.9998 |
| 3 | 0.0044 | 0.0275 | **0.9951** | **0.9959** | **0.9947** | **0.9953** | **0.9997** |

- **Test Set Summary:**
  - Accuracy: **96.62%**
  - Precision: **95.90%**
  - Recall: **97.46%**
  - F1 Score: **96.67%**
  - ROC AUC: **0.9953**

---

### ⚠️ Raw DistilBERT (Untrained)

- Accuracy: 49.57%
- Precision: 0.00%
- Recall: 0.00%
- F1 Score: 0.00%
- ROC AUC: 0.5694

---

### ⚠️ Raw BERT (Untrained)

- Accuracy: 49.57%
- Precision: 0.00%
- Recall: 0.00%
- F1 Score: 0.00%
- ROC AUC: 0.4984

---
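
The baseline numbers above (0% precision and recall with roughly 50% accuracy) are consistent with a classifier that never predicts the phishing class on a near-balanced test set. The sketch below shows how these metrics follow from confusion counts, with label 1 = phishing as in the label encoding. The function name `binary_metrics` and the toy labels are made up for illustration and are not the model's evaluation code or data; treating undefined precision as 0 is a convention assumed here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Convention assumed here: undefined ratios (zero denominator) are reported as 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = phishing, 0 = legitimate).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)

# A classifier that always answers "legitimate": accuracy equals the share of
# legitimate emails, while precision, recall, and F1 all collapse to 0.
always_legit = binary_metrics(y_true, [0] * len(y_true))

print(metrics)
print(always_legit)
```

On this toy data the always-legitimate predictor scores 50% accuracy with zero precision, recall, and F1, the same pattern as the raw baselines.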