---
library_name: transformers
language:
- en
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: text-classification
tags:
- phishing-detection
- email-security
- transformers
- distilbert
---

# 📧 Model Card for [aamoshdahal/email-phishing-distilbert-finetuned](https://huggingface.co/aamoshdahal/email-phishing-distilbert-finetuned)

This model is a fine-tuned version of **DistilBERT (distilbert-base-uncased)** trained specifically for **phishing email detection**. It classifies email content into two categories: **phishing** and **legitimate**.

The model was trained using a [`Phishing Email Dataset`](https://www.kaggle.com/datasets/naserabdullahalam/phishing-email-dataset?select=phishing_email.csv) and evaluated against the [`cybersectony/PhishingEmailDetectionv2.0`](https://huggingface.co/datasets/cybersectony/PhishingEmailDetectionv2.0) dataset.

It is optimized for:
- **High recall** to catch most phishing attempts
- **High precision** to reduce false positives
- **Fast inference** via the lightweight DistilBERT architecture
- **Interpretability**, with support for token-level explanations using [`transformers-interpret`](https://github.com/cdpierse/transformers-interpret)

This model is suited to security tools, email scanning systems, awareness training platforms, and research on adversarial phishing attacks.

## Model Details

### Model Description

This is a fine-tuned DistilBERT model trained to classify email content as either **phishing** or **legitimate**. It was developed as part of a cybersecurity research project to detect phishing attempts in email messages using a fine-tuned transformer model.

- **Developed by:** [@aamoshdahal](https://huggingface.co/aamoshdahal)
- **Model type:** DistilBERT (Transformer-based sequence classifier)
- **Language(s):** English
- **Finetuned from model:** distilbert-base-uncased


### Intended Uses & Users

This model is intended to be used as a lightweight and reliable phishing email detector. It can be integrated into:

- **Email clients or gateways** to filter phishing emails in real time
- **Security software or firewalls** as an additional phishing classifier
- **Educational tools** for training users to recognize phishing attempts
- **Research environments** to study adversarial or evolving phishing tactics

#### Foreseeable Users:
- Cybersecurity professionals
- Software developers integrating NLP into email platforms
- Researchers working on phishing detection

#### Foreseeable Impact:
- Improved early detection of phishing attacks
- Reduced exposure to credential theft and fraud
- Increased public understanding of phishing strategies

## 🚀 How to Get Started with the Model

You can use the code snippet below to quickly load the fine-tuned model and make predictions on any email content:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from transformers_interpret import SequenceClassificationExplainer

# Load the model and tokenizer from Hugging Face Hub
model_id = "aamoshdahal/email-phishing-distilbert-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Set device (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Example email for prediction
email = """Dear user,

We detected suspicious activity on your account. Please verify your identity immediately by clicking the link below to avoid suspension.

[Phishing Link Here]

Thank you,
Security Team"""

# Tokenize and prepare the input
encoded_input = tokenizer(email, return_tensors='pt', truncation=True, padding=True).to(device)

# Make prediction
with torch.no_grad():
    outputs = model(**encoded_input)
    probs = torch.nn.functional.softmax(outputs.logits, dim=1)

# Output prediction
labels = ["legitimate", "phishing"]
pred_label = labels[probs.argmax(dim=1).item()]
confidence = probs.max().item()

print(f"Prediction: {pred_label} ({confidence:.2%} confidence)")

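# Token-level attributions via transformers-interpret.
# class_name="LABEL_0" assumes the default label names in the model config;
# index 0 is "legitimate" under the label encoding used in training.
explainer = SequenceClassificationExplainer(model=model, tokenizer=tokenizer)
word_attributions = explainer(email, class_name="LABEL_0")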
explainer.visualize()
```
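
The last steps of the snippet (softmax over the logits, then picking a label and a confidence) can be illustrated without loading the model. The sketch below uses only the standard library. The logits are made-up values, and `phishing_threshold` is a hypothetical parameter that is not part of the released model; it is included only to show how a deployment might trade precision for recall.

```python
import math

# Label order follows the card's encoding: 0 = legitimate, 1 = phishing.
LABELS = ["legitimate", "phishing"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, phishing_threshold=0.5):
    """Return (label, confidence) for the two logits of one email.

    phishing_threshold is a hypothetical knob: lowering it flags more
    emails as phishing (higher recall, lower precision).
    """
    probs = softmax(logits)
    if probs[1] >= phishing_threshold:
        return LABELS[1], probs[1]
    return LABELS[0], probs[0]


# Hypothetical logits for illustration, not real model output.
example_logits = [-1.2, 2.3]
label, confidence = decide(example_logits)
print(f"Prediction: {label} ({confidence:.2%} confidence)")
```

With the default threshold of 0.5 this matches the argmax used in the snippet above, except for an exact tie, which this sketch resolves as phishing.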

## 🏋️‍♂️ Training Details

### 📦 Training Data

The model was fine-tuned on a **balanced phishing email dataset** compiled from multiple public sources, including:

- Enron Email Dataset  
- CEAS 2008 Phishing Corpus  
- Ling-Spam Dataset  
- SpamAssassin  
- Nazario Phishing Emails  
- Nigerian Fraud Email Dataset  

These were aggregated and preprocessed via the [Phishing Email Dataset on Kaggle](https://www.kaggle.com/datasets/Alam97/phishing-email-dataset). Each record includes a `text_combined` field that concatenates the subject line, body text, sender address, and timestamp to provide full context for classification.

---

### ⚙️ Training Procedure

This model was fine-tuned using the Hugging Face 🤗 `Trainer` API with the following configuration:

- **Base model**: `distilbert-base-uncased`
- **Architecture**: Transformer-based sequence classifier (`DistilBertForSequenceClassification`)
- **Epochs**: 3  
- **Batch size**: 16  
- **Learning rate**: 2e-5  
- **Weight decay**: 0.01  
- **Evaluation strategy**: Per epoch  
- **Monitoring**: All metrics logged via Weights & Biases (W&B)

The model was trained on an NVIDIA A100 GPU (40 GB VRAM) on Google Colab Pro.
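
As a rough sketch, the hyperparameters listed above could be passed to `TrainingArguments` as follows. This is not the author's training script. It assumes the batch size of 16 is per device, that the evaluation batch size matches it, and that no gradient accumulation was used; `output_dir` and the dataset size in the example are hypothetical. The per-epoch evaluation argument has been renamed across `transformers` releases, so the helper checks which name is available.

```python
import inspect
import math

# Hyperparameters as listed in this card.
HPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,  # assumption: 16 is per device
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
}


def build_training_arguments(output_dir="phishing-distilbert"):
    """Build TrainingArguments from HPARAMS (requires `transformers`).

    output_dir is a hypothetical path chosen for this example.
    """
    from transformers import TrainingArguments  # imported lazily

    kwargs = dict(
        HPARAMS,
        output_dir=output_dir,
        per_device_eval_batch_size=16,  # assumption: same as training
        report_to="wandb",  # log metrics to Weights & Biases
    )
    # Older releases call this argument `evaluation_strategy`,
    # newer ones `eval_strategy`; use whichever exists.
    params = inspect.signature(TrainingArguments.__init__).parameters
    key = "eval_strategy" if "eval_strategy" in params else "evaluation_strategy"
    kwargs[key] = "epoch"
    return TrainingArguments(**kwargs)


def total_optimizer_steps(num_examples):
    """Optimizer steps on one device without gradient accumulation."""
    per_epoch = math.ceil(num_examples / HPARAMS["per_device_train_batch_size"])
    return per_epoch * HPARAMS["num_train_epochs"]


# Hypothetical dataset size, for illustration only.
print(total_optimizer_steps(1000))
```

The resulting object would be passed to `Trainer` together with the model, the tokenized datasets, and a metrics function.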

#### Preprocessing

- Duplicate and null record removal  
- Lowercasing and whitespace cleanup  
- Tokenization using `DistilBertTokenizer`  
- Label encoding (0 = legitimate, 1 = phishing)  
- Random Undersampling to ensure class balance  


## 📊 Evaluation Results

For up-to-date results and runs, see the public Weights & Biases project: [Full Report](https://wandb.ai/dahalaamosh-harrisburg-university/Phishing_Detection_DistilBERT_Uncased)

The fine-tuned DistilBERT model was evaluated on a test dataset containing both phishing and legitimate emails. Below is a summary of its performance compared with two baselines that were not fine-tuned (raw DistilBERT and raw BERT):

### 📈 Fine-Tuned DistilBERT (Best Performing)

| Epoch | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 Score | ROC AUC |
|-------|----------------|------------------|----------|-----------|--------|----------|---------|
| 1     | 0.0323         | 0.0243           | 0.9936   | 0.9916    | 0.9961 | 0.9939   | 0.9996  |
| 2     | 0.0083         | 0.0297           | 0.9938   | 0.9968    | 0.9912 | 0.9940   | 0.9998  |
| 3     | 0.0044         | 0.0275           | **0.9951** | **0.9959** | **0.9947** | **0.9953** | **0.9997** |

- **Test Set Summary:**  
  - Accuracy: **96.62%**  
  - Precision: **95.90%**  
  - Recall: **97.46%**  
  - F1 Score: **96.67%**  
  - ROC AUC: **0.9953**
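
For reference, the relationship between these metrics can be written out in a few lines. The sketch below uses the standard definitions of accuracy, precision, recall, and F1; it is not the evaluation code behind this card. The confusion-matrix counts in the second example are hypothetical.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, tn, fn):
    """Compute metrics from confusion-matrix counts (phishing = positive).

    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Consistency check on the reported test-set precision and recall.
reported_f1 = f1_from(0.9590, 0.9746)
print(f"F1 from reported precision/recall: {reported_f1:.2%}")

# Hypothetical counts: a classifier that never predicts "phishing".
print(binary_metrics(tp=0, fp=0, tn=50, fn=50))
```

The first value agrees with the reported F1 of 96.67%. The second call shows how a classifier that labels every email as legitimate scores zero on precision, recall, and F1 while its accuracy simply equals the share of legitimate emails.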

---

### ⚠️ Raw DistilBERT (Not Fine-Tuned)

- Accuracy: 49.57%  
- Precision: 0.00%  
- Recall: 0.00%  
- F1 Score: 0.00%  
- ROC AUC: 0.5694

---

### ⚠️ Raw BERT (Not Fine-Tuned)

- Accuracy: 49.57%  
- Precision: 0.00%  
- Recall: 0.00%  
- F1 Score: 0.00%  
- ROC AUC: 0.4984

---