AI Source Detector (ViT-Base)

Classifies an image as real or as the output of a specific AI image generator.
The five classes are stable_diffusion, midjourney, dalle, real, and other_ai.

Model Details

Architecture: ViT-Base, 16 × 16 patches, 224 × 224 input
Parameters: 86 M
Fine-tuning epochs: 10
Optimizer: AdamW (lr = 3e-5, wd = 0.01)
Hardware: 1× NVIDIA RTX 4090 (24 GB)
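
As a sanity check on the 86 M figure, the parameter count can be reproduced by hand. The sketch below assumes the standard ViT-Base hyperparameters (hidden size 768, 12 layers, MLP size 3072, RGB input) and a single linear head over the five classes; none of these values are stated in this card, so treat them as assumptions.

```python
def vit_param_count(image=224, patch=16, hidden=768, layers=12, mlp=3072,
                    channels=3, num_classes=5):
    """Count parameters of a plain ViT encoder plus a linear classifier head.

    Defaults are the usual ViT-Base values (assumed, not taken from this card).
    """
    tokens = (image // patch) ** 2 + 1                  # image patches + [CLS]
    patch_embed = patch * patch * channels * hidden + hidden
    embeddings = patch_embed + hidden + tokens * hidden  # + [CLS] + positions
    attention = 4 * (hidden * hidden + hidden)          # Q, K, V, output proj.
    feed_forward = hidden * mlp + mlp + mlp * hidden + hidden
    layer_norms = 2 * 2 * hidden                        # two LayerNorms per block
    block = attention + feed_forward + layer_norms
    final_norm = 2 * hidden
    head = hidden * num_classes + num_classes
    total = embeddings + layers * block + final_norm + head
    return total, tokens


total, tokens = vit_param_count()
print(f"{tokens} tokens per image, {total / 1e6:.1f} M parameters")
```

Under these assumptions the encoder sees 197 tokens per image and the total comes to roughly 85.8 M parameters, which rounds to the 86 M listed above.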

Training Data

Class	Images
Stable Diffusion	12 000
Midjourney	10 500
DALL-E 3	9 400
Real	11 800
Other AI	8 200

Total: 51 900 images (≈ 52 k), split 80 % train / 10 % val / 10 % test.
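
To make the split concrete, the sketch below derives per-class subset sizes from the table above. It assumes the 80/10/10 split was applied per class (stratified); the card does not say how the split was drawn, and the helper name `split_sizes` is hypothetical.

```python
# Image counts per class, copied from the Training Data table.
counts = {
    "stable_diffusion": 12_000,
    "midjourney": 10_500,
    "dalle": 9_400,
    "real": 11_800,
    "other_ai": 8_200,
}


def split_sizes(n, train_pct=80, val_pct=10):
    """Return (train, val, test) sizes; test takes whatever remains."""
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return n_train, n_val, n - n_train - n_val


# Assumption: the split is stratified, i.e. applied to each class separately.
per_class = {name: split_sizes(n) for name, n in counts.items()}
totals = tuple(sum(sizes[i] for sizes in per_class.values()) for i in range(3))

for name, sizes in per_class.items():
    print(name, sizes)
print("total", totals)
```

With these counts the totals come to 41 520 training, 5 190 validation, and 5 190 test images.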

Evaluation

Split	Top-1 accuracy	Macro F1
Validation	92.8 %	0.928
Test	91.6 %	0.914
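
For readers unfamiliar with the two metrics, the sketch below shows how top-1 accuracy and macro F1 are computed from a confusion matrix. The 2 × 2 matrix is made-up toy data for illustration only; it is not this model's confusion matrix, and the function names are hypothetical.

```python
def top1_accuracy(matrix):
    """Fraction of samples on the diagonal (rows = true class, cols = predicted)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                       # true k, predicted other
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, true other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy example (hypothetical numbers, not results from this model).
toy = [
    [8, 2],
    [1, 9],
]
acc = top1_accuracy(toy)
f1 = macro_f1(toy)
print(f"top-1 = {acc:.3f}, macro F1 = {f1:.3f}")
```

Macro F1 weights every class equally, so it will drop below top-1 accuracy when a smaller class is classified noticeably worse than the others.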

Usage

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(
    task="image-classification",
    model="yaya36095/ai-source-detector",
    top_k=1,  # return only the highest-scoring class
)

# Accepts a local file path, a URL, or a PIL image.
classifier("demo.jpg")
# → e.g. [{'label': 'stable_diffusion', 'score': 0.97}]
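
If only a real-versus-AI decision is needed, the pipeline output can be collapsed in a few lines. This is a minimal sketch: it assumes the label strings match the five class names listed at the top of this card, and both the helper `summarize` and the 0.5 cut-off are hypothetical choices rather than part of the model.

```python
# Labels treated as AI-generated (assumed to match the model's label names).
AI_LABELS = {"stable_diffusion", "midjourney", "dalle", "other_ai"}


def summarize(predictions, min_score=0.5):
    """Collapse pipeline output (a list of {'label', 'score'} dicts) to one verdict."""
    top = max(predictions, key=lambda p: p["score"])
    if top["score"] < min_score:  # hypothetical confidence cut-off
        return "uncertain"
    return "ai_generated" if top["label"] in AI_LABELS else "real"


# Example input in the same shape the pipeline returns.
print(summarize([{"label": "stable_diffusion", "score": 0.97}]))
```

In practice the list would come straight from `classifier("demo.jpg")`; a higher `min_score` trades coverage for fewer confident mistakes.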