# Variational Autoencoder (VAE) - CelebA Dataset
This repository contains a Variational Autoencoder (VAE) trained on the CelebA dataset. The model encodes and decodes facial images, enabling tasks such as image reconstruction, latent space interpolation, and attribute manipulation.
## Model Details
- Architecture: Variational Autoencoder (VAE)
- Dataset: CelebA
- Latent Dimension: 200
- Training Subset Size: 80,000 images
- Batch Size: 64
- Learning Rate: 1e-3
- Epochs: 10
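
The exact layer configuration is not given in this card. The sketch below shows one plausible convolutional VAE matching the `VAE(latent_dim=...)` interface used in the loading snippet further down; treat the layer sizes, the assumed 64x64 RGB input, and the `encode`/`decode`/`reparameterize` method names as assumptions that the actual `vae_model.py` may not share.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Convolutional VAE for 64x64 RGB images (layer sizes are assumptions)."""
    def __init__(self, latent_dim=200):
        super().__init__()
        # Encoder: (3, 64, 64) -> flattened (256 * 4 * 4) feature vector
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),    # -> 32 x 32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # -> 16 x 16
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # -> 8 x 8
            nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), # -> 4 x 4
            nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(256 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(256 * 4 * 4, latent_dim)
        # Decoder mirrors the encoder
        self.fc_dec = nn.Linear(latent_dim, 256 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (256, 4, 4)),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),  # -> 8 x 8
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # -> 16 x 16
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),    # -> 32 x 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),     # -> 64 x 64
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, the reparameterization trick
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return self.decoder(self.fc_dec(z))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar
```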
## Weights and Biases Run
The training process was tracked using Weights and Biases. You can view the full training logs and metrics here.
## Usage

### Loading the Model

To load the trained model, use the following snippet:
```python
import torch
from vae_model import VAE  # ensure the VAE class is defined in vae_model.py

# Latent dimension used at training time
latent_dim = 200

# Initialize the model
model = VAE(latent_dim=latent_dim)

# Load the trained weights (map_location lets this work without a GPU)
model_path = "./vae_celeba_latent_200_epochs_10_batch_64_subset_80000.pth"
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()
```
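
The card does not state the preprocessing used during training. A common recipe for the aligned 178x218 CelebA images is a center crop followed by a resize to 64x64 with pixels scaled to [0, 1]; the values below are assumptions to adapt to your pipeline, and `face.jpg` is a hypothetical input file.

```python
from PIL import Image
import torchvision.transforms as T

# Assumed preprocessing: center-crop the aligned CelebA image,
# resize to 64x64, and scale pixels to [0, 1] -- match whatever
# the training pipeline actually used.
preprocess = T.Compose([
    T.CenterCrop(178),
    T.Resize(64),
    T.ToTensor(),  # HWC uint8 -> CHW float in [0, 1]
])

img = Image.open("face.jpg").convert("RGB")  # hypothetical input image
x = preprocess(img).unsqueeze(0)             # add batch dim: (1, 3, 64, 64)
```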
## Applications
- Image Reconstruction: Reconstruct input images using the encoder and decoder.
- Latent Space Interpolation: Generate smooth transitions between two images by interpolating in the latent space.
- Attribute Manipulation: Modify specific attributes (e.g., smiling, hair color) by moving along attribute directions in the latent space.
## Example Results

### Reconstruction

Below is a reconstruction example where the first row shows the original images and the second row their reconstructions:
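
A minimal sketch for producing such a grid, assuming the `encode`/`decode` interface sketched above and a preprocessed batch `x` as in the usage section:

```python
import torch
from torchvision.utils import save_image

with torch.no_grad():
    mu, logvar = model.encode(x)  # x: preprocessed batch, e.g. (N, 3, 64, 64)
    recon = model.decode(mu)      # decode the mean for a deterministic reconstruction
    # or: recon, mu, logvar = model(x) to sample via the reparameterization trick

# Row 1: originals, row 2: reconstructions
save_image(torch.cat([x, recon]), "reconstruction.png", nrow=x.size(0))
```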
### Latent Space Interpolation
Below is an example of interpolating between two images in the latent space:
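
A sketch of linear interpolation between the latent means of two preprocessed images `x1` and `x2`, under the same interface assumptions as above. Interpolating the means rather than sampled codes keeps the trajectory deterministic:

```python
import torch

def interpolate(model, x1, x2, steps=8):
    """Linearly interpolate between the latent means of two images."""
    with torch.no_grad():
        mu1, _ = model.encode(x1)
        mu2, _ = model.encode(x2)
        frames = []
        for t in torch.linspace(0, 1, steps):
            z = (1 - t) * mu1 + t * mu2  # linear blend in latent space
            frames.append(model.decode(z))
        return torch.cat(frames)  # (steps, 3, H, W)

# A row of images morphing from x1 to x2:
# save_image(interpolate(model, x1, x2), "interpolation.png", nrow=8)
```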
### Attribute Manipulation
Manipulating the "Smiling" attribute in the latent space:
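
This card does not say how the attribute direction was obtained. One common recipe, sketched here as an assumption, is the difference of mean latent codes between images labeled with and without the attribute in the CelebA annotations:

```python
import torch

def attribute_direction(model, pos_batch, neg_batch):
    """Direction = mean latent of images WITH the attribute minus WITHOUT.

    pos_batch / neg_batch: preprocessed image tensors grouped by a CelebA
    attribute label (e.g. "Smiling"). This recipe is an assumption, not
    necessarily how the card's results were produced.
    """
    with torch.no_grad():
        mu_pos, _ = model.encode(pos_batch)
        mu_neg, _ = model.encode(neg_batch)
    return mu_pos.mean(0) - mu_neg.mean(0)

def manipulate(model, x, direction, strengths=(-2, -1, 0, 1, 2)):
    """Decode the latent mean of x shifted along an attribute direction."""
    with torch.no_grad():
        mu, _ = model.encode(x)
        return torch.cat([model.decode(mu + a * direction) for a in strengths])

# Hypothetical usage, given batches split by the "Smiling" label:
# direction = attribute_direction(model, smiling_batch, not_smiling_batch)
# grid = manipulate(model, x, direction)
```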
## License
This project is licensed under the MIT License. See the LICENSE file for details.