StyleTTS 2 - lite

Online Demo

Explore the model on Hugging Face Spaces:
https://huggingface.co/spaces/dangtr0408/StyleTTS2-lite-space

Fine-tune

See the GitHub repository:
https://github.com/dangtr0408/StyleTTS2-lite

Training Details

  1. Base Checkpoint: Initialized from the official StyleTTS 2 weights pre-trained on LibriTTS.
  2. Removed Components: PLBert, Diffusion, Prosodic Encoder, SLM, and Spectral Normalization.
  3. Training Data: LibriTTS corpus.
  4. Training Schedule: Trained for 100,000 steps.

Model Architecture

Component        Parameters
Decoder          54,289,492
Predictor        16,194,612
Style Encoder    13,845,440
Text Encoder      5,612,320
Total            89,941,576
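
To put the table in perspective, the sketch below computes each component's share of the stated total. The counts are copied from the table; the names COMPONENT_PARAMS, STATED_TOTAL, and parameter_shares are illustrative and not part of the repository. Note that the four rows as listed sum to 89,941,864, which is 288 more than the stated total; the source does not explain the gap, so the script simply reports it.

```python
# Parameter counts copied from the table above (illustrative names, not repo code).
COMPONENT_PARAMS = {
    "Decoder": 54_289_492,
    "Predictor": 16_194_612,
    "Style Encoder": 13_845_440,
    "Text Encoder": 5_612_320,
}
STATED_TOTAL = 89_941_576  # "Total" row of the table


def parameter_shares(counts, total):
    """Return each component's percentage of `total`."""
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, share in parameter_shares(COMPONENT_PARAMS, STATED_TOTAL).items():
        print(f"{name:<14}{share:6.2f}%")
    listed_sum = sum(COMPONENT_PARAMS.values())
    print("Sum of listed rows:", listed_sum)
    print("Difference from stated total:", listed_sum - STATED_TOTAL)
```

By this count the decoder accounts for roughly 60% of the parameters.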

Prerequisites

  • Python: Version 3.7 or higher
  • Git: To clone the repository
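
The Python requirement can be checked before installing anything. This is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical and not part of the repository.

```python
import sys

MINIMUM_PYTHON = (3, 7)  # taken from the prerequisites list above


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum(sys.version_info):
        raise SystemExit(
            "Python %d.%d or higher is required, found %s"
            % (MINIMUM_PYTHON[0], MINIMUM_PYTHON[1], sys.version.split()[0])
        )
    print("Python version OK:", sys.version.split()[0])
```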

Installation & Setup

  1. Clone the repository:

git clone https://huggingface.co/dangtr0408/StyleTTS2-lite
cd StyleTTS2-lite

  2. Install dependencies:

pip install -r requirements.txt

  3. On Linux, manually install espeak-ng:

sudo apt-get install espeak-ng
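
After installation, a quick check can confirm that the espeak binary is reachable from Python. This sketch assumes the package installs an executable named espeak-ng (with espeak as a fallback name on some systems); both names and the helper find_first_executable are assumptions for illustration, not repository code.

```python
import shutil

# Assumption: the apt package provides "espeak-ng"; some systems may only expose "espeak".
CANDIDATE_BINARIES = ("espeak-ng", "espeak")


def find_first_executable(names):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    found = find_first_executable(CANDIDATE_BINARIES)
    if found:
        print("espeak found at:", found)
    else:
        print("espeak not found on PATH; install espeak-ng before running the model.")
```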

Usage Example

See the run.ipynb notebook in the repository.

Disclaimer

By using these pre-trained models, you agree to inform listeners that the speech samples are synthesized by the pre-trained models, unless you have permission to use the voice you synthesize. That is, you agree to only use voices whose speakers have granted permission to have their voice cloned, either directly or by license, before making synthesized voices public. If you do not have such permission, you must publicly announce that the voices are synthesized.

License

Code: MIT License
