Update README.md
README.md CHANGED
@@ -11,8 +11,16 @@ base_model:
 - Emova-ollm/emova_speech_tokenizer
 ---
 
+<div align="center">
+
+<img src="./examples/images/emova_icon2.png" width="300em"></img>
+
 # EMOVA Speech Tokenizer HF
 
+🤗 [HuggingFace](https://huggingface.co/Emova-ollm/emova_speech_tokenizer_hf) | 💻 [EMOVA-Main-Repo](https://github.com/emova-ollm/EMOVA) | 📄 [EMOVA-Paper](https://arxiv.org/abs/2409.18042) | 🌐 [Project-Page](https://emova-ollm.github.io/)
+
+</div>
+
 ## Model Summary
 
 This repo contains the discrete speech tokenizer used to train the [EMOVA](https://emova-ollm.github.io/) series of models. With a semantic-acoustic disentangled design, it not only facilitates seamless omni-modal alignment among vision, language and audio modalities, but also empowers flexible speech style controls including emotions and pitches. It contains a **speech-to-unit (S2U)** tokenizer to convert speech signals to discrete speech units, and a **unit-to-speech (U2S)** de-tokenizer to reconstruct speech signals from the speech units.