Please read LICENSE.md before downloading this model.

imprt/izanami-wav2vec2-base

This is a Japanese wav2vec 2.0 Base model pre-trained on 5313 hours of audio extracted by voice activity detection from large-scale Japanese TV broadcast audio data.
This model was trained using code from the official repository.

Usage

import soundfile as sf
from transformers import AutoFeatureExtractor

model = "imprt/izanami-wav2vec2-base"
feature_extractor = AutoFeatureExtractor.from_pretrained(model)

# Path to an audio file sampled at 16 kHz
audio_file = "/path/to/16k_audio_file"
audio_input, sr = sf.read(audio_file)

# Convert the raw waveform into model input features
inputs = feature_extractor(audio_input, sampling_rate=sr)
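
When planning downstream use, it can help to estimate how many frame vectors the model would produce for a clip of a given length. The following is a minimal sketch, not taken from this repository: it assumes this checkpoint keeps the standard wav2vec 2.0 Base convolutional feature encoder described in the referenced paper (seven unpadded 1-D convolutions with kernel sizes 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2). The names CONV_LAYERS and num_output_frames are hypothetical helpers introduced only for illustration; check the checkpoint's own configuration before relying on these numbers.

```python
# Assumed (kernel, stride) layout of the wav2vec 2.0 Base feature encoder,
# as described in the wav2vec 2.0 paper; NOT read from this checkpoint's config.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def num_output_frames(num_samples: int) -> int:
    """Return how many frame vectors the assumed encoder emits for num_samples samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        if length < kernel:
            return 0  # input is too short to pass through this layer
        # Output length of an unpadded 1-D convolution
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    sample_rate = 16000  # the usage example above expects 16 kHz audio
    for seconds in (1, 5, 10):
        frames = num_output_frames(seconds * sample_rate)
        print(f"{seconds:>2} s of audio -> {frames} frames")
```

Under these assumptions, one second of 16 kHz audio maps to 49 frames, and a clip shorter than 400 samples (25 ms) yields no frames at all, so very short segments may need padding or filtering before use.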

References

@inproceedings{NEURIPS2020_92d1e1eb,
    author = {Baevski, Alexei and Zhou, Yuhao and Mohamed, Abdelrahman and Auli, Michael},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
    pages = {12449--12460},
    publisher = {Curran Associates, Inc.},
    title = {wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations},
    url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/92d1e1eb1cd6f9fba3227870bb6d7f07-Paper.pdf},
    volume = {33},
    year = {2020}
}

License / Terms

Read the LICENSE before using this model.
