wav2vec-S-Base-ft-960h

wav2vec-S is a streaming variant of wav2vec 2.0, designed to maintain consistent representations between training and inference for streaming speech processing.
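
To make the idea of train/inference consistency concrete, here is a minimal illustrative sketch. It is not the wav2vec-S implementation: it assumes one common streaming technique, a chunk-wise attention mask, and the function name `chunk_attention_mask` and its parameters `chunk_size` and `left_chunks` are hypothetical. If the same mask is applied during training and during chunk-by-chunk inference, each frame sees the same context in both settings.

```python
def chunk_attention_mask(num_frames, chunk_size, left_chunks):
    """Build a boolean mask where mask[i][j] is True if frame i may attend to frame j.

    Hypothetical helper for illustration only. A frame may attend to every frame
    in its own chunk and in up to `left_chunks` preceding chunks, never to
    frames in later chunks.
    """
    mask = []
    for i in range(num_frames):
        chunk_i = i // chunk_size
        first_chunk = max(0, chunk_i - left_chunks)
        row = [first_chunk <= (j // chunk_size) <= chunk_i for j in range(num_frames)]
        mask.append(row)
    return mask


# Six frames, chunks of two frames, one chunk of left context.
mask = chunk_attention_mask(num_frames=6, chunk_size=2, left_chunks=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row marks with `x` the frames that the corresponding frame is allowed to attend to. The paper and the GitHub repository remain the reference for the context configuration the model actually uses.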

wav2vec-S-Base-ft-960h is the Base model after continued pre-training and fine-tuning on the 960-hour LibriSpeech dataset.

See our paper for details and check out the GitHub repository for usage instructions.