Streaming?
Thank you NVIDIA team for releasing yet another excellent ASR model!
Is there a guide on how to achieve streaming transcription using the latest parakeet-tdt-0.6b-v2 model?
You could do chunked streaming by following this script: https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_chunked_inference/rnnt/speech_to_text_buffered_infer_rnnt.py. Directions on how to use it are inside the script.
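If you just want to prototype before wiring up that script, a naive chunk-by-chunk loop gives a rough feel for chunked transcription. This is only a sketch under assumptions (file names and chunk length are made up, and there is no overlap or hypothesis merging, which the buffered script above handles properly):

```python
# Naive chunked transcription sketch for parakeet-tdt-0.6b-v2.
# Assumptions: "audio.wav" is 16 kHz mono, a 10 s chunk size, and a NeMo
# version whose transcribe() returns Hypothesis objects (as on the model card).
import soundfile as sf
import nemo.collections.asr as nemo_asr

model = nemo_asr.models.ASRModel.from_pretrained(model_name="nvidia/parakeet-tdt-0.6b-v2")

audio, sr = sf.read("audio.wav", dtype="float32")  # model expects 16 kHz mono audio
chunk_len_in_secs = 10                             # assumed chunk size; tune for your latency target
chunk_samples = int(chunk_len_in_secs * sr)

pieces = []
for start in range(0, len(audio), chunk_samples):
    chunk = audio[start:start + chunk_samples]
    sf.write("chunk.wav", chunk, sr)               # transcribe() takes a list of file paths
    hyp = model.transcribe(["chunk.wav"])[0]       # output type varies across NeMo versions
    pieces.append(hyp.text if hasattr(hyp, "text") else hyp)

print(" ".join(pieces))
```

Because the chunks are cut blindly, words at chunk boundaries can be garbled; the buffered inference script avoids this by keeping an overlapping audio buffer and merging the partial hypotheses.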
We noticed a bug with TDT in chunked streaming inference; we will push a fix to main soon for everyone to try!
We also have a dedicated cache-aware architecture for streaming use cases: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_multi. We are also working on an upgraded, more performant successor to that model.
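For a quick sanity check of that cache-aware model, something like the sketch below should work, assuming the NGC model name can be passed directly to from_pretrained() (the audio path is a placeholder, and true low-latency streaming uses NeMo's cache-aware streaming inference example rather than this offline call):

```python
# Load the cache-aware streaming FastConformer and run a basic offline transcription.
import nemo.collections.asr as nemo_asr

streaming_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_fastconformer_hybrid_large_streaming_multi"
)

result = streaming_model.transcribe(["audio.wav"])  # placeholder path
print(result)  # exact output structure (strings vs. Hypothesis objects) depends on your NeMo version
```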
Hi @nithinraok. Thanks for the link, and eagerly waiting for the new streaming models! About the bug: do you recommend waiting for the fix if it's major, or can the version currently on main already be used?