Latency: how to get the sub-200 ms ultra-low latency mentioned in the description?

#14

by saketfractal - opened 4 days ago

4 days ago

First, thanks for open-sourcing this. I loaded the model on a single A40. A call to model.generate() then took around 9.5 s to produce a 12 s output audio file. The description says the model can achieve sub-200 ms latency. Is that figure for the paid model only, or is it based on streaming generation?
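To make the comparison concrete, here is a rough sketch of the two numbers that tend to get mixed up: the real-time factor of a full generation (9.5 s of compute for 12 s of audio) and the time to first audio chunk, which is what a "sub-200 ms" claim usually refers to in a streaming setup. The helper names and the `fake_stream` generator are hypothetical stand-ins, since I don't know whether the open model exposes a streaming API at all.

```python
import time
from typing import Iterable, Optional


def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Compute time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def time_to_first_chunk(stream: Iterable[bytes]) -> Optional[float]:
    """Seconds until the first chunk arrives from a streaming generator.

    `stream` is any iterable of audio chunks (hypothetical; substitute
    whatever streaming interface the model actually provides, if any).
    Returns None if the stream yields nothing.
    """
    start = time.perf_counter()
    for _ in stream:
        return time.perf_counter() - start
    return None


def fake_stream() -> Iterable[bytes]:
    """Stand-in for a real streaming generator, used only for illustration."""
    time.sleep(0.02)  # pretend the first chunk takes about 20 ms
    yield b"chunk-1"
    yield b"chunk-2"


# Numbers from the post: 9.5 s to generate 12 s of audio.
rtf = real_time_factor(9.5, 12.0)
print(f"real-time factor: {rtf:.2f}")  # prints "real-time factor: 0.79"

first = time_to_first_chunk(fake_stream())
print(f"time to first chunk: {first * 1000:.0f} ms")
```

So the non-streaming run is already slightly faster than real time, but that says nothing about first-chunk latency, which would have to be measured separately.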

kth8

4 days ago

edited 4 days ago

The sub-200 ms latency is for their paid service.
