Latency: how to get the sub-200 ms ultra-low latency mentioned in the description?

#14
by saketfractal - opened

First, thanks for open-sourcing this. I loaded the model on a single A40, and model.generate() took around 9.5 s to produce a 12 s output audio file. The description mentions that the model can achieve sub-200 ms latency. Is that only for the paid model, or is the figure based on streaming generation?

The sub-200 ms latency is for their paid service.
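
For context on the numbers in the question: timing a full model.generate() call measures total generation time, which is a different thing from a latency figure (often quoted as time to first audio in a streaming setup, though this thread does not confirm how the 200 ms figure was measured). A rough sketch of the arithmetic, using only the figures reported above; the helper name real_time_factor is made up for illustration and is not part of the model's API:

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Compute time divided by audio duration; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


# Figures reported in the question: about 9.5 s to generate about 12 s of audio on one A40.
rtf = real_time_factor(9.5, 12.0)
print(f"real-time factor: {rtf:.2f}")  # prints "real-time factor: 0.79"
```

So the local run is somewhat faster than real time overall, but a non-streaming call still returns nothing until the whole clip is finished, which is why the wall-clock wait is about 9.5 s rather than 200 ms.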
