32b, 4k ctx?

#2 by lucyknada - opened

Is 4k the final context length planned for this model, or is there more in the works?
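
For reference, the shipped limit is visible in the model config. A minimal sketch for checking it (the model id here is an assumption based on this repo):

```python
from transformers import AutoConfig

# Assumed model id for illustration; adjust to the actual repo.
config = AutoConfig.from_pretrained("allenai/OLMo-2-0325-32B-Instruct")

# max_position_embeddings is the trained context window (4096 => 4k ctx).
print(config.max_position_embeddings)
```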

I really like what they did with the whole "fully open source" deal, but the 4k context length is indeed head-scratching. I'd also like to hear a word on this.

[image: screenshot of a tweet saying longer context support is coming "very soon"]

Self-replying, since this was apparently mentioned on Twitter but not here; what "very soon" means is another question.

I can’t serve this model with a context length limited to 4K. A 4K context might be acceptable for smaller models (0.5B or 1B) intended for on-device use cases, but for a 32B model, I need it to support at least a 128K context window to achieve decent performance at 32K.
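
Until a longer-context release lands, the only safe option is to keep inputs inside the 4k window. A minimal sketch of left-truncation so the newest turns survive (model id assumed; a workaround, not an official recipe):

```python
from transformers import AutoTokenizer

# Assumed model id for illustration.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0325-32B-Instruct")

# Truncate from the left so the end of an over-long prompt (the most
# recent turns) is what the model actually sees inside its 4k window.
tokenizer.truncation_side = "left"

long_prompt = "..."  # an over-long conversation history
inputs = tokenizer(long_prompt, truncation=True, max_length=4096,
                   return_tensors="pt")
print(inputs["input_ids"].shape)  # at most (1, 4096)
```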
