32b, 4k ctx?

#2 by lucyknada - opened

Is 4k the final context length planned for this model, or is there more in the works?
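
For reference, the shipped limit is visible in the model config. A minimal sketch for checking it (the model id here is an assumption based on this repo):

```python
from transformers import AutoConfig

# Assumed model id for illustration; adjust to the actual repo.
config = AutoConfig.from_pretrained("allenai/OLMo-2-0325-32B-Instruct")

# max_position_embeddings is the trained context window (4096 => 4k ctx).
print(config.max_position_embeddings)
```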

I really like what they did with the whole "fully open source" deal, but the 4k context length is indeed head-scratching. I'd also like to hear a word on this.

[image: screenshot of a tweet saying longer context support is coming "very soon"]

Self-replying, since this was apparently mentioned on Twitter but not here; what "very soon" means is another question.

I can’t serve this model with a context length limited to 4K. A 4K context might be acceptable for smaller models (0.5B or 1B) intended for on-device use cases, but for a 32B model, I need it to support at least a 128K context window to achieve decent performance at 32K.
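
Until a longer-context release lands, the only safe option is to keep inputs inside the 4k window. A minimal sketch of left-truncation so the newest turns survive (model id assumed; a workaround, not an official recipe):

```python
from transformers import AutoTokenizer

# Assumed model id for illustration.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0325-32B-Instruct")

# Truncate from the left so the end of an over-long prompt (the most
# recent turns) is what the model actually sees inside its 4k window.
tokenizer.truncation_side = "left"

long_prompt = "..."  # an over-long conversation history
inputs = tokenizer(long_prompt, truncation=True, max_length=4096,
                   return_tensors="pt")
print(inputs["input_ids"].shape)  # at most (1, 4096)
```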
