Phi-4-mini-instruct-GGUF:Q4_K_M fails to load: missing tensor 'output.weight'
ollama run hf.co/unsloth/Phi-4-mini-instruct-GGUF:Q4_K_M
pulling manifest
pulling 88c002299140... 100%
pulling 813f53fdc6e5... 100%
pulling 534cce8916c3... 100%
verifying sha256 digest
writing manifest
success
Error: llama runner process has terminated: error loading model: missing tensor 'output.weight'
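The error means the runner expects a separate output-projection tensor that it cannot find in the file. One way to see what the file actually contains, assuming the gguf package that ships with llama.cpp and the quantized file downloaded locally (the file name below is an assumption):
pip install gguf
gguf-dump Phi-4-mini-instruct-Q4_K_M.gguf | grep -E 'output\.weight|token_embd\.weight'
If only token_embd.weight is listed, the file itself is likely fine: Phi-4-mini ties its output head to the token embeddings, so a loader has to know to reuse token_embd.weight where output.weight would normally be.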
I just tried using Phi-4-mini-instruct-Q4_K_M.gguf with a fresh pull from llama.cpp and that works just fine.
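For comparison, the direct llama.cpp smoke test looks roughly like this (the binary path depends on how you built llama.cpp; the prompt and token count are placeholders):
./build/bin/llama-cli -m Phi-4-mini-instruct-Q4_K_M.gguf -p "Hello" -n 32
If this loads and generates while Ollama fails on the same file, the difference is almost certainly the older llama.cpp build bundled inside Ollama.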
Please update Ollama.
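On Linux, re-running the official install script upgrades an existing installation in place; on Windows, download the newer installer from ollama.com. Then confirm the version:
curl -fsSL https://ollama.com/install.sh | sh
ollama --version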
Using Q2_K_L gives the same error with Ollama 0.5.12, on both Ubuntu and Windows.
Please update Ollama.
After updating to Ollama version 0.5.13, the model now runs without any problems (version 0.5.11 was reporting "error loading model").
OS is Linux Mint 22.1
Python 3.12.3
ollama version is 0.5.7 (installed using pip in a venv)
ollama run hf.co/unsloth/Phi-4-mini-instruct-GGUF:Q4_K_M
results in
Error: llama runner process has terminated: error loading model: missing tensor 'output.weight'
Directly converting the model with llama.cpp and then importing it into Ollama
results in
Error: llama runner process has terminated: error loading model: missing tensor 'output.weight'
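For reference, that convert-and-import flow looks roughly like this (the local model directory, output file name, and model name are assumptions; adjust to your setup):
# clone llama.cpp and install the conversion dependencies
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt
# convert a locally downloaded copy of the HF model (directory name is an assumption)
python llama.cpp/convert_hf_to_gguf.py ./Phi-4-mini-instruct --outfile phi4-mini.gguf
# import the GGUF into Ollama via a minimal Modelfile
printf 'FROM ./phi4-mini.gguf\n' > Modelfile
ollama create phi4-mini -f Modelfile
ollama run phi4-mini
Note that the create/run step still goes through Ollama's bundled llama.cpp runner, which is why a conversion that works in upstream llama.cpp can still fail inside an older Ollama.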
I have yet to try any of this on my Windows side, but I don't feel like I should have to.
Please correct me if I am wrong anywhere here, but it seems the phi4-mini model has been incorrectly posted by the originator.
If this is the case, do we have a timetable for when we might expect a corrected version?
As a side note, "ollama run phi4" works just fine, albeit she's a resource hog on my setup.