AMD Issue?

#5
by knarp - opened

When I run this GGUF model (or any other GLM-4 GGUF model on HF) in LM Studio on Windows with the Vulkan runtime, it answers every question with 'GGGGGGGGGGGGGGGGG'. The CPU runtime works properly. When I press stop, LM Studio gives this error:

Failed to regenerate message
Unexpected empty grammar stack after accepting piece: G

My PC specs:
CPU: AMD Ryzen AI Max+ 395
GPU: Radeon 8060S (allocated 64 GB VRAM)

When I run this model in Ollama (+ Open WebUI), it runs properly on my GPU.
Maybe Ollama is using ROCm instead of Vulkan?
