Why do the replies occasionally return <|end_im_start|> or <|im_end|>?

#2
by yejk2k - opened

I use SillyTavern, with the chat/completions API provided by llama-server:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080

My chat completion preset is the SillyTavern default; I only changed Temperature to 0.5, Frequency Penalty to 1.05, and Top P to 0.67.
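For reference, a minimal sketch of an equivalent request against llama-server's OpenAI-compatible endpoint with those same sampling settings (the message content here is just a placeholder; SillyTavern builds the real messages array from the chat):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "你好"}],
        "temperature": 0.5,
        "frequency_penalty": 1.05,
        "top_p": 0.67
      }'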

The main prompt is:

You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.\n

\n\nYou must respond in Chinese.
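(For context: <|im_start|> and <|im_end|> are ChatML delimiters. Assuming this model uses a ChatML-style chat template, each formatted turn looks roughly like

<|im_start|>user
你好<|im_end|>
<|im_start|>assistant

and the tokens leaking into the replies are these delimiters appearing as literal text instead of being handled as stop/control tokens.)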

Sometimes it returns garbled text:

[screenshot: image.png]

Fixed: llama.cpp defaults to a 4096-token context size for this model; we can raise it by passing -c 0, which loads the context size from the model itself:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080 -c 0

-c, --ctx-size N    size of the prompt context (default: 4096, 0 = loaded from model)
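To verify which context size is actually in effect, recent llama-server builds expose a /props endpoint (assumption: your build includes it):

curl http://localhost:8080/props

The response includes default_generation_settings.n_ctx, which should now report the model's context length rather than 4096.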

cool
