Why do the replies occasionally return <|end_im_start|> or <|im_end|>?

#2
by yejk2k - opened

I use SillyTavern, with the chat/completions API provided by llama-server:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080

My chat completion preset is the SillyTavern default; I only changed Temperature to 0.5, Frequency Penalty to 1.05, and Top P to 0.67.
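For reference, a minimal sketch of an equivalent request against llama-server's OpenAI-compatible endpoint with those same sampling settings (the message content here is just a placeholder; SillyTavern builds the real messages array from the chat):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "你好"}],
        "temperature": 0.5,
        "frequency_penalty": 1.05,
        "top_p": 0.67
      }'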

The main prompt is:

You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.\n

\n\nYou must respond in Chinese.
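(For context: <|im_start|> and <|im_end|> are ChatML delimiters. Assuming this model uses a ChatML-style chat template, each formatted turn looks roughly like

<|im_start|>user
你好<|im_end|>
<|im_start|>assistant

and the tokens leaking into the replies are these delimiters appearing as literal text instead of being handled as stop/control tokens.)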

Sometimes it returns garbled text:

[screenshot: image.png]

Fixed: llama.cpp defaults to a 4096-token context size for this model; we can raise it by passing -c 0, which loads the context size from the model itself:

llama-server -m /Users/demo/Downloads/Peach-2.0-9B-8k-Roleplay.Q8_0.gguf  --port 8080 -c 0

-c, --ctx-size N    size of the prompt context (default: 4096, 0 = loaded from model)
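To verify which context size is actually in effect, recent llama-server builds expose a /props endpoint (assumption: your build includes it):

curl http://localhost:8080/props

The response includes default_generation_settings.n_ctx, which should now report the model's context length rather than 4096.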

cool
