inference
How do I run this?
Does the following convention work? (I get empty lines in the output.)
from llama_cpp import Llama

llm = Llama(model_path="./ggml-model-q4_0.gguf", chat_format="llama-2")  # Set chat_format according to the model you are using
chat_history = []  # previous turns, each a {"role": ..., "content": ...} dict
output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant that answers user's questions."},
        *chat_history,
        {"role": "user", "content": "cats and dogs playing"},
    ]
)
assistant_response = output["choices"][0]["message"]["content"]
print(assistant_response)
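The snippet above splices in a `chat_history` that the question never defines. A minimal sketch of how such a history might be maintained across turns — the helper names here are my own assumptions, not part of llama-cpp-python:

```python
# Sketch: keep prior turns as a list of message dicts and splice them into
# each create_chat_completion call. build_messages/record_turn are
# hypothetical helpers, not library functions.

SYSTEM = {"role": "system",
          "content": "You are a helpful assistant that answers user's questions."}

def build_messages(chat_history, user_prompt):
    """Assemble the messages list: system prompt, prior turns, new user turn."""
    return [SYSTEM, *chat_history, {"role": "user", "content": user_prompt}]

def record_turn(chat_history, user_prompt, assistant_response):
    """Append the completed exchange so the next call sees it as context."""
    chat_history.append({"role": "user", "content": user_prompt})
    chat_history.append({"role": "assistant", "content": assistant_response})

history = []
msgs = build_messages(history, "cats and dogs playing")
# msgs would be passed as llm.create_chat_completion(messages=msgs);
# after the call, record_turn(history, prompt, assistant_response)
# keeps the next request's context up to date.
```

If `history` stays empty, `msgs` collapses to just the system and user messages, which is exactly the shape of the call in the original snippet.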
@veeragoni
Set the system prompt content to "You are a helpful assistant."
and the user prompt content to "必须使用英语根据主题描述一张照片,详细描述照片细节:Your prompt here".
The Chinese part means "You must describe a photo in English based on the subject, describing the photo's details in detail" (Google Translate), and it seems to be the key.
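Putting that suggestion together, the messages list for the earlier call would look like this. This only constructs the messages; the `create_chat_completion` call itself is unchanged, and the subject string is my example standing in for the "Your prompt here" placeholder:

```python
# Messages using the suggested system prompt and Chinese prefix.
# `subject` replaces the "Your prompt here" placeholder; the value here
# is an example, not from the original comment.
subject = "cats and dogs playing"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",
     "content": f"必须使用英语根据主题描述一张照片,详细描述照片细节:{subject}"},
]
# These would then be passed as llm.create_chat_completion(messages=messages)
```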