Upload folder using huggingface_hub

- .gitattributes +1 -0
- Modelfile +11 -0
- README.md +52 -0
- llama3-think-Q4_K_M.gguf +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+llama3-think-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
Modelfile
ADDED
@@ -0,0 +1,11 @@
+FROM llama3-think-Q4_K_M.gguf
+
+# Model parameters
+PARAMETER temperature 0.8
+PARAMETER top_p 0.9
+
+
+# System prompt
+SYSTEM """You are a helpful assistant. You will check the user request and you will think and generate brainstorming and self-thoughts in your mind and respond only in the following format:
+<think> {your thoughts here} </think>
+<answer> {your final answer here} </answer>. Use the tags once and place all your output inside them ONLY"""
README.md
ADDED
@@ -0,0 +1,52 @@
+# Llama3-ThinkQ8
+
+A fine-tuned version of Llama 3 that shows an explicit thinking process using `<think>` and `<answer>` tags. This model is quantized to 4-bit (Q4_K_M) for efficient inference.
+
+## Model Details
+- **Base Model**: Llama 3
+- **Quantization**: 4-bit (Q4_K_M)
+- **Special Feature**: Explicit thinking process with tags
+
+## How to Use with Ollama
+
+### 1. Install Ollama
+If you haven't already installed Ollama, follow the instructions at [ollama.ai](https://ollama.ai).
+
+### 2. Download the model file
+Download the GGUF file (`llama3-think-Q4_K_M.gguf`) from this repository.
+
+### 3. Create the Ollama model
+Create a file named `Modelfile` with this content:
+
+```
+FROM llama3-think-Q4_K_M.gguf
+# Model parameters
+PARAMETER temperature 0.8
+PARAMETER top_p 0.9
+# System prompt
+SYSTEM """You are a helpful assistant. You will check the user request and you will think and generate brainstorming and self-thoughts in your mind and respond only in the following format:
+<think> {your thoughts here} </think>
+<answer> {your final answer here} </answer>. Use the tags once and place all your output inside them ONLY"""
+```
+
+Then run:
+```bash
+ollama create llama3-think -f Modelfile
+```
+
+### 4. Run the model
+```bash
+ollama run llama3-think
+```
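Besides the interactive CLI, Ollama also serves a local HTTP API (by default at `http://localhost:11434`), which is convenient for scripting. A minimal sketch, assuming the model was created under the name `llama3-think` as above and the Ollama server is running locally:

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama3-think"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(prompt):
    """Send a prompt and return the model's full response text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask("Explain the concept of quantum entanglement to a high school student."))
```

With `stream` set to `False`, the server returns one JSON object whose `response` field holds the complete generation, including the `<think>` and `<answer>` tags.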
+
+## Example Prompts
+
+Try these examples:
+
+```
+Using each number in this tensor ONLY once (5, 8, 3) and any arithmetic operation (add, subtract, multiply, divide), create an equation that equals 19.
+```
+
+```
+Explain the concept of quantum entanglement to a high school student.
+```
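Because the system prompt instructs the model to place all output inside the two tags, downstream code can separate the reasoning from the final answer with a small parser. A minimal sketch (the `raw` string below is an illustrative response shape, not actual model output):

```python
import re

def split_think_answer(text):
    """Split a tagged response into its <think> and <answer> sections."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )

# Hypothetical response to the (5, 8, 3) arithmetic prompt above:
raw = ("<think> 5 * 3 + 8 = 23, too high; 8 * 3 - 5 = 19 works </think> "
       "<answer> 8 * 3 - 5 = 19 </answer>")
thoughts, answer = split_think_answer(raw)
print(answer)  # -> 8 * 3 - 5 = 19
```

Returning `None` for a missing tag makes it easy to detect responses where the model drifted from the required format.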
llama3-think-Q4_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b1224a269791bead27018ab5dcbedae6d3cab1b7ca477332ba6bc1043ac5101d
+size 5047968672