ykarout committed
Commit 49872a9 · verified · 1 Parent(s): 42db326

Upload folder using huggingface_hub

Files changed (4)
  1. .gitattributes +1 -0
  2. Modelfile +11 -0
  3. README.md +52 -0
  4. llama3-think-Q4_K_M.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ llama3-think-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
Modelfile ADDED
@@ -0,0 +1,11 @@
+ FROM llama3-think-Q4_K_M.gguf
+
+ # Model parameters
+ PARAMETER temperature 0.8
+ PARAMETER top_p 0.9
+
+
+ # System prompt
+ SYSTEM """You are a helpful assistant. You will check the user request and you will think and generate brainstorming and self-thoughts in your mind and respond only in the following format:
+ <think> {your thoughts here} </think>
+ <answer> {your final answer here} </answer>. Use the tags once and place all your output inside them ONLY"""
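The `FROM` line in the Modelfile expects `llama3-think-Q4_K_M.gguf` to sit next to it on disk. Below is a minimal sketch of fetching the file with `huggingface_hub` (the same library used for this upload, per the commit message); the `repo_id` is a placeholder and must be replaced with this repository's actual id.

```python
# Sketch: download the quantized GGUF into the directory holding the Modelfile.
# NOTE: repo_id is a placeholder -- substitute this repository's actual id.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="<user>/<repo>",              # placeholder repository id
    filename="llama3-think-Q4_K_M.gguf",  # quantized weights (~5 GB)
    local_dir=".",                        # keep the file next to the Modelfile
)
print(local_path)
```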
README.md ADDED
@@ -0,0 +1,52 @@
+ # Llama3-ThinkQ8
+
+ A fine-tuned version of Llama 3 that makes its thinking explicit using `<think>` and `<answer>` tags. This model is quantized to 4-bit (Q4_K_M) for efficient inference.
+
+ ## Model Details
+ - **Base Model**: Llama 3
+ - **Quantization**: 4-bit (Q4_K_M)
+ - **Special Feature**: Explicit thinking process with tags
+
+ ## How to Use with Ollama
+
+ ### 1. Install Ollama
+ If you haven't already installed Ollama, follow the instructions at [ollama.ai](https://ollama.ai).
+
+ ### 2. Download the model file
+ Download the GGUF file (`llama3-think-Q4_K_M.gguf`) from this repository.
+
+ ### 3. Create the Ollama model
+ Create a file named `Modelfile` with this content:
+
+ ```
+ FROM llama3-think-Q4_K_M.gguf
+ # Model parameters
+ PARAMETER temperature 0.8
+ PARAMETER top_p 0.9
+ # System prompt
+ SYSTEM """You are a helpful assistant. You will check the user request and you will think and generate brainstorming and self-thoughts in your mind and respond only in the following format:
+ <think> {your thoughts here} </think>
+ <answer> {your final answer here} </answer>. Use the tags once and place all your output inside them ONLY"""
+ ```
+
+ Then run:
+ ```bash
+ ollama create llama3-think -f Modelfile
+ ```
+
+ ### 4. Run the model
+ ```bash
+ ollama run llama3-think
+ ```
+
+ ## Example Prompts
+
+ Try these examples:
+
+ ```
+ Using each number in this tensor ONLY once (5, 8, 3) and any arithmetic operation like add, subtract, multiply, divide, create an equation that equals 19.
+ ```
+
+ ```
+ Explain the concept of quantum entanglement to a high school student.
+ ```
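Once the model has been created, the example prompts above can also be sent programmatically. The following is a minimal sketch that assumes the model was created as `llama3-think` and that a local Ollama server is listening on its default address (`http://localhost:11434`); it posts a prompt to the `/api/generate` endpoint and splits the reply into its `<think>` and `<answer>` parts.

```python
# Sketch: query the locally created "llama3-think" model through Ollama's
# HTTP API and separate the <think>/<answer> sections of the reply.
# Assumes Ollama is running on its default port (11434).
import re
import requests

PROMPT = "Explain the concept of quantum entanglement to a high school student."

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3-think", "prompt": PROMPT, "stream": False},
    timeout=300,
)
resp.raise_for_status()
text = resp.json()["response"]

# Pull out the tagged sections; fall back gracefully if a tag is missing.
think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)

print("THOUGHTS:", think.group(1).strip() if think else "(none found)")
print("ANSWER:", answer.group(1).strip() if answer else text.strip())
```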
llama3-think-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1224a269791bead27018ab5dcbedae6d3cab1b7ca477332ba6bc1043ac5101d
+ size 5047968672
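The entry above is a Git LFS pointer rather than the weights themselves; after downloading the actual GGUF, the digest and size recorded in the pointer can be used to check the local copy. A minimal sketch in plain Python, assuming the file is in the current directory:

```python
# Sketch: verify a downloaded GGUF against the LFS pointer above.
# The expected digest and size are taken from the pointer file in this commit.
import hashlib

EXPECTED_SHA256 = "b1224a269791bead27018ab5dcbedae6d3cab1b7ca477332ba6bc1043ac5101d"
EXPECTED_SIZE = 5047968672  # bytes

h = hashlib.sha256()
size = 0
with open("llama3-think-Q4_K_M.gguf", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
        h.update(chunk)
        size += len(chunk)

assert size == EXPECTED_SIZE, f"unexpected size: {size}"
assert h.hexdigest() == EXPECTED_SHA256, "checksum mismatch"
print("GGUF file matches the LFS pointer.")
```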