---
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- text-generation
- instruction-tuned
- hallucination-reduction
- transformers
- unsloth
- llama
- fine-tuned
- gguf
- quantized
license: apache-2.0
language:
- en
datasets:
- skshmjn/RAG-INSTRUCT-1.1
pipeline_tag: text-generation
library_name: transformers
---

# 🚀 RAG-Instruct Llama-3.2-3B (Fine-tuned)

- **Developed by:** skshmjn
- **License:** apache-2.0
- **Fine-tuned from model:** [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct)
- **Dataset used:** [skshmjn/RAG-INSTRUCT-1.1](https://huggingface.co/datasets/skshmjn/RAG-INSTRUCT-1.1)
- **Supports:** Transformers & GGUF (for fast inference on CPU and GPU)

---

## 📌 **Model Overview**
This model is fine-tuned on the **RAG-INSTRUCT-1.1** dataset using **Unsloth** to improve retrieval-augmented text generation.
It is optimized for **instruction following** while reducing hallucination, keeping responses factual and concise.

- **Instruction-Tuned**: Follows structured queries effectively.
- **Hallucination Reduction**: Avoids fabricating information when context is missing.
- **Optimized with Unsloth**: Fast inference with GGUF quantization.

---

## 📌 **Example Usage (Transformers)**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skshmjn/Llama-3.2-3B-RAG-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.

Question: Who discovered the first exoplanet?
Context: [No relevant context available]
Answer:"""

inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds only the generated continuation; max_length would
# also count the prompt tokens, which here already exceed 100.
output = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(output[0], skip_special_tokens=True)

print(response)
```
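In practice the `Context:` field is filled by a retriever rather than written by hand. The prompt shown in the example can be assembled from retrieved passages with a small helper; `build_rag_prompt` and its exact formatting are illustrative assumptions, not part of this repository.

```python
# Hypothetical helper (not part of this repo): builds a prompt in the
# same shape as the card's example from a list of retrieved passages.
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    # Fall back to the explicit "no context" marker the model was
    # prompted with above, so it can decline instead of fabricating.
    context_block = "\n".join(contexts) if contexts else "[No relevant context available]"
    return (
        "You are an assistant for question-answering tasks.\n"
        "Use the following pieces of retrieved context to answer the question.\n"
        "If you don't know the answer, just say that you don't know.\n"
        "Use three sentences maximum and keep the answer concise.\n\n"
        f"Question: {question}\n"
        f"Context: {context_block}\n"
        "Answer:"
    )

print(build_rag_prompt("Who discovered the first exoplanet?", []))
```

The empty-context branch is what exercises the model's hallucination-reduction behavior: with no retrieved passages, the expected answer is a refusal rather than an invented fact.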