---
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- text-generation
- instruction-tuned
- hallucination-reduction
- transformers
- unsloth
- llama
- fine-tuned
- gguf
- quantized
license: apache-2.0
language:
- en
datasets:
- skshmjn/RAG-INSTRUCT-1.1
pipeline_tag: text-generation
library_name: transformers
---

# 🚀 RAG-Instruct Llama-3.2-3B (Fine-tuned)

- **Developed by:** skshmjn
- **License:** apache-2.0
- **Fine-tuned from model:** [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct)
- **Dataset used:** [skshmjn/RAG-INSTRUCT-1.1](https://huggingface.co/datasets/skshmjn/RAG-INSTRUCT-1.1)
- **Supports:** Transformers & GGUF (for fast inference on CPU and GPU)

---

## 📌 **Model Overview**
This model is fine-tuned on the **RAG-INSTRUCT-1.1** dataset using **Unsloth** to improve retrieval-augmented text generation.
It is optimized for **instruction following** while reducing hallucination, keeping responses factual and concise.

- **Instruction-Tuned**: Follows structured queries effectively.
- **Hallucination Reduction**: Avoids fabricating information when context is missing.
- **Optimized with Unsloth**: Fast inference with GGUF quantization.

---

## 📌 **Example Usage (Transformers)**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skshmjn/Llama-3.2-3B-RAG-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.

Question: Who discovered the first exoplanet?
Context: [No relevant context available]
Answer:"""

inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds only the generated continuation; max_length would
# also count the prompt tokens, which here already exceed 100.
output = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(output[0], skip_special_tokens=True)

print(response)
```
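In practice the `Context:` field is filled by a retriever rather than written by hand. The prompt shown in the example can be assembled from retrieved passages with a small helper; `build_rag_prompt` and its exact formatting are illustrative assumptions, not part of this repository.

```python
# Hypothetical helper (not part of this repo): builds a prompt in the
# same shape as the card's example from a list of retrieved passages.
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    # Fall back to the explicit "no context" marker the model was
    # prompted with above, so it can decline instead of fabricating.
    context_block = "\n".join(contexts) if contexts else "[No relevant context available]"
    return (
        "You are an assistant for question-answering tasks.\n"
        "Use the following pieces of retrieved context to answer the question.\n"
        "If you don't know the answer, just say that you don't know.\n"
        "Use three sentences maximum and keep the answer concise.\n\n"
        f"Question: {question}\n"
        f"Context: {context_block}\n"
        "Answer:"
    )

print(build_rag_prompt("Who discovered the first exoplanet?", []))
```

The empty-context branch is what exercises the model's hallucination-reduction behavior: with no retrieved passages, the expected answer is a refusal rather than an invented fact.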