Llama-2-7b Backward Instruction Model

This is a LoRA adapter trained on the OpenAssistant-Guanaco dataset for instruction backtranslation, following the approach of the "Self-Alignment with Instruction Backtranslation" paper.

Model Description

This model is a fine-tuned version of Llama-2-7b using Low-Rank Adaptation (LoRA). Given a response, it generates an instruction that could have prompted that response, effectively performing instruction "backtranslation".

  • Developed by: Qi Zeng, HKUST(GZ)
  • Base Model: Meta-Llama-2-7b
  • Model Type: LoRA adapter for causal language modeling
  • Training Dataset: OpenAssistant-Guanaco
  • Training Format: Fine-tuned on (output, instruction) pairs, where the output is a response and the instruction is the prompt that elicited it (see the sketch below)
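
For illustration, a single training example for this reversed task might be serialized as shown below, reusing the prompt template from the Usage section. The exact training template is an assumption, as it is not documented in this card:

# Hypothetical serialization of one (output, instruction) training pair,
# assuming the same template as the inference prompt shown under Usage
def format_backward_example(output_text: str, instruction_text: str) -> str:
    return f"### Output:\n{output_text}\n\n### Instruction:\n{instruction_text}"

example = format_backward_example(
    "Machine learning is a subfield of artificial intelligence...",
    "Explain what machine learning is.",
)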

Usage

To use this model, you need to combine it with the base Llama-2-7b model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter and apply to base model
model = PeftModel.from_pretrained(base_model, "derain30/llama2-7b-backward-instruction")
model.eval()  # disable dropout for inference

# Example: Generate an instruction from a response
response = """Machine learning is a subfield of artificial intelligence that focuses on developing algorithms and models that enable computers to learn from data without being explicitly programmed. It identifies patterns in data and makes decisions with minimal human intervention."""

prompt = f"### Output:\n{response}\n\n### Instruction:"
# Move inputs to the model's device (device_map="auto" decides placement)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,  # passes input_ids and attention_mask
    max_new_tokens=256,
    do_sample=True,  # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
generated_instruction = generated_text.split("### Instruction:")[1].strip()
print("Generated instruction:", generated_instruction)

Training Details

This model was trained using LoRA with the following configuration; a sketch of the corresponding peft setup follows the list:

  • Rank (r): 8
  • Alpha: 16
  • Target Modules: "q_proj", "v_proj"
  • Dropout: 0.05
  • Training epochs: 3
  • Learning rate: 2e-5
  • Batch size: 8 (with gradient accumulation steps of 4)
  • Optimizer: AdamW
  • Quantization: 4-bit quantization with NF4 type

Limitations

  • This model is designed specifically for generating instructions from outputs, not for general conversation or content generation
  • As a LoRA adapter, it must be used with the Llama-2-7b base model
  • The quality of generated instructions depends on the format and content of the input response