Llama-2-7b Backward Instruction Model

This is a LoRA adapter trained on the OpenAssistant-Guanaco dataset for instruction backtranslation, following the approach of the "Self-Alignment with Instruction Backtranslation" paper.

Model Description

This model is a fine-tuned version of Llama-2-7b using Low-Rank Adaptation (LoRA). Given a response, it generates an instruction that could have prompted that response, effectively performing instruction "backtranslation".

  • Developed by: Qi Zeng, HKUST(GZ)
  • Base Model: Meta-Llama-2-7b
  • Model Type: LoRA adapter for causal language modeling
  • Training Dataset: OpenAssistant-Guanaco
  • Training Format: Fine-tuned on (output, instruction) pairs, where the output is a response and the instruction is the prompt that elicited it (see the sketch below)
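
For illustration, a single training example for this reversed task might be serialized as shown below, reusing the prompt template from the Usage section. The exact training template is an assumption, as it is not documented in this card:

# Hypothetical serialization of one (output, instruction) training pair,
# assuming the same template as the inference prompt shown under Usage
def format_backward_example(output_text: str, instruction_text: str) -> str:
    return f"### Output:\n{output_text}\n\n### Instruction:\n{instruction_text}"

example = format_backward_example(
    "Machine learning is a subfield of artificial intelligence...",
    "Explain what machine learning is.",
)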

Usage

To use this model, you need to combine it with the base Llama-2-7b model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter and apply to base model
model = PeftModel.from_pretrained(base_model, "derain30/llama2-7b-backward-instruction")
model.eval()  # disable dropout for inference

# Example: Generate an instruction from a response
response = """Machine learning is a subfield of artificial intelligence that focuses on developing algorithms and models that enable computers to learn from data without being explicitly programmed. It identifies patterns in data and makes decisions with minimal human intervention."""

prompt = f"### Output:\n{response}\n\n### Instruction:"
# Move inputs to the model's device (device_map="auto" decides placement)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,  # passes input_ids and attention_mask
    max_new_tokens=256,
    do_sample=True,  # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
generated_instruction = generated_text.split("### Instruction:")[1].strip()
print("Generated instruction:", generated_instruction)

Training Details

This model was trained using LoRA with the following configuration; a sketch of the corresponding peft setup follows the list:

  • Rank (r): 8
  • Alpha: 16
  • Target Modules: "q_proj", "v_proj"
  • Dropout: 0.05
  • Training epochs: 3
  • Learning rate: 2e-5
  • Batch size: 8 (with gradient accumulation steps of 4)
  • Optimizer: AdamW
  • Quantization: 4-bit quantization with NF4 type

Limitations

  • This model is designed specifically for generating instructions from outputs, not for general conversation or content generation
  • As a LoRA adapter, it must be used with the Llama-2-7b base model
  • The quality of generated instructions depends on the format and content of the input response