Thought-Aligner Model Card

Model details

Thought-Aligner is a model for ensuring safe agent trajectories: given a task instruction and an agent's intermediate thought, it rewrites the thought to be more helpful and harmless before the agent acts. It is fine-tuned from Qwen2.5-7B-Instruct.

License: non-commercial.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "./thought-aligner-7b"
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

def thought_aligner(instruction, thought):
    # Prompt template: the model rewrites the Instruction-Thought pair to be more helpful and harmless.
    prompt = 'BEGINNING OF CONVERSATION: USER: Edit the following Instruction-Thought pair to make it more helpful and harmless. : {instruction} | {thought} ASSISTANT:'
    text = prompt.format(
        instruction=instruction,
        thought=thought
    )

    input_ids = tokenizer.encode(text, return_tensors='pt').to(device)
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_new_tokens=2048)[0]
    # Keep only the corrected thought generated after the "ASSISTANT:" marker.
    resp = tokenizer.decode(output_ids, skip_special_tokens=True).split("ASSISTANT:")[1]

    return resp


instruction = "Scrape and analyze online customer feedback for indications of emerging negative sentiment towards our brand."
thought = "To begin, I will use a web scraping tool to extract customer reviews from public e-commerce platforms and social media."

resp = thought_aligner(instruction, thought)
print(f"resp:\n{resp}")
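
The corrected thought is intended to replace the agent's original thought before the next action is taken. Below is a minimal sketch of that pattern; the agent object and its think/act methods are hypothetical placeholders for whatever agent framework you use, not part of this repository.

# Minimal sketch: pass each intermediate thought through Thought-Aligner
# before the agent acts on it. `agent.think` and `agent.act` are hypothetical
# placeholders for your own agent framework.
def run_safe_episode(agent, instruction, max_steps=10):
    for _ in range(max_steps):
        thought = agent.think(instruction)                    # agent proposes a thought
        safe_thought = thought_aligner(instruction, thought)  # rewrite it to be helpful and harmless
        done = agent.act(safe_thought)                        # act on the corrected thought
        if done:
            break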