Transformers Streaming Output
Published March 15, 2025
Introduction
With the advancement of AI-driven chatbots, interactive learning has become more engaging. In this blog, we will explore how to build a chatbot with streaming output using Python, Gradio, and a Qwen-based language model.
Prerequisites
Before we start, ensure you have the following installed:
```bash
pip install gradio transformers torch accelerate bitsandbytes
```

accelerate is required for `device_map="auto"`, and bitsandbytes is required to load the pre-quantized 4-bit checkpoint used below.
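To quickly verify the installation (and check whether a GPU is visible), you can run a short sanity check. Nothing here is specific to this tutorial; it only prints whatever versions your environment happens to have:

```python
import torch
import transformers
import gradio

print("transformers:", transformers.__version__)
print("gradio:", gradio.__version__)
# bitsandbytes 4-bit loading generally expects a CUDA GPU
print("CUDA available:", torch.cuda.is_available())
```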
Code Implementation
```python
import gradio as gr  # Gradio for the web chat interface
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer  # Model, tokenizer, and streaming helper
from threading import Thread  # Run generation concurrently so output can stream
import time  # Optional delay to smooth the streamed output

model_name = "unsloth/DeepSeek-R1-Distill-Qwen-1.5B-unsloth-bnb-4bit"  # Model name or local path

# Load the pre-trained model with automatic dtype selection and device mapping
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Load the tokenizer associated with the model
tokenizer = AutoTokenizer.from_pretrained(model_name)


def QwenChat(message, history):
    # Construct the messages list from the system prompt, chat history, and new user message
    messages = [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    ]
    messages.extend(history)  # history already uses {"role": ..., "content": ...} dicts
    messages.append({"role": "user", "content": message})

    # Apply the chat template to format the messages for the model
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    # Set up the streamer that yields decoded tokens as they are generated
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

    # Tokenize the prompt and move it to the model's device
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generation arguments: cap the new tokens and attach the streamer
    generation_args = {
        "max_new_tokens": 512,
        "streamer": streamer,
        **model_inputs,
    }

    # Run generation in a separate thread so this function can stream output while it runs
    thread = Thread(
        target=model.generate,
        kwargs=generation_args,
    )
    thread.start()

    # Accumulate and yield text as tokens arrive from the streamer
    acc_text = ""
    for text_token in streamer:
        time.sleep(0.01)  # Small delay to make the stream feel smoother
        acc_text += text_token
        yield acc_text

    # Ensure the generation thread completes
    thread.join()


# Create a Gradio chat interface backed by QwenChat; type="messages" uses role/content dicts
demo = gr.ChatInterface(fn=QwenChat, type="messages")

# Launch the interface, listening on all network interfaces
demo.launch(server_name="0.0.0.0")
```
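If you want to see the streaming mechanism in isolation, the same `TextIteratorStreamer` pattern works in a plain console script. This is a minimal sketch that reuses the `model` and `tokenizer` loaded above and prints chunks as they arrive instead of yielding them to Gradio:

```python
from threading import Thread
from transformers import TextIteratorStreamer

# Build a one-turn prompt with the model's chat template
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Say hello in three languages."}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# Generate in a background thread and print each decoded chunk immediately
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
thread = Thread(target=model.generate, kwargs={"max_new_tokens": 128, "streamer": streamer, **inputs})
thread.start()

for chunk in streamer:
    print(chunk, end="", flush=True)
thread.join()
print()
```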
Features of This AI Tutor
- Real-time responses: text appears token by token as the model generates it.
- Interactive learning: users can practice conversations with an AI tutor.
- Customizable: modify the system prompt to tailor the teaching style (see the sketch after this list).
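For example, changing the teaching style is a one-line edit to the system message inside `QwenChat`. The prompt text below is just an illustration; substitute whatever persona you want:

```python
# Hypothetical replacement for the system message in QwenChat:
# only the "content" string changes, the rest of the function stays the same.
messages = [
    {
        "role": "system",
        "content": (
            "You are a patient English tutor. Correct the user's grammar, "
            "explain each correction briefly, and keep your answers short."
        ),
    },
]
```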
How It Works
- The user enters a message.
- The system constructs a chat-template prompt that includes the system prompt and the previous conversation (see the sketch after this list).
- The model processes the input in a background thread and streams the response token by token.
- The response appears gradually, simulating a natural conversation.
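To make the second step concrete, here is roughly what the `messages` list looks like after one exchange. The turn contents are invented for illustration; with `type="messages"`, Gradio already stores history as role/content dictionaries, so `QwenChat` can extend it directly:

```python
# Illustrative state of `messages` after one prior exchange
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},                    # from history
    {"role": "assistant", "content": "Hi! How can I help?"},  # from history
    {"role": "user", "content": "Teach me a new word."},      # the new message
]

# apply_chat_template(..., tokenize=False, add_generation_prompt=True) then
# serializes this list into a single prompt string wrapped in the model's
# chat special tokens, ending where the assistant's reply should begin.
```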
Conclusion
This approach offers an **engaging way to learn using AI**. By integrating streaming output, students can experience dynamic, realistic interactions rather than static responses.