Genshin_Impact_Scaramouche Text-to-Video Generation

This repository contains the steps and scripts needed to generate videos with the Genshin_Impact_Scaramouche text-to-video model. The model applies LoRA (Low-Rank Adaptation) weights on top of the pre-trained Wan2.1 text-to-video components to produce anime-style videos of the character from text prompts.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

• Ubuntu (or a compatible Linux distribution)
• Python 3.x
• pip (Python package manager)
• Git
• Git LFS (Git Large File Storage)
• FFmpeg
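
As a quick sanity check before installing anything, the prerequisites listed above can be looked up on PATH. This is a minimal sketch, not part of the repository: the helper name missing_tools and the command names (python3, pip, git, git-lfs, ffmpeg) are assumptions about how these tools are installed on a typical Ubuntu system.

```python
import shutil

# Command names assumed for the prerequisites listed above; adjust them if
# your distribution installs the tools under different names.
REQUIRED_TOOLS = ["python3", "pip", "git", "git-lfs", "ffmpeg"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

Anything reported as missing can usually be installed with the apt-get command in the Installation section.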

Installation

Update and Install Dependencies

sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg

Clone the Repository

git clone https://huggingface.co/svjack/Genshin_Impact_Scaramouche_IM_wan_2_1_1_3_B_text2video_lora
cd Genshin_Impact_Scaramouche_IM_wan_2_1_1_3_B_text2video_lora

Install Python Dependencies

pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6

Download Model Weights

wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors
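
Interrupted downloads are a common source of confusing load errors, so it can help to confirm that each file arrived. The sketch below only checks that the files named in the wget commands above exist and are non-empty in a given directory; it does not verify checksums. The helper name missing_weights is hypothetical. Note that the example commands in this README reference only the 1.3B diffusion model, the VAE, and the T5 encoder.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_WEIGHTS = [
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
    "wan2.1_t2v_1.3B_bf16.safetensors",
    "wan2.1_t2v_14B_bf16.safetensors",
]


def missing_weights(directory=".", names=EXPECTED_WEIGHTS):
    """Return the expected weight files that are absent or empty."""
    root = Path(directory)
    missing = []
    for name in names:
        path = root / name
        # A zero-byte file usually means the download never started properly.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_weights():
        print("Missing or empty:", name)
```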

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples of how to generate videos using the Genshin_Impact_Scaramouche model.

Playful Scaramouche in a Whimsical Setting

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character with short, dark blue hair and striking purple eyes. he wear a large, floppy blue hat, and a matching blue cape with intricate white designs. The character is sitting, leaning slightly to the side, with one finger to their lips in a playful gesture. he is dressed in a white shirt and black gloves."

Scaramouche Indulges in Ice Cream with Theatrical Flair

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character with short, dark blue hair and striking purple eyes. he wear a large, floppy blue hat adorned with a white bird, and a matching blue cape with intricate white designs. The character is sitting, leaning slightly to the side, with one finger to their lips in a playful gesture. he dressed in a white shirt and black gloves. With a slow, deliberate motion, he reached for the ice cream, his gloved fingers curling around the cone with practiced ease. He brought it to his lips, taking a small, deliberate bite, his eyes narrowing slightly as the cool sweetness hit his tongue. A faint, satisfied smile tugged at the corners of his mouth, and he licked his lips with a quiet hum of approval. The act of eating ice cream, so simple and ordinary, became something almost theatrical in his hands—a performance of indulgence and delight. The background is a lush, green landscape with a soft, dreamy quality. The overall tone is whimsical and serene."

Scaramouche Savoring Ice Cream in a Dreamy Landscape

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character with short, dark blue hair and striking purple eyes. The character is sitting, leaning slightly to the side,  he reached for the ice cream, His gloved hand held the ice cream cone effortlessly as he took a small, deliberate bite. A faint smile appeared as he savored the cool sweetness, turning the simple act into a quiet, indulgent moment. The lush, dreamy green landscape added a whimsical, serene tone."

Scaramouche Reaches for a Burger in a Relaxed Pose

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors  \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character in purple with short, dark blue hair and striking purple eyes. Sitting and leaning slightly to the side, he reached for the burger. His gloved hand held it firmly as he took a bite."

Elegant Scaramouche with a Bouquet of Roses

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors  \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character with short, messy dark purple hair and large, expressive purple eyes. he is wearing a white casual shirt and a black beret adorned with small, round, purple eyes and cat ears, giving a playful, cat-like appearance. He reached for a bouquet of roses. His hand gently grasped the stems, the motion elegant and deliberate."

Urban Scaramouche in a School Uniform

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 20 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Scaramouche_Storyline_IM_outputs/Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors  \
--lora_multiplier 1.0 \
--prompt "In the style of Scaramouche , a young character with short, dark blue hair, wearing a school uniform with a blue and white plaid vest, a teal tie, and a black beret. He holds a purple and green drink with a straw in his left hand. His expression is neutral. He has a backpack with a blue and white plaid pattern and a decorative badge. The background shows a yellow and white storefront with graffiti. The character is standing in front of the store, with a modern urban setting."
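
The example commands above differ only in their --prompt value. A small wrapper can keep the shared flags in one place, as sketched below. The flags and values are copied from the examples; the names BASE_ARGS and build_command are hypothetical and not part of the repository.

```python
import shlex

# Flags and values copied from the example commands above.
BASE_ARGS = [
    "python", "wan_generate_video.py", "--fp8",
    "--task", "t2v-1.3B",
    "--video_size", "480", "832",
    "--video_length", "81",
    "--infer_steps", "20",
    "--save_path", "save",
    "--output_type", "both",
    "--dit", "wan2.1_t2v_1.3B_bf16.safetensors",
    "--vae", "Wan2.1_VAE.pth",
    "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
    "--attn_mode", "torch",
    "--lora_weight",
    "Scaramouche_Storyline_IM_outputs/"
    "Scaramouche_Storyline_IM_w1_3_lora-000300.safetensors",
    "--lora_multiplier", "1.0",
]


def build_command(prompt):
    """Return the full argument list for one generation run."""
    return BASE_ARGS + ["--prompt", prompt]


if __name__ == "__main__":
    cmd = build_command(
        "In the style of Scaramouche , a young character in purple with "
        "short, dark blue hair and striking purple eyes. Sitting and leaning "
        "slightly to the side, he reached for the burger. His gloved hand "
        "held it firmly as he took a bite."
    )
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(cmd))
```

To launch the run from Python rather than copying the printed command, the list can be passed to subprocess.run(cmd, check=True). Passing a list avoids shell quoting problems with long prompts.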

Parameters

--fp8: Enable FP8 precision (optional).
--task: Specify the task (e.g., t2v-1.3B).
--video_size: Set the resolution of the generated video as two integers (e.g., 480 832, as in the examples above).
--video_length: Define the length of the video in frames (e.g., 81).
--infer_steps: Number of inference steps.
--save_path: Directory to save the generated video.
--output_type: Output type (e.g., both for video and frames).
--dit: Path to the diffusion model weights.
--vae: Path to the VAE model weights.
--t5: Path to the T5 model weights.
--attn_mode: Attention mode (e.g., torch).
--lora_weight: Path to the LoRA weights.
--lora_multiplier: Multiplier for LoRA weights.
--prompt: Textual prompt for video generation.
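
Because --video_length is given in frames, the playback duration depends on the frame rate of the saved video. The sketch below converts between the two under two assumptions that this README does not state: a frame rate of 16 fps, and frame counts of the form 4n + 1 (which 81 satisfies). Check both against the script's own help output before relying on them.

```python
ASSUMED_FPS = 16  # assumption: the frame rate is not stated in this README


def duration_seconds(video_length, fps=ASSUMED_FPS):
    """Return the playback duration of `video_length` frames at `fps`."""
    return video_length / fps


def is_4n_plus_1(video_length):
    """Return True if the frame count has the assumed form 4n + 1."""
    return video_length >= 1 and (video_length - 1) % 4 == 0


if __name__ == "__main__":
    print(duration_seconds(81))  # 5.0625 seconds at the assumed 16 fps
    print(is_4n_plus_1(81))      # True, since 81 = 4 * 20 + 1
```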

Output

The generated video and frames will be saved in the specified save_path directory.
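
When several runs write to the same directory, the most recent results can be found by modification time. This is a minimal sketch that makes no assumption about how the script names its output files; the helper name newest_files is hypothetical, and the default directory save matches the --save_path value used in the examples.

```python
from pathlib import Path


def newest_files(save_path="save", limit=5):
    """Return up to `limit` files in `save_path`, newest first."""
    root = Path(save_path)
    if not root.is_dir():
        # Nothing has been generated yet, or the path is wrong.
        return []
    files = [p for p in root.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in newest_files():
        print(path)
```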

Troubleshooting

• Ensure all dependencies are correctly installed.
• Verify that the model weights are downloaded and placed in the correct locations.
• Check for any missing Python packages and install them using pip.
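
The check for missing Python packages can be done without importing them, which avoids slow or failing imports. The sketch below looks up each package with importlib. The import names are assumptions derived from the pip commands in the Installation section (for example, ascii-magic is assumed to import as ascii_magic), and packages listed only in requirements.txt are not covered.

```python
import importlib.util

# Import names assumed from the pip install commands in this README.
ASSUMED_IMPORT_NAMES = [
    "torch", "torchvision", "ascii_magic", "matplotlib", "tensorboard",
    "huggingface_hub", "datasets", "moviepy", "sageattention",
]


def missing_packages(names=ASSUMED_IMPORT_NAMES):
    """Return the top-level package names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```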

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

• Hugging Face for hosting the model weights.
• Wan-AI for providing the pre-trained models.
• DeepBeepMeep for contributing to the model weights.

Contact

For any questions or issues, please open an issue on the repository or contact the maintainer.