Just published a tutorial that shows how to properly install ComfyUI and SwarmUI, and how to use the installed ComfyUI as a backend in SwarmUI with maximum performance: out-of-the-box Sage Attention, Flash Attention, RTX 5000 series support, and more. It also covers how to upscale images at maximum quality.
If you want to generate the very best AI videos and images locally on your Windows computer, this is the tutorial you were looking for. It is literally one click to install SwarmUI, the most powerful and advanced generative AI interface (with Flash Attention, Sage Attention, Triton, DeepSpeed, xFormers, and full RTX 5000 series compatibility), and to download the very best AI image and video generation models with an advanced model-downloader Gradio app. SwarmUI uses the famous, highly optimized, and performant ComfyUI as its backend, which makes it the ultimate generative AI tool at the moment, with a vast number of features and constant updates.
Tutorial Important Download Links
🔗 Follow the link below to download the zip file that contains the SwarmUI installer and the AI models downloader Gradio app used in the tutorial ⤵️
🔗 Follow the link below to download the zip file that contains the ComfyUI 1-click installer with Flash Attention, Sage Attention, xFormers, Triton, DeepSpeed, and RTX 5000 series support ⤵️
It's based on Orpheus - but really, the model is irrelevant, as I focus mostly on data augmentation / prep / pipelining - it's just a way to show progress.
It should be able to express itself fine even in an SFW context.
This is probably the last release for a few weeks, as I go back to the data pipeline and improve things there.
In the meantime, please do test it and report problems or enjoyable generations you find - we have a growing Discord community, and I love to see what you get out of this early release!
(A small Colab is provided on the model page if you don't have the GPU to run it yourself.)
Meta has released Llama 4 Scout and Llama 4 Maverick, now available on Hugging Face:
• Llama 4 Scout: 17B active parameters, 16-expert Mixture of Experts (MoE) architecture, 10M-token context window; fits on a single H100 GPU.
• Llama 4 Maverick: 17B active parameters, 128-expert MoE architecture, 1M-token context window; optimized for DGX H100 systems.
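The "17B active parameters" figure is a property of the MoE design: each token is routed through the shared weights plus only a few of the expert blocks, so the compute-time parameter count is far below the total stored count. A toy calculation of this (all layer sizes below are invented for illustration, not Llama 4's real configuration):

```python
# Toy illustration of MoE parameter counting: a token passes through the
# shared (non-expert) weights plus only top_k of the n_experts expert
# blocks, so the "active" count is much smaller than the total.
# All sizes here are made up for illustration.
def moe_param_counts(shared: int, expert: int, n_experts: int, top_k: int):
    """Return (total, active) parameter counts for a simple MoE layout."""
    total = shared + n_experts * expert   # every expert must be stored
    active = shared + top_k * expert      # only routed experts run per token
    return total, active

# e.g. 9B shared params, 7B per expert, 16 experts, 1 expert routed per token
total, active = moe_param_counts(shared=9, expert=7, n_experts=16, top_k=1)
```

This is why a model can need multi-GPU memory to hold its total weights while still running each token at the cost of a much smaller dense model.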
🔥 Key Features:
• Native Multimodality: Seamlessly processes text and images.
• Extended Context Window: Up to 10 million tokens for handling extensive inputs.
• Multilingual Support: Trained on 200 languages, with fine-tuning support for 12, including Arabic, Spanish, and German.
🛠️ Access and Integration:
• Model Checkpoints: Available under the meta-llama organization on the Hugging Face Hub.
• Transformers Compatibility: Fully supported in transformers v4.51.0 for easy loading and fine-tuning.
• Efficient Deployment: Supports tensor-parallelism and automatic device mapping.
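Loading one of these checkpoints with transformers might look like the sketch below. The repository name is an assumption based on the meta-llama Hub organization and the model's described layout (17B active, 16 experts), so check the model card for the exact name; actually running this requires an approved access request on the Hub and an H100-class GPU.

```python
# Sketch of loading Llama 4 Scout via transformers (v4.51.0+). The repo
# name below is an assumption; verify it against the meta-llama model card.
MODEL_ID = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed repo name

def generate_reply(prompt: str, max_new_tokens: int = 64) -> str:
    """Lazily load the model and generate a short text-only reply."""
    import torch
    from transformers import AutoProcessor, AutoModelForCausalLM

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",           # automatic device mapping across GPUs
        torch_dtype=torch.bfloat16,  # reduced precision to fit in memory
    )
    messages = [{"role": "user", "content": prompt}]
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt
    return processor.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

The imports are kept inside the function so the module can be inspected without pulling multi-hundred-GB weights; `device_map="auto"` is the automatic device mapping mentioned above.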
These models offer developers enhanced capabilities for building sophisticated, multimodal AI applications.