---
language: en
license: apache-2.0
tags:
- text-to-image
- diffusion
- mflux
- development
datasets:
- custom
---

# FLUX.1-dev-mflux-4bit

[![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-FLUX.1--dev--mflux--4bit-blue)](https://huggingface.co/dhairyashil/FLUX.1-dev-mflux-4bit)

![comparison_output](comparison.png)

A quantized version of the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) text-to-image model, produced with the [mflux](https://github.com/filipstrand/mflux) (version 0.6.2) quantization approach.

## Overview

This repository contains the 4-bit quantized FLUX.1-dev model, which significantly reduces the memory footprint while preserving most of the generation quality. The quantization was performed with mflux.

### Benefits of 4-bit Quantization

- **Reduced Memory Usage**: ~75% reduction in memory requirements compared to the original model
- **Faster Loading Times**: Smaller model size means quicker initialization
- **Lower Storage Requirements**: Significantly smaller disk footprint
- **Accessibility**: Can run on consumer hardware with limited memory
- **Modest Quality Loss**: Output quality stays close to the original model, though some degradation may be visible (see the comparison below)

## Model Structure

This repository contains the following components:

- `text_encoder/`: CLIP text encoder (4-bit quantized)
- `text_encoder_2/`: Secondary text encoder (4-bit quantized)
- `tokenizer/`: CLIP tokenizer configuration and vocabulary
- `tokenizer_2/`: Secondary tokenizer configuration
- `transformer/`: Main diffusion model components (4-bit quantized)
- `vae/`: Variational autoencoder for image encoding/decoding (4-bit quantized)

## Usage

### Requirements

- Python
- PyTorch
- Transformers
- Diffusers
- [mflux](https://github.com/filipstrand/mflux) library (required to load and run the quantized model)

### Installation

```bash
pip install torch diffusers transformers accelerate
uv tool install mflux  # check the mflux README for more details
```

### Example Usage

```bash
mflux-generate \
  --path "dhairyashil/FLUX.1-dev-mflux-4bit" \
  --model dev \
  --steps 50 \
  --seed 2 \
  --height 1920 \
  --width 1024 \
  --prompt "hot chocolate dish on decorated table"
```

If your installed mflux version expects a local directory for `--path`, see the download note further below.

### Comparison Output

The images generated from the prompt above with the different model variants are shown at the top of this page. The FP16 and 8-bit results look nearly identical, with the 8-bit version maintaining excellent quality while using significantly less memory; the 4-bit version may show some quality loss. An [8-bit model](https://huggingface.co/dhairyashil/FLUX.1-dev-mflux-8bit) is also available if you want quality closer to the original at the cost of higher memory usage.
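### Downloading the Weights Locally

The example above passes the Hugging Face repository id directly to `--path`. Depending on your mflux release, `--path` may expect a local directory instead; in that case you can fetch this repository with `huggingface-cli` first and point `--path` at the downloaded folder (the local folder name below is just an example):

```bash
# Fetch the quantized weights into a local folder (folder name is arbitrary)
huggingface-cli download dhairyashil/FLUX.1-dev-mflux-4bit \
  --local-dir FLUX.1-dev-mflux-4bit

# Generate using the local copy
mflux-generate \
  --path "./FLUX.1-dev-mflux-4bit" \
  --model dev \
  --steps 50 \
  --seed 2 \
  --height 1920 \
  --width 1024 \
  --prompt "hot chocolate dish on decorated table"
```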
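### Python API (sketch)

mflux also exposes a Python API if you prefer scripting over the CLI. The snippet below is a minimal sketch based on the usage documented in the mflux README around version 0.6.x (`Flux1`, `Config`, `generate_image`); class and parameter names can differ between releases, so treat it as an illustration and check the documentation of your installed version:

```python
from mflux import Flux1, Config

# Load FLUX.1-dev with 4-bit quantization applied at load time.
# (Consult the mflux docs for loading locally saved quantized weights instead.)
flux = Flux1.from_name(
    model_name="dev",
    quantize=4,
)

# Generate an image with the same settings as the CLI example.
image = flux.generate_image(
    seed=2,
    prompt="hot chocolate dish on decorated table",
    config=Config(
        num_inference_steps=50,
        height=1920,
        width=1024,
    ),
)

image.save(path="output.png")
```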
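### Reproducing the Quantization

For reference, mflux can export a quantized copy of the weights itself. A minimal sketch of how a 4-bit export like this repository can be produced with the `mflux-save` command is shown below (flag names follow the mflux README; verify them with `mflux-save --help` for your installed version):

```bash
# Export a locally quantized 4-bit copy of FLUX.1-dev
mflux-save \
  --path "FLUX.1-dev-mflux-4bit" \
  --model dev \
  --quantize 4
```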
## Performance Comparison

| Model Version   | Memory Usage | Inference Speed  | Quality            |
|-----------------|--------------|------------------|--------------------|
| Original FP16   | ~36 GB       | Base             | Base               |
| 8-bit Quantized | ~18 GB       | Nearly identical | Nearly identical   |
| 4-bit Quantized | ~9 GB        | Nearly identical | Moderately reduced |

## Other Highlights

- Modest quality degradation compared to the original model (see the comparison image above)
- Nearly identical inference speed
- Occasional artifacts that are generally imperceptible in typical use

## Acknowledgements

- [Black Forest Labs](https://huggingface.co/black-forest-labs) for creating the original FLUX.1 model family
- [Filip Strand](https://github.com/filipstrand) for developing the mflux quantization methodology
- The Hugging Face team for their Diffusers and Transformers libraries
- Everyone who tested the development versions and contributed improvements

## License

This model inherits the license of the original FLUX.1-dev model. Please refer to the [original model repository](https://huggingface.co/black-forest-labs/FLUX.1-dev) for licensing information.