SageAttention2++: A More Efficient Implementation of SageAttention2 Paper • 2505.21136 • Published 7 days ago • 40
Accurate INT8 Training Through Dynamic Block-Level Fallback Paper • 2503.08040 • Published Mar 11 • 1
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Paper • 2505.18875 • Published 10 days ago • 38
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published 18 days ago • 69
Faster Video Diffusion with Trainable Sparse Attention Paper • 2505.13389 • Published 15 days ago • 35
Model Merging in Pre-training of Large Language Models Paper • 2505.12082 • Published 17 days ago • 35
Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published Feb 28 • 8