CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs Paper • 2505.13778 • Published May 2025
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers Paper • 2410.13184 • Published Oct 17, 2024
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations Paper • 2409.05976 • Published Sep 9, 2024
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping Paper • 2409.15241 • Published Sep 23, 2024
What Matters in Transformers? Not All Attention is Needed Paper • 2406.15786 • Published Jun 22, 2024
FedHyper: A Universal and Robust Learning Rate Scheduler for Federated Learning with Hypergradient Descent Paper • 2310.03156 • Published Oct 4, 2023
Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting Paper • 2402.15070 • Published Feb 23, 2024