LLMs&Agents - a RandomHakkaDude Collection

updated 5 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 149
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published Mar 4 • 27
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published 8 days ago • 82
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published 9 days ago • 24
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Paper • 2505.19075 • Published 9 days ago • 21
Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published 6 days ago • 6
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published 8 days ago • 99
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published 7 days ago • 90
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms

Paper • 2505.20322 • Published 11 days ago • 14
VideoGameBench: Can Vision-Language Models complete popular video games?

Paper • 2505.18134 • Published 11 days ago • 6
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published 8 days ago • 6
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published 8 days ago • 43
Flex-Judge: Think Once, Judge Anywhere

Paper • 2505.18601 • Published 10 days ago • 27
Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published 8 days ago • 23
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published 9 days ago • 14
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

Paper • 2505.10887 • Published 18 days ago • 10