MMMU

non-profit

https://mmmu-benchmark.github.io/

MMMU-Benchmark

AI & ML interests

Multimodal Model Evaluation

Recent Activity

MMMU's activity

a43992899

authored a paper 4 days ago

Kimi-Audio Technical Report

Paper • 2504.18425 • Published 8 days ago • 13

zhangysk

authored a paper 10 days ago

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Paper • 2504.15415 • Published 11 days ago • 22

zhangysk

authored a paper 16 days ago

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published 17 days ago • 60

zhangysk

authored a paper 24 days ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published 25 days ago • 44

aaabiao

authored a paper 24 days ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published 25 days ago • 44

yuanshengni

in MMMU/MMMU_Pro 25 days ago

Why are the choices a string instead of a list?

#6 opened about 1 month ago by nbalepur

yuexiang96

authored a paper 30 days ago

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published Apr 1 • 40

wenhu

authored a paper 30 days ago

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published Apr 1 • 40

wenhu

authored 2 papers about 1 month ago

Towards Trustworthy GUI Agents: A Survey

Paper • 2503.23434 • Published Mar 30 • 21

MoCha: Towards Movie-Grade Talking Character Synthesis

Paper • 2503.23307 • Published Mar 30 • 131

zhangysk

authored 2 papers about 1 month ago

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

Paper • 2503.18923 • Published Mar 24 • 12

FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

Paper • 2503.13265 • Published Mar 17 • 15

wenhu

authored a paper about 2 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 20

zhangysk

authored a paper about 2 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 20

wren93

authored a paper about 2 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 20

wenhu

authored a paper about 2 months ago

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 65

a43992899

authored 4 papers about 2 months ago

Chinese Open Instruction Generalist: A Preliminary Release

Paper • 2304.07987 • Published Apr 17, 2023 • 2

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Paper • 2306.17103 • Published Jun 29, 2023 • 1

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models

Paper • 2402.13109 • Published Feb 20, 2024

Team members 17
