Joya Chen PRO

chenjoya

https://chenjoya.github.io/

AI & ML interests

Video LLM

Recent Activity

upvoted a paper about 16 hours ago

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

liked a model about 17 hours ago

inclusionAI/Ming-Lite-Uni

updated a dataset 5 days ago

chenjoya/Live-CC-5M

chenjoya's activity

upvoted a paper about 16 hours ago

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

Paper • 2505.02707 • Published 1 day ago • 60

upvoted 2 papers 14 days ago

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published 14 days ago • 60

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published 15 days ago • 34

upvoted a collection 14 days ago

LiveCC

Collection

Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) • 8 items • Updated 14 days ago • 4

upvoted a paper about 1 month ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 72

upvoted 4 papers about 2 months ago

upvoted 3 papers 2 months ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published Mar 5 • 16

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published Mar 3 • 44

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published Feb 20 • 42

upvoted 3 papers 3 months ago

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published Feb 12 • 27

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 45

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation

Paper • 2502.01572 • Published Feb 3 • 20

upvoted 5 papers 4 months ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 95

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 44

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 74

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 367

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 54