Arunkumar Venkataramanan

ArunkumarVR

https://arunkumarramanan.github.io

AI & ML interests

AGI Research: Reasoning, Safety & Alignment (Superalignment), Generative AI (GenAI), Multi-Modal Foundation Models (FMs), Large Language Models (LLMs), Transformers & Diffusion Models, Open LLM Training, Optimization & Finetuning, Serving & Inference

Recent Activity

liked a model 2 days ago

microsoft/Phi-4-reasoning-plus

liked a model 5 days ago

Qwen/Qwen3-235B-A22B

liked a model 10 days ago

nari-labs/Dia-1.6B

ArunkumarVR's activity

upvoted a paper 17 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 17 days ago • 70

upvoted a collection 24 days ago

Llama 4

Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 3 days ago • 44

upvoted a collection 28 days ago

Llama 4

Llama 4 release • 13 items • Updated 5 days ago • 474

upvoted a paper about 1 month ago

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26 • 149

upvoted 3 collections about 2 months ago

Google's Gemma models family

279 items • Updated 16 days ago • 177

Gemma 3 Release

24 items • Updated 16 days ago • 352

QwQ

Qwen with Questions • 6 items • Updated 5 days ago • 94

upvoted a collection 2 months ago

Model Optimizer

A collection of generative models quantized and optimized with TensorRT Model Optimizer. • 17 items • Updated 11 days ago • 19

upvoted a paper 3 months ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 140

upvoted 5 collections 3 months ago

RLHFlow MATH Process Reward Model

This is a collection of datasets and models of process reward modeling. • 15 items • Updated Nov 9, 2024 • 10

Skywork-o1-Open

Skywork o1 open model collections • 3 items • Updated Mar 20 • 20

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 11 items • Updated 5 days ago • 81

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15 • 122

Reasoning Datasets

Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 60

upvoted an article 3 months ago

How to deploy and fine-tune DeepSeek models on AWS

Article • Jan 30 • 52

upvoted a paper 3 months ago

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published Jan 30 • 30

upvoted a collection 3 months ago

IndicBERT v2

IndicBERT v2 is a multilingual BERT model pretrained on IndicCorp v2, an Indic monolingual corpus of 20.9 billion tokens, covering 24 constitutionally • 4 items • Updated Oct 15, 2024 • 4