Xeon PRO

flexonoel

AI & ML interests

None yet

Organizations

None yet

flexonoel's activity

upvoted a paper 3 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 4 days ago • 53

upvoted 2 papers 8 days ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 18 days ago • 79

C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing

Paper • 2504.07964 • Published 10 days ago • 58

upvoted 2 papers 10 days ago

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 12 days ago • 101

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published 12 days ago • 70

upvoted a paper 11 days ago

T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published 13 days ago • 38

upvoted 5 papers 14 days ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 156

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published 17 days ago • 76

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published 26 days ago • 75

TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes

Paper • 2503.23461 • Published 21 days ago • 93

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 20 days ago • 247

upvoted a paper 17 days ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Paper • 2504.00595 • Published 19 days ago • 34

upvoted 2 papers 18 days ago

Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published 20 days ago • 18

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 20 days ago • 61

liked 2 models about 1 month ago

Alibaba-NLP/gte-Qwen2-1.5B-instruct

Sentence Similarity • Updated 27 days ago • 332k • 207

qihoo360/Light-R1-32B

Text Generation • Updated Mar 17 • 1.05k • 83