CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs (paper, arXiv:2505.13778, published May 2025)
From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate (article by muellerzr and 3 others, Jun 13, 2024)
LLM-Drop (collection, 14 items, updated Oct 23, 2024): model weights from the paper "What Matters in Transformers? Not All Attention is Needed" (https://arxiv.org/abs/2406.15786)
What Matters in Transformers? Not All Attention is Needed (paper, arXiv:2406.15786, published Jun 22, 2024)
🪐 SmolLM (collection, 12 items): a series of smol LLMs at 135M, 360M, and 1.7B parameters, released as base and Instruct models along with the training corpus and some WebGPU demos