Haihao Shen

Haihao

https://github.com/intel/auto-round

AI & ML interests

LLM quantization, sparsity, and acceleration

Recent Activity

upvoted an article about 20 hours ago

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

published an article 1 day ago

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

liked a model about 1 month ago

OPEA/gemma-3-27b-it-int4-AutoRound

Haihao's activity

upvoted an article about 20 hours ago

Article

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

and 8 others • 1 day ago • 11

published an article 1 day ago

Article

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

and 8 others • 1 day ago • 11

liked 2 models about 1 month ago

OPEA/gemma-3-27b-it-int4-AutoRound

Updated about 6 hours ago • 41 • 2

OPEA/DeepSeek-R1-int4-AutoRound-awq-asym

Updated about 5 hours ago • 80 • 2

upvoted a paper about 1 month ago

Faster Inference of LLMs using FP8 on the Intel Gaudi

Paper • 2503.09975 • Published Mar 13 • 1

authored a paper about 1 month ago

Faster Inference of LLMs using FP8 on the Intel Gaudi

Paper • 2503.09975 • Published Mar 13 • 1

upvoted a collection 2 months ago

DeepSeek

Collection

15 items • Updated Mar 19 • 3

liked a model 2 months ago

OPEA/DeepSeek-R1-fp8-static-w8a8-inc

Updated Feb 27 • 19 • 2

liked 5 models 3 months ago

upvoted a paper 3 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 66

liked a model 4 months ago

OPEA/DeepSeek-V3-int4-sym-gptq-inc

Updated about 5 hours ago • 153 • 17

reacted to wenhuach's post with 🚀 4 months ago

Post

345

This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

OPEA

3 replies

authored a paper 5 months ago

A dynamic parallel method for performance optimization on hybrid CPUs

Paper • 2411.19542 • Published Nov 29, 2024 • 5

upvoted a paper 5 months ago

A dynamic parallel method for performance optimization on hybrid CPUs

Paper • 2411.19542 • Published Nov 29, 2024 • 5

commented on a paper 5 months ago

A dynamic parallel method for performance optimization on hybrid CPUs

Paper • 2411.19542 • Published Nov 29, 2024 • 5

liked a model 5 months ago

OPEA/Meta-Llama-3.1-70B-Instruct-int4-asym-inc

Updated about 5 hours ago • 21 • 1