
Vaibhav Srivastav PRO

reach-vb

AI & ML interests

TTS + LM performance prediction

Recent Activity

liked a Space 3 days ago
sesame/csm-1b
liked a model 3 days ago
sesame/csm-1b

Organizations

Hugging Face; Notebooks-explorers; Whisper fine-tuning sprint; Hugging Face Course; Whisper Fine-Tuning Event; Kensho; Mozilla Foundation; PolinaOrg; Coqui.ai; Internal Data & Models for Speech Recognition Event; Speech Recognition Community Event Version 2; onnx; Hugging Test Lab; Internal Data; The Team Ten; Huggingface Projects; EuroPython 2022; Whisper Distillation; BigCode; Hugging Face OSS Metrics; Harmonai's Dance Diffusion Community; EuroSciPy 2022; LaLoka Labs; Core ML Projects; meta-private; Blog-explorers; Music Gen Sprint; Hugging Face for Audio; Hugging Face TB Research; Open ASR Leaderboard; test; MusicGen Internal; TTS Eval (OLD); ZeroGPU Explorers; Editing Audio; ggml.ai; LocalLLaMA; gg-hf; Python Italia; Unofficial Mistral Community; Journalists on Hugging Face; Llzama; finding-nemo; diarizers-community; MLX Community; Cartesia; Hugging Face Assignments; IBM Granite; On-device Squad; TTS AGI; Social Post Explorers; Apple CoreNet Models; LM Studio Community; gg-gguf; hsramall; Lina Speech; Dev Mode Explorers; Sweet Dream(Booth)s; private beta for deeplinks; Paris AI Running Club; gg-tt; Kyutai; OuteAI; Hugging Face Discord Community; LLHF; SLLHF; Ratchet Community; Hugging Quants; lbhf; CoreML Scratchpad; blhf; Meta Llama; kmhf; nltpt; nltpt-q; ai4b-hf; Ollama Tools; Spirit LM; qrias; Audio Collabs; Consumer AI Edge Hackathon (Meta, Hugging Face, Pytorch, Scaleway & Unaite); open/ acc; ExecuTorch Community; wut?; DDUF; AI Starter Pack; None yet; Open R1; LiteRT Community (FKA TFLite); MultiLlasa; gg-hf-g; mshf; fluxions-hf; yoso; hf-private-mlx

reach-vb's activity

reacted to julien-c's post with 🚀🔥 5 days ago
view post
Post
2189
Important notice 🚨

For Inference Providers who have built support for our Billing API (currently: Fal, Novita, HF-Inference, with more coming soon), we've started enabling pay-as-you-go (PAYG).

This means you can use those Inference Providers beyond the free included credits, and the usage is charged to your HF account.

You can see it on this view: any provider that does not have a "Billing disabled" badge is PAYG-compatible.
reacted to PranjaliJoshi's post with ❤️👀 6 days ago
view post
Post
595
🌍 Have you tried Cosmos world foundation models on Hugging Face? Because more updates are coming! 🚀

Cosmos world foundation models (WFMs) are generative pretrained models for synthetic data generation for training AI models for robot or autonomous vehicle development.

🛠️ If you are building generative VLMs or foundation models for physical AI like policy models- there are new updates coming at NVIDIA GTC.

GTC is NVIDIA's biggest annual event (March 17-21) - it will have deep dives, training labs, and researcher-led sessions on Cosmos.

Plus, Jensen Huang's keynote! 🎤

🎟️ 20% off GTC registration → Use code HUGGINGFACE20
🔗 https://www.nvidia.com/gtc/
📍 Happening in person at the San Jose Convention Center and online.
Explore all Cosmos sessions at GTC: https://nvda.ws/41yBkmY

Try the existing Cosmos WFMs:

🔗 Hugging Face models: nvidia/cosmos-6751e884dc10e013a0a0d8e6

🛠️ Post-training scripts: https://github.com/NVIDIA/Cosmos/blob/main/cosmos1/models/POST_TRAINING.md
  • 1 reply
reacted to AdinaY's post with 🚀🔥😎 13 days ago
view post
Post
4001
Exciting releases from the Chinese community this February 🔥
👉 zh-ai-community/2025-february-67a35aaa68e97812def5b6ef

MLLM:
✨ Ovis2 by Alibaba
AIDC-AI/ovis2-67ab36c7e497429034874464
✨ Step Audio Chat by StepFun AI
stepfun-ai/step-audio-67b33accf45735bb21131b0b

Audio:
✨ Step Audio TTS by StepFunAI
stepfun-ai/Step-Audio-TTS-3B
✨ InspireMusic by Alibaba
https://huggingface.co/FunAudioLLM
✨ Baichuan Audio by BaichuanAI
baichuan-inc/Baichuan-Audio-Instruct

Video:
✨ Wan2.1 by Alibaba_Wan
Wan-AI/Wan2.1-T2V-14B
✨ Stepvideo-T2V by StepFun AI
stepfun-ai/stepvideo-t2v
✨ SkyReels-V1 by Skywork
Skywork/skyreels-v1-67b34676ff65b4ec02d16307
✨ LLaDA-8B by RenminUniversity
GSAI-ML/LLaDA-8B-Instruct

MoE:
✨ Moonlight-16B by MoonshotAI (Kimi)
moonshotai/Moonlight-16B-A3B-Instruct

Reasoning:
✨ TinyR1-32B by Qihoo360
qihoo360/TinyR1-32B-Preview

Dataset:
✨ Chinese DeepSeek R1-Distill data -110k
Congliu/Chinese-DeepSeek-R1-Distill-data-110k
replied to lysandre's post 23 days ago
reacted to lysandre's post with 🚀❤️ 23 days ago
view post
Post
5652
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
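
As a rough illustration of "installing from the tag itself", here is a minimal sketch that builds a pip target for one of the tags named above. The repository URL and the `pip_spec_for_tag` helper are assumptions made for this example, not something stated in the post; only the two tag names come from it.

```python
# Sketch only: build a pip requirement that pins transformers to a git tag.
# REPO_URL is an assumed location; the tag names are taken from the post.
import re

REPO_URL = "https://github.com/huggingface/transformers"  # assumption


def pip_spec_for_tag(tag: str) -> str:
    """Return a pip install target of the form git+<repo>@<tag> (hypothetical helper)."""
    # Accept plain versions (v4.49.0) and model-specific tags (v4.49.0-SmolVLM-2).
    if not re.fullmatch(r"v\d+\.\d+\.\d+(-[\w.-]+)?", tag):
        raise ValueError(f"unexpected tag format: {tag!r}")
    return f"git+{REPO_URL}@{tag}"


for tag in ("v4.49.0-SmolVLM-2", "v4.49.0-SigLIP-2"):
    print("pip install", pip_spec_for_tag(tag))
```

Each printed line is a command that could be run in a shell; because the tags keep receiving fixes, reinstalling from the same tag later may pull a newer commit.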
  • 1 reply
replied to Keltezaa's post about 1 month ago
reacted to AdinaY's post with 🔥🚀 about 2 months ago
view post
Post
2662
🔥 So many exciting releases coming from the Chinese community this month!
zh-ai-community/2025-january-6786b054f492fb223591269e

LLMs:
✨ Qwen2.5-1M by Alibaba
Qwen/qwen25-1m-679325716327ec07860530ba
✨ InternLM3-8B-Instruct by Shanghai AI Lab
internlm/internlm3-8b-instruct
✨ MiniMax-Text-01 by MiniMax AI
MiniMaxAI/MiniMax-Text-01
✨ RWKV-7 by BlinkDL -- RNN + Transformer 👀
BlinkDL/rwkv-7-world
✨ DeepSeek-R1 by DeepSeek -- THE ONE 🙌
https://huggingface.co/deepseek-ai
✨ Baichuan-M1-14B by Baichuan - Medical 🩺
baichuan-inc/Baichuan-M1-14B-Base
✨ Qwen2.5-Math-PRM by Alibaba - Math 🔢
Qwen/Qwen2.5-Math-PRM-7B

Code:
✨ Trae by Bytedance
https://trae.ai

TTS:
✨ T2A-01-HD by MiniMax AI
https://hailuo.ai/audio
✨ LLaSA by HKUST Audio
HKUSTAudio/Llasa-3B

MLLM:
✨ Kimi k1.5 by Moonshot AI
https://kimi.ai
✨ MiniCPM-o-2_6 by OpenBMB
openbmb/MiniCPM-o-2_6
✨ Sa2VA-4B by ByteDance
ByteDance/Sa2VA-4B
✨ VideoLLaMA 3 by Alibaba DAMO
DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15
✨ LLaVA-Mini by Chinese Academy of Sciences
ICTNLP/llava-mini-llama-3.1-8b
✨ Hunyuan-7B by Tencent
tencent/Hunyuan-7B-Instruct
✨ Hunyuan 3D 2.0 by Tencent
tencent/Hunyuan3D-2
✨ MiniMax-VL-01 by MiniMax AI - A non transformer based VLM 👀
MiniMaxAI/MiniMax-VL-01

Agent:
✨ UI-TARS by Bytedance
bytedance-research/UI-TARS-7B-SFT
✨ GLM-PC by Zhipu AI
https://cogagent.aminer.cn

Dataset:
✨ Fineweb-Edu-Chinese by Opencsg
opencsg/Fineweb-Edu-Chinese-V2.1
✨ Multimodal_textbook by Alibaba
DAMO-NLP-SG/multimodal_textbook
✨ MME-Finance by Hithink AI
reacted to julien-c's post with 👍 3 months ago
view post
Post
10359
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
replied to julien-c's post 3 months ago
reacted to julien-c's post with 🤗❤️🔥 3 months ago
posted an update 3 months ago
view post
Post
5665
VLMs are going through quite an open revolution, AND in on-device-friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLab w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: https://huggingface.co/jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! 🔥
replied to Duskfallcrew's post 3 months ago
view reply

Hi @nyuuzyou - I'm VB, I work at HF. The team is working around the clock on putting together a setup that works for everyone.

In the meantime, I assure you that your models and datasets are safe and no hard limits are in place. We're working on it!

Your research and work are quite important to the community and to Hugging Face, and always will be.