This is the most exciting of this week's releases for me: Gemini Robotics, a SOTA generalist Vision-Language-Action model that brings intelligence to the physical world. It comes with a verifiable real-world-knowledge Embodied Reasoning QA benchmark. The cool part is that the model can be specialized with fast adaptation to new tasks, and such adaptations can transfer to new robot embodiments, like humanoids. Looking forward to the model and data on HF; it's about time I go full physical :) Technical Report: https://storage.googleapis.com/deepmind-media/gemini-robotics/gemini_robotics_report.pdf
Thank you to the Open LLM Leaderboard team for offering it to the community for as long as they did. I only recently joined HF, and it provided a lot of incentive and information to make better models.
Always will remember getting to #112 :D
Anyone have a solid way to test my models privately? Please let me know!
You can apply for yourself or your entire organization. Head over to your account settings for more information, or join from any repository where you see the Xet logo.
Have questions? Join the conversation below or open a discussion on the Xet team page: xet-team/README
Ever wanted 45 minutes with one of AI's most fascinating minds? I got that with @thomwolf at HumanX Vegas. Sharing my notes from his Q&A with the press; it completely changed how I think about AI's future:
1. The next wave of successful AI companies won't be defined by who has the best model but by who builds the most useful real-world solutions. "We all have engines in our cars, but that's rarely the only reason we buy one. We expect it to work well, and that's enough. LLMs will be the same."
2. Big players are pivoting: "Closed-source companies (OpenAI being the first) have largely shifted from LLM announcements to product announcements."
3. Open source is changing everything: "DeepSeek was open source AI's ChatGPT moment. Basically, everyone outside the bubble realized you can get a model for free, and it's just as good as the paid ones."
4. Product innovation is being democratized: Take Manus, for example. They built a product on top of Anthropic's models that's "actually better than Anthropic's own product for now, in terms of agents." This proves that anyone can build great products with existing models.
We're entering a "multi-LLM world," where models are becoming commoditized and all the tools to build are readily available; just look at the flurry of daily new releases on Hugging Face.
Thom's comparison to the internet era is spot-on: "In the beginning you made a lot of money by making websites... but nowadays the huge internet companies are not the companies that built websites. Like Airbnb, Uber, Facebook, they just use the internet as a medium to make something for real life use cases."
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
From BrigitteTousi's post, 5 days ago:
Honored to be named among the 12 pioneers and power players in the news industry in the 2025 Tech Trends Report from Future Today Strategy Group.
Incredible group to be part of - each person is doing groundbreaking work at the intersection of AI and journalism. Worth following them all: they're consistently sharing practical insights on building the future of news.
Take the time to read this report; it's packed with insights, as always. The news & information section's #1 insight hits hard: "The most substantive economic impact of AI to date has been licensing payouts for a handful of big publishers. The competition will start shifting in the year ahead to separate AI 'haves' that have positioned themselves to grow from the 'have-nots.'"
This AI-driven divide is something I've been really concerned about. Now is the time to build more than ever!
I was chatting with @peakji, one of the cofounders of Manus AI, who told me he was on Hugging Face (very cool!).
He shared an interesting insight which is that agentic capabilities might be more of an alignment problem rather than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question' - after all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.
As a thank-you to the community, he shared 100 invite codes, first come, first served; just use "HUGGINGFACE" to get access!