ZeroGPU Explorers
community

AI & ML interests
None defined yet.

Activity Feed (zero-gpu-explorers's activity)

nroggendorff posted an update about 4 hours ago
nroggendorff posted an update 3 days ago

to the nvidia employee that won't respond to my emails: hear me now.

you have made a semi-powerful to irrelevant enemy. you have been warned
  • 4 replies
not-lain posted an update 5 days ago
julien-c posted an update 5 days ago
Important notice 🚨

For Inference Providers who have built support for our Billing API (currently Fal, Novita, and HF-Inference, with more coming soon), we've started enabling Pay-as-you-go (PAYG).

What this means is that you can use those Inference Providers beyond the included free credits, and the overage is charged to your HF account.

You can see it on this view: any provider that does not have a "Billing disabled" badge is PAYG-compatible.
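The mechanics described above (included free credits are consumed first, overage is billed to the account, and nothing is billed for providers showing a "Billing disabled" badge) can be sketched as a toy calculation. The function name, amounts, and return shape below are purely illustrative, not Hugging Face's actual billing logic:

```python
def payg_charge(usage_usd, free_credits_usd, billing_enabled):
    """Toy PAYG split (illustrative only, not HF's real billing code).

    Usage up to the free credits costs nothing; the remainder is charged
    only if the provider is billing-enabled (i.e. PAYG-compatible).
    Returns (covered_by_credits, charged_to_account).
    """
    covered = min(usage_usd, free_credits_usd)
    overage = usage_usd - covered
    if not billing_enabled:
        # Providers with a "Billing disabled" badge: usage beyond the
        # free credits is not billable, so nothing lands on the account.
        overage = 0.0
    return covered, overage
```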
nroggendorff posted an update 12 days ago

nroggendorff posted an update 16 days ago

We're using RLHF on diffusion models, right? Just making sure..
dreamerdeo posted an update 26 days ago
🚀 Excited to share our technical report on the Southeast Asian multilingual model Sailor2 and its latest updates!

Our 49-page report details Sailor2's development journey, including multilingual data cleaning, small-model data-mixture simulations, multi-stage continual pre-training, multi-stage post-training, and multicultural, multilingual evaluations. Sailor2 aims to streamline multilingual model pre-training for the community.

🧭 We highlight Sailor2's impressive performance in low-resource language translation scenarios and its cultural understanding advantages in Southeast Asia, promoting practical applications for regional languages.

Model updates include: 
💡 More precise outputs: Reduced redundancy in model outputs through refined post-training data and optimization techniques. 
🌈 Handling longer texts: Expanded to handle up to 128K context length in Southeast Asian languages through long-text training. 
⚡️ Faster inference: Achieved 2.5x faster inference speed with speculative decoding. 
🌪️ More model sizes: Introduced new sizes of 3B and 14B through model pruning.

🌟 All models are Apache-licensed for commercial use; development tools (code, resources) are open-source.

📚 Technical report: Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs (2502.12982) 
🤖️ Models: sail/sailor2-language-models-674d7c9e6b4dbbd9a869906b 
💬 Demo: sail/Sailor2-20B-Chat 
📣 Sailor2 community: https://huggingface.co/sailor2
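The speculative decoding behind the 2.5x speedup mentioned above follows a draft-then-verify pattern. The sketch below is a toy greedy version with hypothetical lookup-table "models" (not the real Sailor2 models): the output is identical to decoding with the target model alone, while a real system gains speed by verifying the k draft tokens in one parallel pass.

```python
def greedy(model, context):
    """Next token under a toy deterministic 'model' (a dict lookup)."""
    return model.get(tuple(context), 0)

def speculative_decode(target, draft, context, k, n_tokens):
    """Generate n_tokens: the draft proposes k tokens, the target verifies.

    The result matches plain greedy decoding with `target`; the speedup in
    a real system comes from checking the k proposals in parallel.
    """
    out = list(context)
    while len(out) - len(context) < n_tokens:
        # 1) cheap draft model proposes k tokens autoregressively
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = greedy(draft, ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) target model checks the proposals; keep the agreeing prefix
        for t in proposal:
            if greedy(target, out) == t:
                out.append(t)
            else:
                break
        # 3) always emit one token from the target, guaranteeing progress
        out.append(greedy(target, out))
        out = out[: len(context) + n_tokens]  # trim any overshoot
    return out[len(context):]
```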
nroggendorff posted an update 29 days ago

hello, dev mode explorers!
  • 2 replies
fffiloni posted an update 30 days ago
nroggendorff posted an update about 1 month ago
Dearest None-yet Team,

I couldn't help but notice that our productivity has room for improvement. To address this, we will be engaging in a company-wide morale-building activity designed to boost teamwork, enthusiasm, and *most importantly* results.

I know you're all as excited as I am for this fun and absolutely required initiative. Participation is not just encouraged, it's mandatory. Think of it as a team-bonding experience you never signed up for but will absolutely tolerate.

More details to follow, but for now, mark your calendars and prepare for an engaging experience that will definitely make us all better, stronger, and more synchronized, or at least give us something to talk about later.

Looking forward to seeing you all there!

Best,
Me
julien-c in zero-gpu-explorers/README about 1 month ago

Update README.md (#152, opened about 1 month ago by fdaudens)

fdaudens updated a Space about 1 month ago

fdaudens in zero-gpu-explorers/README about 1 month ago

Update README.md (#152, opened about 1 month ago by fdaudens)
IliaLarchenko posted an update about 1 month ago
I am presenting the Decoder-Only Transformer (DOT) Policy, a simple behavioral control policy that outperforms SOTA models on two simple benchmark tasks:

✅ PushT (pushing an object to a goal) – 84% success on keypoints, 74% on images (previous best: 75% / 69%)
✅ ALOHA Insert (precise bimanual insertion) – 30% success (previous best: ~21%)

The best part? DOT is much smaller (sometimes 100x fewer parameters) than previous SOTA models, trains faster, and avoids complexity:
🚫 No generative models (Diffusion, VAE, GANs)
🚫 No discretization/tokenization of actions
🚫 No reinforcement learning or multi-stage training
✅ Just learns from human demos, plain and simple

This is still early — more complex real-life tasks need testing, and no guarantees it will actually work well there, but I think it's interesting to share. Sometimes, simpler approaches can be just as effective (or even better) than complex ones.

🔗 Open-source code and detailed description: https://github.com/IliaLarchenko/dot_policy

Trained models on Hugging Face:
IliaLarchenko/dot_pusht_keypoints
IliaLarchenko/dot_pusht_images
IliaLarchenko/dot_bimanual_insert
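The core of a decoder-only policy like DOT is causal self-attention: each timestep attends only to the current and earlier positions of the sequence. The pure-Python sketch below (single head, no learned projections; names and shapes are illustrative, not the repository's actual code) shows just that masking:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def causal_attention(seq):
    """Single-head self-attention with a causal mask.

    `seq` is a list of feature vectors (queries = keys = values here,
    with no learned projections). Timestep t may only attend to
    positions 0..t, as in a decoder-only transformer.
    """
    d = len(seq[0])
    out = []
    for t, q in enumerate(seq):
        # scores against positions <= t only (the causal mask)
        scores = [sum(qi * ki for qi, ki in zip(q, seq[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        w = softmax(scores)
        # weighted average of the visible (past and current) positions
        out.append([sum(w[s] * seq[s][j] for s in range(t + 1))
                    for j in range(d)])
    return out
```

Because position 0 can only see itself, its output equals its input; later positions mix in progressively more history.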