Social Post Explorers

community
Activity Feed

AI & ML interests

None defined yet.

Recent Activity

social-post-explorers's activity

MonsterMMORPG 
posted an update about 2 hours ago
view post
Post
50
Prepared presets for Wan 2.1 for every model and GPU with modelscope / DiffSynth-Studio - Works with maximum speed as long as you are not using more than 2 GB VRAM - Compared BF16 vs FP8 as well

Our app tutorial main : https://youtu.be/hnAhveNy-8s

2nd tutorial : https://youtu.be/ueMrzmbdWBg

Our App : https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-123105403

Also our App now has fully updated presets for every GPU both for BF16 and FP8 precision
MonsterMMORPG 
posted an update 2 days ago
burtenshaw 
posted an update 3 days ago
view post
Post
1707
The open LLM leaderboard is completed, retired, dead, ‘ascended to a higher plane’. And in its shadow we have an amazing range of leaderboards built and maintained by the community.

In this post, I just want to list some of those great leaderboards that you should bookmark for staying up to date:

- Chatbot Arena LLM Leaderboard is the first port of call for checking out the best model. It’s not the fastest because humans will need to use the models to get scores, but it’s worth the wait. lmarena-ai/chatbot-arena-leaderboard

- OpenVLM Leaderboard is great for getting scores on vision language models opencompass/open_vlm_leaderboard

- Ai2 are doing a great job on RewardBench and I hope they keep it up because reward models are the unsexy workhorse of the field. allenai/reward-bench

- The GAIA leaderboard is great for evaluating agent applications. gaia-benchmark/leaderboard

🤩 This seems like such a sustainable way of building for the long term, where rather than leaning on a single company to evaluate all LLMs, we share the load.
  • 3 replies
·
MonsterMMORPG 
posted an update 3 days ago
burtenshaw 
posted an update 3 days ago
view post
Post
1692
Still speed running Gemma 3 to think. Today I focused on setting up gpu poor hardware to run GRPO.

This is a plain TRL and PEFT notebook which works on mac silicone or colab T4. This uses the 1b variant of Gemma 3 and a reasoning version of GSM8K dataset.

🧑‍🍳 There’s more still in the oven like releasing models, an Unsloth version, and deeper tutorials, but hopefully this should bootstrap your projects.

Here’s a link to the 1b notebook: https://colab.research.google.com/drive/1mwCy5GQb9xJFSuwt2L_We3eKkVbx2qSt?usp=sharing
  • 1 reply
·
burtenshaw 
posted an update 4 days ago
view post
Post
1567
everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. has to be install everything form main and nightly. this is what I'm working with to get unsloth and TRL running

git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft


plus this with --no-deps

git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly


2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6 rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",
    logging_steps = 1,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,
    num_train_epochs = 1,
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,
    report_to = "none",
)


5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it)


if you want an introduction to GRPO, check out the reasoning course, it walks you through the algorithm, theory, and implementation in a smooth way.

https://huggingface.co/reasoning-course
  • 2 replies
·
MonsterMMORPG 
posted an update 4 days ago
view post
Post
1865
I just pushed another amazing update to our Wan 2.1 APP. LoRA loading for 14B Wan 2.1 models were taking over 15 minutes. Optimized to take only few seconds now. Fully supports RTX 5000 series and fully optimized for both VRAM and RAM.

Our APP here : https://www.patreon.com/posts/wan-2-1-ultra-as-123105403

Tutorial 1 : https://youtu.be/hnAhveNy-8s

Tutorial 2 : https://youtu.be/ueMrzmbdWBg

It is also pushed to the original repo you can see pull request here : https://github.com/modelscope/DiffSynth-Studio/pull/442

burtenshaw 
posted an update 4 days ago
view post
Post
1712
Here’s a notebook to make Gemma reason with GRPO & TRL. I made this whilst prepping the next unit of the reasoning course:

In this notebooks I combine together google’s model with some community tooling

- First, I load the model from the Hugging Face hub with transformers’s latest release for Gemma 3
- I use PEFT and bitsandbytes to get it running on Colab
- Then, I took Will Browns processing and reward functions to make reasoning chains from GSM8k
- Finally, I used TRL’s GRPOTrainer to train the model

Next step is to bring Unsloth AI in, then ship it in the reasoning course. Links to notebook below.

https://colab.research.google.com/drive/1Vkl69ytCS3bvOtV9_stRETMthlQXR4wX?usp=sharing
·
thomwolf 
posted an update 4 days ago
view post
Post
2151
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming –a domain Anthropic has been historically really strong at– and it's getting close to o1-mini/R1 on olympiad level coding with just 7B parameters!

And the best part is that we're open-sourcing all about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets are are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
Smooke 
posted an update 5 days ago
view post
Post
1816
Hallucinations Blog Research Reading List:

Hallucinations Are A Feature of AI, Humans Are The Bug https://hackernoon.com/hallucinations-are-a-feature-of-ai-humans-are-the-bug

Overcome LLM Hallucinations Using Knowledge Bases https://hackernoon.com/overcome-llm-hallucinations-using-knowledge-bases

How to Detect and Minimise Hallucinations in AI Models https://hackernoon.com/how-to-detect-and-minimise-hallucinations-in-ai-models

Predictive Coding, AI: Modeling Placebos in RCTs for Psychedelics and Antidepressants https://hackernoon.com/predictive-coding-ai-modeling-placebos-in-rcts-for-psychedelics-and-antidepressants

A Simple Method to Improving the Accuracy of Your RAG System https://hackernoon.com/say-goodbye-to-ai-hallucinations-a-simple-method-to-improving-the-accuracy-of-your-rag-system

Gen AI Hallucinations: The Good, the Bad, and the Costly https://hackernoon.com/gen-ai-hallucinations-the-good-the-bad-and-the-costly

Why Do LLMs Hallucinate? https://hackernoon.com/why-do-llms-hallucinate

Truth Serum For The AI Age: Factiverse To Fight Fake News And Hallucinations https://hackernoon.com/truth-serum-for-the-ai-age-factiverse-to-fight-fake-news-and-hallucinations

A Secret Technique To Sidestepping LLM Hallucinations https://hackernoon.com/a-secret-technique-to-sidestepping-llm-hallucinations

The Importance of Explainability in AI (XAI) https://hackernoon.com/tackling-ai-hallucinations-the-importance-of-explainability-in-ai-xai

What You Need to Know About Amazon Bedrock’s RAG Evaluation and LLM-as-a-Judge for Advancing AI https://hackernoon.com/what-you-need-to-know-about-amazon-bedrocks-rag-evaluation-and-llm-as-a-judge-for-advancing-ai

I Over Relied on AI and Those Shortcuts Cost Me https://hackernoon.com/i-over-relied-on-ai-and-those-shortcuts-cost-me

AI’s Non-Determinism, Hallucinations, And... Cats? https://hackernoon.com/ais-non-determinism-hallucinations-and-cats

More to read --> https://hackernoon.com/search?query=hallucinations

MonsterMMORPG 
posted an update 6 days ago
view post
Post
795
Ultra Advanced Wan 2.1 App Updates & Famous Squish Effect to Generate Squishing Videos Locally : https://youtu.be/ueMrzmbdWBg

Tutorial Link : https://youtu.be/ueMrzmbdWBg

Squish Effect LoRA arrived to Wan 2.1. Wan 2.1 is the truly State of the Art (SOTA) Open Source video generation model that supports Text to Video (T2V), Video to Video (V2V) and Image to Video (I2V). Now our ultra advanced 1-Click Gradio application supports LoRAs and today I will show you all the new developments to our Wan 2.1 all in one video generation Gradio App. We have added so many new features since the original Wan 2.1 step by step tutorial and we continue to improve our App on a daily bases with amazing updates.

If you want to have Squish it: AI Squish Video Art locally for free forever, our app and Squish LoRA and Wan 2.1 is all you need. Watch this tutorial to learn all. Moreover this tutorial will show you majority of the newest features we have implemented with non-stop working for 10 days.

Hopefully many more updates coming soon.
MonsterMMORPG 
posted an update 11 days ago
Smooke 
posted an update 11 days ago
view post
Post
2139
My Favorite AI Blog Posts on HackerNoon RN:

"AI Chatbots Are Getting Too Good at Making You Say ‘Yes’" https://hackernoon.com/ai-chatbots-are-getting-too-good-at-making-you-say-yes

"Text-to-SQL Was Supposed to Be AI’s Killer App. It’s Not." https://hackernoon.com/text-to-sql-was-supposed-to-be-ais-killer-app-its-not

"This AI Model Gives Edge Devices Eyes on the Back of Their Heads" https://hackernoon.com/this-ai-model-gives-edge-devices-eyes-on-the-back-of-their-heads

"Standardizing Dataset Documentation to Improve Machine Learning Outcomes" https://hackernoon.com/standardizing-dataset-documentation-to-improve-machine-learning-outcomes

"The Internet Is Worse Than Ever, But We’re Too Addicted to Leave" https://hackernoon.com/the-internet-is-worse-than-ever-but-were-too-addicted-to-leave

"DeepSeek vs ChatGPT vs Perplexity vs Qwen vs Claude vs DeepMind" https://hackernoon.com/deepseek-vs-chatgpt-vs-perplexity-vs-qwen-vs-claude-vs-deepmind-more-ai-agents-and-new-ai-tools

"Mitigating Framing Bias with Polarity Minimization Loss: Experiments"
https://hackernoon.com/mitigating-framing-bias-with-polarity-minimization-loss-experiments

"AI Is Now Creating Antidotes for Snake Venom" https://hackernoon.com/ai-is-now-creating-antidotes-for-snake-venom

"human carbon consciousness and AI silicon sentience" https://hackernoon.com/so-how-does-one-really-determine-ai-is-conscious

"Why Natural Language Coding Isn’t for Everyone—Yet" https://hackernoon.com/why-natural-language-coding

"What Is a Diffusion LLM and Why Does It Matter?" https://hackernoon.com/what-is-a-diffusion-llm-and-why-does-it-matter

"AI-Augmented Development: Redefining the Role of Product Managers" https://hackernoon.com/ai-augmented-development-redefining-the-role-of-product-managers

And a bonus story from 2018: "20 top lawyers were beaten by legal AI. Here are their surprising responses" https://hackernoon.com/20-top-lawyers-were-beaten-by-legal-ai-here-are-their-surprising-responses-5dafdf25554d
burtenshaw 
posted an update 11 days ago
view post
Post
3590
I’m super excited to work with @mlabonne to build the first practical example in the reasoning course.

🔗 https://huggingface.co/reasoning-course

Here's a quick walk through of the first drop of material that works toward the use case:

- a fundamental introduction to reinforcement learning. Answering questions like, ‘what is a reward?’ and ‘how do we create an environment for a language model?’

- Then it focuses on Deepseek R1 by walking through the paper and highlighting key aspects. This is an old school way to learn ML topics, but it always works.

- Next, it takes to you Transformers Reinforcement Learning and demonstrates potential reward functions you could use. This is cool because it uses Marimo notebooks to visualise the reward.

- Finally, Maxime walks us through a real training notebook that uses GRPO to reduce generation length. I’m really into this because it works and Maxime took the time to validate it share assets and logging from his own runs for you to compare with.

Maxime’s work and notebooks have been a major part of the open source community over the last few years. I, like everyone, have learnt so much from them.