Idea Transformer: Infinity
Idea Transformer: Infinity is an innovative tool that unlocks infinite creativity by generating unique transformation ideas and design images from up to three keywords and a chosen category. Leveraging a state-of-the-art diffusion pipeline, real-time translation, and a powerful LLM, it delivers fresh ideas every time.
openfree/Idea-Transformer
Key Features
Diverse Ideas:
Randomly selects creative variations from your keywords and category; the possibilities are nearly endless!
Unique Design Images:
Your text prompt produces striking, varied design images via the diffusion model.
Real-Time Translation & Expansion:
Korean inputs are automatically translated and enriched using an advanced LLM for high-quality output.
Dual-Language Support:
Enjoy an intuitive Gradio interface with separate English and Korean tabs for a global audience.
Explore a Wide Range of Categories:
Sensor Functions: Creative changes in sensor technologies.
Size & Shape Change: Ideas altering physical dimensions and forms.
Surface & Appearance Change: Transformations in color, texture, and visual effects.
Material State Change: Transitions between different material states.
Movement Characteristics Change: Innovations in motion, speed, and vibration.
Structural Change: Reconfigurations via assembly/disassembly and design modifications.
Spatial Movement: Ideas on repositioning and directional shifts.
Time-Related Change: Concepts influenced by aging, wear, and lifecycle.
Light & Visual Effects: Alterations in illumination, transparency, and holographic effects.
Sound & Vibration Effects: Innovations in auditory and vibrational dynamics.
Business Ideas: Strategies for market redefinition, business model innovation, and more.
Why Choose Idea Transformer?
Infinite Creativity & Cutting-Edge Technology: Your keywords and randomized transformations produce an endless stream of unique ideas!
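The keyword-plus-category mechanic described above can be approximated in a few lines: pick a random transformation from the chosen category's pool and attach it to the keywords. Everything below (the category table, variation strings, and function name) is a hypothetical illustration, not the Space's actual code.

```python
import random

# Hypothetical sketch, not the openfree/Idea-Transformer implementation:
# each category maps to a pool of candidate transformations.
CATEGORY_VARIATIONS = {
    "Size & Shape Change": ["expands tenfold", "folds completely flat", "morphs into a sphere"],
    "Light & Visual Effects": ["turns transparent", "glows from within", "projects a hologram"],
}

def generate_idea(keywords, category, rng=random):
    """Combine up to three keywords with a random variation from the category."""
    if not 1 <= len(keywords) <= 3:
        raise ValueError("provide one to three keywords")
    variation = rng.choice(CATEGORY_VARIATIONS[category])
    return f"A {' '.join(keywords)} that {variation}."

idea = generate_idea(["solar", "backpack"], "Size & Shape Change")
```

The resulting prompt string would then be handed to the diffusion pipeline to render the design image.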

KaiChen1998
posted an update
about 15 hours ago
Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!
EMOVA is a novel end-to-end omni-modal LLM that can see, hear, and speak. Given omni-modal (i.e., textual, visual, and speech) inputs, EMOVA can generate both textual and speech responses with vivid emotional control by utilizing the speech decoder and a style controller.
EMOVA Highlights
State-of-the-art omni-modality: EMOVA achieves results comparable to the state of the art on both vision-language and speech benchmarks simultaneously.
Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3)!
Modular design: we integrate multiple implementations of the vision encoder, vision projector, and language model.
You are all welcome to try and star!
- Project page: https://emova-ollm.github.io/
- Github: https://github.com/emova-ollm/EMOVA
- Demo: Emova-ollm/EMOVA-demo
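The modular design highlighted above (swappable vision encoders, projectors, and language models) is commonly implemented with a component registry. The sketch below illustrates that generic pattern only; it is not EMOVA's actual code, and all class and function names are hypothetical.

```python
# Generic registry pattern for swappable components (hypothetical illustration).
ENCODERS = {}

def register_encoder(name):
    """Class decorator that records an encoder implementation under a name."""
    def wrap(cls):
        ENCODERS[name] = cls
        return cls
    return wrap

@register_encoder("clip")
class ClipEncoder:
    def encode(self, image):
        return f"clip-features({image})"

@register_encoder("siglip")
class SiglipEncoder:
    def encode(self, image):
        return f"siglip-features({image})"

def build_encoder(name):
    # A config file would typically supply `name`, so swapping
    # implementations is a one-line config change.
    return ENCODERS[name]()

features = build_encoder("siglip").encode("img.png")
```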

nroggendorff
posted an update
2 days ago
This is the most exciting of this week's releases for me: Gemini Robotics, a SOTA generalist Vision-Language-Action model that brings intelligence to the physical world. It comes with a verifiable real-world-knowledge Embodied Reasoning QA benchmark. The cool part is that the model can be specialized with fast adaptation to new tasks, and such adaptations can be transferred to new robot embodiments like humanoids. Looking forward to the model and data on HF; it's about time I go full physical :)
Technical Report: https://storage.googleapis.com/deepmind-media/gemini-robotics/gemini_robotics_report.pdf
Hello community,
I want to share my work on creating a reasoning Mamba model.
I used GRPO over Falcon3 Mamba Instruct to make this model. It generates blazing-fast responses while building sound logic to answer challenging questions.
Give it a try:
Model repo: hanzla/Falcon3-Mamba-R1-v0
Space: hanzla/Falcon3MambaReasoner
Looking forward to community feedback.

DualityAI-RebekahBogdanoff
posted an update
1 day ago
Think building custom digital twins for AI training is hard? Let us show you how to make it easy!
Next week, Duality AI is offering a free "Creating Your Own 4-Wheeled Vehicle Digital Twins for AI Training with Falcon Editor" live class.
Sign up here: https://forms.gle/2U5xugMjvSkZdeaR8
What we'll cover:
Import & Configure a rigged 4-wheeled vehicle and transform it into a controllable system twin using Blueprints.
Enable Dynamic Control by exposing Python variables for real-time adjustments.
Attach Sensors to capture valuable simulation data.
Assemble & Run a Simulation Scenario to generate training data for AI & robotics applications.
See how Falcon creates synthetic data for faster, easier, and more targeted AI training by creating a FREE account here: https://www.duality.ai/edu
I've published an article showing five ways to use Langfuse with Hugging Face.
My personal favorite is Method #4: Using Hugging Face Datasets for Langfuse Dataset Experiments. This lets you benchmark your LLM app or AI agent with a dataset hosted on Hugging Face. In this example, I chose the GSM8K dataset (openai/gsm8k) to test the mathematical reasoning capabilities of my smolagent :)
Link to the Article here on HF: https://huggingface.co/blog/MJannik/hugging-face-and-langfuse
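For context on scoring such an experiment: GSM8K stores its gold answer after a `####` marker, so a minimal evaluator only needs to compare final numbers. Below is a hedged sketch of that scoring step; the function names are mine, not Langfuse's or the article's API.

```python
# Minimal GSM8K-style scorer (illustrative; not a Langfuse API).
def extract_final_answer(text):
    """Return the number after the last '####' marker, or None if absent."""
    if "####" not in text:
        return None
    return text.rsplit("####", 1)[1].strip().replace(",", "")

def exact_match(model_output, reference):
    """Score 1 if the model's final number equals the dataset's gold answer."""
    return extract_final_answer(model_output) == extract_final_answer(reference)

score = exact_match("... so the total is #### 42", "She sells 6 eggs ... #### 42")
```

In a dataset experiment, a function like `exact_match` would run once per dataset item and the per-item scores would be logged alongside the traces.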

burtenshaw
posted an update
2 days ago
Still speed-running Gemma 3 to think. Today I focused on setting up GPU-poor hardware to run GRPO.
This is a plain TRL and PEFT notebook that works on Apple silicon Macs or a Colab T4. It uses the 1B variant of Gemma 3 and a reasoning version of the GSM8K dataset.
There's more still in the oven, like releasing models, an Unsloth version, and deeper tutorials, but hopefully this should bootstrap your projects.
Here's a link to the 1B notebook: https://colab.research.google.com/drive/1mwCy5GQb9xJFSuwt2L_We3eKkVbx2qSt?usp=sharing
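As a rough illustration of what GRPO optimizes in a setup like this, here is a toy reward function for verifiable math answers. It assumes a completion format where the final answer follows an `Answer:` tag; this is a hypothetical sketch, not the notebook's actual reward.

```python
# Toy GRPO-style reward for verifiable answers (illustrative assumption:
# completions end with "Answer: <value>").
def reward_fn(completion, reference):
    """1.0 for a correct final answer, plus 0.1 if any reasoning precedes it."""
    if "Answer:" not in completion:
        return 0.0
    reasoning, answer = completion.rsplit("Answer:", 1)
    correctness = 1.0 if answer.strip() == reference else 0.0
    bonus = 0.1 if reasoning.strip() else 0.0
    return correctness + bonus

r = reward_fn("3 + 4 = 7, so Answer: 7", "7")
```

GRPO then compares rewards across a group of sampled completions for the same prompt, pushing the policy toward the higher-scoring ones.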
With the recent release of Gemma-3, if you are interested in playing with textual chain-of-thought, the notebook below is a wrapper over the model (native transformers inference API) for passing a predefined schema of prompts in batching mode.
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_gemma_3.ipynb
Limitation: the schema supports text only (for now), while Gemma-3 itself is text+image-to-text.
Model: google/gemma-3-1b-it
Provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_gemma3.py
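As a rough sketch of the "predefined schema of prompts in batching mode" idea: fill one template per record, then split the filled prompts into fixed-size batches for the model. The template text and function names below are illustrative, not the wrapper's actual API.

```python
# Illustrative prompt-schema batching (not the nlp-thirdgate API).
SCHEMA = "Classify the sentiment of the text.\nText: {text}\nSentiment:"

def build_batches(records, batch_size=2):
    """Fill the schema for each record, then chunk prompts into batches."""
    prompts = [SCHEMA.format(**r) for r in records]
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

batches = build_batches([{"text": "great"}, {"text": "bad"}, {"text": "okay"}])
```

Each batch would then go through the model's generate call in one pass, which is where the throughput gain over per-prompt inference comes from.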