david

quyet7779

AI & ML interests

None yet

Recent Activity

liked a Space 6 months ago
Qwen/Qwen2.5-Coder-demo

Organizations

clapAI

quyet7779's activity

reacted to DawnC's post with πŸ”₯ 15 days ago
I'm excited to introduce VisionScout β€”an interactive vision tool that makes computer vision both accessible and powerful! πŸ‘€πŸ”

What can VisionScout do right now?
πŸ–ΌοΈ Upload any image and detect 80 different object types using YOLOv8.
πŸ”„ Instantly switch between Nano, Medium, and XLarge models depending on your speed vs. accuracy needs.
🎯 Filter specific classes (people, vehicles, animals, etc.) to focus only on what matters to you.
πŸ“Š View detailed statistics about detected objects, confidence levels, and spatial distribution.
🎨 Enjoy a clean, intuitive interface with responsive design and enhanced visualizations.
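The class filtering and per-class statistics described above can be sketched in plain Python. This is an illustration only, not VisionScout's actual code: the `detections` records and the `filter_and_summarize` helper are hypothetical stand-ins for what a YOLOv8 pass over an image might return.

```python
# Hypothetical detection records as (class_name, confidence) pairs,
# standing in for the output of a YOLOv8 inference pass.
detections = [("person", 0.91), ("car", 0.78), ("dog", 0.55), ("person", 0.62)]

def filter_and_summarize(detections, wanted_classes):
    """Keep only the requested classes; report count and mean confidence per class."""
    per_class = {}
    for cls, conf in detections:
        if cls in wanted_classes:
            per_class.setdefault(cls, []).append(conf)
    return {
        cls: {"count": len(confs), "mean_conf": sum(confs) / len(confs)}
        for cls, confs in per_class.items()
    }

print(filter_and_summarize(detections, {"person", "car"}))
```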

What's next?
I'm working on exciting updates:
- Support for more models
- Video processing and object tracking across frames
- Faster real-time detection
- Improved mobile responsiveness

The goal is to build a complete but user-friendly vision toolkit for both beginners and advanced users.

Try it yourself! πŸš€
DawnC/VisionScout

I'd love to hear your feedback: what features would you find most useful? Are there specific use cases you'd love to see supported?

Give it a try and let me know your thoughts in the comments! Stay tuned for future updates.

#ComputerVision #ObjectDetection #YOLO #MachineLearning #TechForLife
reacted to andito's post with πŸ”₯ 5 months ago
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL! 🀯
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook! πŸš€
- SmolVLM can be fine-tuned on Google Colab! Or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models in video benchmarks, despite not even being trained on videos!
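Taking the quoted MacBook decode rate at face value, a quick back-of-the-envelope sketch of what it means in wall-clock time (the `generation_time_s` helper is illustrative, not part of SmolVLM):

```python
def generation_time_s(num_tokens, tokens_per_sec=17.0):
    """Wall-clock seconds to decode num_tokens at a steady rate
    (default is the ~17 tokens/sec MacBook figure quoted above)."""
    return num_tokens / tokens_per_sec

# At ~17 tokens/sec, a 340-token answer takes about 20 seconds.
print(generation_time_s(340))
```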

Check out more!
Demo: HuggingFaceTB/SmolVLM
Blog: https://huggingface.co/blog/smolvlm
Model: HuggingFaceTB/SmolVLM-Instruct
Fine-tuning script: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
reacted to merve's post with πŸ”₯ 6 months ago
OmniVision-968M: a new local VLM for edge devices, fast & small but performant
πŸ’¨ a new vision-language model with 9x fewer image tokens, super efficient
πŸ“– aligned with DPO for reducing hallucinations
⚑️ Apache 2.0 license πŸ”₯

Demo hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
Model https://huggingface.co/NexaAIDev/omnivision-968M
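Since the post credits DPO alignment for reducing hallucinations, here is a generic sketch of the DPO objective for a single preference pair. This is not OmniVision's actual training code, and all names below are illustrative; the loss rewards the policy for preferring the chosen response over the rejected one, relative to a frozen reference model:

```python
import math

def dpo_loss(pol_logp_chosen, pol_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pol_c - ref_c) - (pol_r - ref_r)))."""
    margin = beta * ((pol_logp_chosen - ref_logp_chosen)
                     - (pol_logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With no preference signal (policy matches the reference), the loss sits at log 2.
print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
```

Driving the loss below log 2 requires the policy to put relatively more probability on the chosen response than the reference does, which is the mechanism DPO uses to steer generations toward preferred (e.g. less hallucinatory) outputs.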
  • 4 replies
Β·
reacted to mrfakename's post with ❀️ 12 months ago