Computer vision - a Ryukijano Collection

Computer vision

updated Dec 4, 2024

Unsupervised Universal Image Segmentation

Paper • 2312.17243 • Published Dec 28, 2023 • 20
Denoising Vision Transformers

Paper • 2401.02957 • Published Jan 5, 2024 • 31
timm/ViT-B-16-SigLIP

Zero-Shot Image Classification • Updated Oct 25, 2023 • 18.6k • 31
Runtime error

19

Slimsam

🌖

Small yet powerful mask generation application ⚡️
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Paper • 2402.05937 • Published Feb 8, 2024 • 14
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 826 • 1.66k
meta-llama/Llama-3.2-11B-Vision-Instruct

Image-Text-to-Text • Updated Dec 4, 2024 • 520k • 1.43k
Runtime error

52

LSM

🦀

LargeSpatialModel: End-to-end Unposed Images to Semantic 3D
Running on Zero

55

Mini Dust3r

🌖

Run a web app for creating 3D models
Junyi42/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt

Image-to-3D • Updated Oct 30, 2024 • 6.05k • 18
Running on Zero

888

OminiControl

🌍

Generate images based on text prompts and condition images