---
task_categories:
- visual-question-answering
language:
- en
tags:
- remyx
- SpatialReasoning
- spatial-reasoning
- test-time-compute
- thinking
- reasoning
- multimodal
- vlm
- vision-language
- distance-estimation
- quantitative-spatial-reasoning
pretty_name: SpaceOm
license: apache-2.0
---

# SpaceOm (Coming Soon)

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/647777304ae93470ffc28913/5cPsHwrmzqPOjd7zUgzss.gif)

## Model Overview

OpenAI's plan to release a SOTA text-in, text-out LLM with toggleable reasoning means the most performant Vision-Language Model (VLM) will likely be built on that LLM backbone.

Meanwhile, updated methods of reasoning synthesis, including improvements to localization and captioning using "Describe Anything" as well as step-by-step instructions, are [in the works](https://github.com/andrewliao11/Q-Spatial-Bench-code/blob/main/prompt_templates/spatial_prompt_steps.txt).

Check out [SpaceThinker](https://huggingface.co/remyxai/SpaceThinker-Qwen2.5VL-3B) for more on the cutting edge of quantitative spatial reasoning.