RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination
Abstract
RenderFormer is a transformer-based neural rendering pipeline that renders images from triangle representations without per-scene training and with full global illumination effects.
We present RenderFormer, a neural rendering pipeline that directly renders an image from a triangle-based representation of a scene with full global illumination effects and that does not require per-scene training or fine-tuning. Instead of taking a physics-centric approach to rendering, we formulate rendering as a sequence-to-sequence transformation where a sequence of tokens representing triangles with reflectance properties is converted to a sequence of output tokens representing small patches of pixels. RenderFormer follows a two-stage pipeline: a view-independent stage that models triangle-to-triangle light transport, and a view-dependent stage that transforms a token representing a bundle of rays to the corresponding pixel values, guided by the triangle sequence from the view-independent stage. Both stages are based on the transformer architecture and are learned with minimal prior constraints. We demonstrate and evaluate RenderFormer on scenes with varying complexity in shape and light transport.
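To make the two-stage design concrete, the sketch below shows how such a pipeline could be wired up in PyTorch. This is not the authors' implementation: the token layouts, module sizes, and all names (TRI_FEAT, RAY_FEAT, ViewIndependentStage, ViewDependentStage, RenderFormerSketch) are assumptions for illustration only; see the official code linked below for the actual architecture.

```python
# Minimal sketch (not the released RenderFormer code): a two-stage transformer
# pipeline in the spirit of the abstract. All feature layouts and sizes are assumed.
import torch
import torch.nn as nn

TRI_FEAT = 3 * 3 + 3 + 3   # 3 vertex positions + normal + reflectance (assumed layout)
RAY_FEAT = 2 * 3           # ray-bundle origin + direction (assumed layout)
D_MODEL, PATCH = 256, 8    # token width and output pixel-patch size (assumed)

class ViewIndependentStage(nn.Module):
    """Self-attention over triangle tokens: models triangle-to-triangle light transport."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(TRI_FEAT, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)

    def forward(self, triangles):                 # (B, n_tris, TRI_FEAT)
        return self.encoder(self.embed(triangles))  # (B, n_tris, D_MODEL)

class ViewDependentStage(nn.Module):
    """Cross-attention from ray-bundle tokens to the triangle sequence, decoded to pixel patches."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(RAY_FEAT, D_MODEL)
        layer = nn.TransformerDecoderLayer(D_MODEL, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.to_rgb = nn.Linear(D_MODEL, PATCH * PATCH * 3)

    def forward(self, rays, scene_tokens):        # rays: (B, n_patches, RAY_FEAT)
        pix = self.decoder(self.embed(rays), scene_tokens)
        return self.to_rgb(pix)                   # (B, n_patches, PATCH*PATCH*3)

class RenderFormerSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.light_transport = ViewIndependentStage()
        self.view_decoder = ViewDependentStage()

    def forward(self, triangles, rays):
        scene_tokens = self.light_transport(triangles)  # view-independent pass
        return self.view_decoder(rays, scene_tokens)    # view-dependent pass

# Tiny smoke test with random data: 128 triangle tokens, 64 ray-bundle tokens.
model = RenderFormerSketch()
img_patches = model(torch.randn(1, 128, TRI_FEAT), torch.randn(1, 64, RAY_FEAT))
print(img_patches.shape)  # torch.Size([1, 64, 192])
```

The split mirrors the abstract: a plain encoder builds a view-independent scene representation from triangle tokens, and a decoder cross-attends from ray-bundle tokens to that representation to produce pixel patches.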
Community
Using a transformer for end-to-end rendering of triangle meshes, with full global illumination effects and no per-scene training or fine-tuning required.
Project Page: https://microsoft.github.io/renderformer/
Code&Model: https://github.com/microsoft/renderformer
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Beyond Reconstruction: A Physics Based Neural Deferred Shader for Photo-realistic Rendering (2025)
- LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields (2025)
- LSNIF: Locally-Subdivided Neural Intersection Function (2025)
- 3D Gaussian Inverse Rendering with Approximated Global Illumination (2025)
- SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering (2025)
- Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos (2025)
- ERUPT: Efficient Rendering with Unposed Patch Transformer (2025)
As someone with no experience in this I would not have expected this to produce such high quality output at such a great speed! Fantastic work!
I wonder what you've explored in the way of practical applications. Like, this would be an awesome preview option in Blender. I can also imagine it could work in games? You could fine-tune it on the game environment and get perfect real-time GI. Not even just GI — whatever expensive effect you can render in Blender, you could train the model to reproduce it at 60 FPS.
Models citing this paper 2
Datasets citing this paper 1
Spaces citing this paper 0