Bio

I am an Assistant Professor at MIT EECS, where I lead the Scene Representation Group. Previously, I completed my Ph.D. at Stanford University and was a postdoc at MIT CSAIL. My research interest lies in building AI that perceives and models the world the way humans do. Specifically, I work towards models that learn to reconstruct a rich state description of their environment from vision, including its 3D structure, materials, and semantics. These models should also be able to model the impact of their own actions on that environment, i.e., learn a "mental simulator" or "world model". I am particularly interested in models that learn these skills in a fully self-supervised way, from video alone and through self-directed interaction with the world.

Publications

Unifying 3D Representation and Control of Diverse Robots with a Single Camera
arXiv
Sizhe Lester Li, Annan Zhang, Boyuan Chen, Hanna Matusik, Chao Liu, Daniela Rus, Vincent Sitzmann
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
NeurIPS
Boyuan Chen*, Diego Marti Monso*, Yilun Du, Max Simchowitz, Russ Tedrake, Vincent Sitzmann
Neural Isometries: Taming Transformations for Equivariant ML
NeurIPS
Tommy Mitchel, Michael Taylor, Vincent Sitzmann
FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent
arXiv
Cameron Smith*, David Charatan*, Ayush Tewari, Vincent Sitzmann
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction
CVPR 2024 (Oral, Best Paper Runner-Up)
David Charatan, Sizhe Li, Andrea Tagliasacchi, Vincent Sitzmann
Variational Barycentric Coordinates
SIGGRAPH Asia 2023 (Journal Track)
Ana Dodik, Oded Stein, Vincent Sitzmann, Justin Solomon
Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision
NeurIPS 2023 (Spotlight)
Ayush Tewari*, Tianwei Yin*, George Cazenavette, Joshua B. Tenenbaum, Frédo Durand, William T. Freeman, Vincent Sitzmann
FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow
NeurIPS 2023
Cameron Smith, Yilun Du, Ayush Tewari, Vincent Sitzmann
Learning to Render Novel Views from Wide-Baseline Stereo Pairs
CVPR 2023
Yilun Du, Cameron Smith, Ayush Tewari†, Vincent Sitzmann†
Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement
ICLR 2022
Prafull Sharma, Ayush Tewari, Yilun Du, Sergey Zakharov, Rares Ambrus, Adrien Gaidon, William T. Freeman, Frédo Durand, Joshua B. Tenenbaum, Vincent Sitzmann
Decomposing NeRF for Editing via Feature Field Distillation
NeurIPS 2022
Sosuke Kobayashi, Eiichi Matsumoto, Vincent Sitzmann
Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation
ICRA 2022
Anthony Simeonov*, Yilun Du*, Andrea Tagliasacchi, Alberto Rodriguez, Pulkit Agrawal†, Vincent Sitzmann†
Learning Signal-Agnostic Manifolds of Neural Fields
NeurIPS 2021
Yilun Du, Katherine M. Collins, Joshua B. Tenenbaum, Vincent Sitzmann
Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering
NeurIPS 2021 (Spotlight)
Vincent Sitzmann*, Semon Rezchikov*, William T. Freeman, Joshua B. Tenenbaum, Frédo Durand
Implicit Neural Representations with Periodic Activation Functions
NeurIPS 2020 (Oral)
Vincent Sitzmann*, Julien N. P. Martel*, Alexander W. Bergman, David B. Lindell, Gordon Wetzstein
MetaSDF: Meta-learning Signed Distance Functions
NeurIPS 2020
Vincent Sitzmann*, Eric R. Chan*, Richard Tucker, Noah Snavely, Gordon Wetzstein
State of the Art on Neural Rendering
Computer Graphics Forum 2020 - EG 2020 (STAR Report)
Ayush Tewari*, Ohad Fried*, Justus Thies*, Vincent Sitzmann*, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer
Inferring Semantic Information with 3D Neural Scene Representations
3DV
Amit Kohli*, Vincent Sitzmann*, Gordon Wetzstein
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
NeurIPS 2019 (Oral, Honorable Mention "Outstanding New Directions")
Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein
DeepVoxels: Learning Persistent 3D Feature Embeddings
CVPR 2019 (Oral)
Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Nießner, Gordon Wetzstein, Michael Zollhöfer
Hybrid optical-electronic convolutional neural networks with optimized diffractive optics for image classification
Scientific Reports
Julie Chang, Vincent Sitzmann, Xiong Dun, Wolfgang Heidrich, Gordon Wetzstein
End-to-end Optimization of Optics and Image Processing for Achromatic Extended Depth of Field and Super-resolution Imaging
SIGGRAPH 2018
Vincent Sitzmann*, Steven Diamond*, Yifan Peng*, Xiong Dun, Stephen Boyd, Wolfgang Heidrich, Felix Heide, Gordon Wetzstein
Saliency in VR: How do people explore virtual environments?
IEEE VR 2018
Vincent Sitzmann*, Ana Serrano*, Amy Pavel, Maneesh Agrawala, Belen Masia, Diego Gutierrez, Gordon Wetzstein
Movie Editing and Cognitive Event Segmentation in Virtual Reality Video
SIGGRAPH 2017
Ana Serrano, Vincent Sitzmann, Jaime Ruiz-Borau, Gordon Wetzstein, Diego Gutierrez, Belen Masia
Towards a Machine-learning Approach for Sickness Prediction in 360° Stereoscopic Videos
IEEE VR 2018
Nitish Padmanaban*, Timon Ruban*, Vincent Sitzmann, Anthony M. Norcia, Gordon Wetzstein