

Re-Thinking Inverse Graphics with Large Language Models

We leverage the broad world knowledge encoded in LLMs to solve inverse-graphics problems. To this end, we propose the Inverse-Graphics Large Language Model (IG-LLM), which uses an LLM to autoregressively decode a visual embedding into a structured, compositional 3D-scene representation. We demonstrate the potential of LLMs to facilitate inverse graphics through next-token prediction, without the use of image-space supervision.


Publications

Article (Perceiving Systems): Kulits, P., Feng, H., Liu, W., Abrevaya, V., Black, M. J. Re-Thinking Inverse Graphics with Large Language Models. Transactions on Machine Learning Research, August 2024 (Published).