Building Physically Plausible World Models

ICML 2025 Workshop, Vancouver

Time: TBD (Whole-Day Workshop)

Room: TBD





Overview

The goal of this workshop is to exchange ideas and establish communication among researchers working on building generalizable world models that describe how the physical world evolves in response to interacting agents (e.g., humans and robots). Large-scale datasets of videos, images, and text hold the key to learning generalizable world models that are visually plausible. However, distilling useful physical information from such diverse, unstructured data is challenging, and requires careful attention to data curation, scalable algorithms, and suitable training curricula. Physics-based priors, on the other hand, can enable learning plausible scene dynamics, but they are difficult to scale to complex phenomena that lack efficient solvers or even governing dynamical equations.

General world models that can simulate complex real-world phenomena in a physically plausible fashion would unlock enormous opportunities in generative modeling and robotics, and would be of wide interest to the broader AI community. Given recent significant progress in both video generation models and physics-based simulation, we believe this workshop comes at an ideal time. It aims to bring together researchers in machine learning, robotics, physics-based simulation, and computer vision who aspire to build scalable world models by utilizing internet data, simulation, and beyond.

Topics of Interest

Our workshop will focus on topics including but not limited to the following:
  • Controllable Video Generation and Generative Simulations. How can we improve fine-grained control in video generation and integrate it with world models conditioned on low-level actions?
  • Incorporating Physics Priors. How can we leverage physics priors to endow learned world models with physical realism? (A minimal illustrative sketch follows this list.)
  • Dynamic 3D Reconstruction. How can we generalize 3D reconstruction using web data while preserving scene consistency and controlled motion of dynamic elements?
  • Applications to Robotics and Time-Series Prediction. How can we use generic datasets such as web videos and text to build shared world models that synthesize physically plausible results for applications such as robotics?
  • Special Considerations: Data Curation, Hallucination, and Broader Implications. How do dataset biases impact learned world models, and how can we mitigate them?
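
To ground the physics-priors topic above, here is a minimal, illustrative PyTorch sketch of one common recipe: fit a learned transition model to observed transitions while adding a soft penalty that discourages predictions from violating known kinematics (here, constant-gravity free fall). Everything in it (LatentWorldModel, physics_residual, the [position, velocity] state layout, and the 0.1 loss weight) is a hypothetical choice for exposition, not a reference implementation from any particular paper.

    # Illustrative sketch only; names and state layout are hypothetical.
    import torch
    import torch.nn as nn

    class LatentWorldModel(nn.Module):
        """Predicts the next state from the current state and an action."""
        def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, state_dim),
            )

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    def physics_residual(pos_t, pos_next, vel_t, dt=0.05, g=9.81):
        """Soft penalty for violating constant-gravity kinematics:
        pos_next should be close to pos_t + vel_t*dt - 0.5*g*dt^2 in height."""
        gravity = torch.zeros_like(pos_t)
        gravity[..., -1] = 0.5 * g * dt ** 2  # assumes last axis is height
        expected = pos_t + vel_t * dt - gravity
        return ((pos_next - expected) ** 2).mean()

    model = LatentWorldModel(state_dim=6, action_dim=2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Random stand-ins for a batch of real observed transitions.
    state = torch.randn(32, 6)        # [position(3), velocity(3)]
    action = torch.randn(32, 2)
    next_state = torch.randn(32, 6)

    pred = model(state, action)
    data_loss = ((pred - next_state) ** 2).mean()   # fit the observed data
    phys_loss = physics_residual(state[:, :3], pred[:, :3], state[:, 3:])
    loss = data_loss + 0.1 * phys_loss              # 0.1: tunable trade-off

    opt.zero_grad()
    loss.backward()
    opt.step()

The same pattern extends to richer priors (contact constraints, conservation laws, differentiable simulators) and to latent states decoded from video rather than hand-specified ones; the open question this topic raises is how far such penalties scale to phenomena without tractable governing equations.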

Call for Papers

We invite submissions of original research papers related to building physically plausible world models.

Submission Types:

  • Short Papers / Extended Abstracts (max 3 pages) - For preliminary results, interesting applications, or novel ideas that did not pan out in practice. The top three short papers will be invited for spotlight talks.
  • Full Papers (max 8 pages) - For original research contributions. Three award candidates will be selected for spotlight talks.

Important Notes:

  • Papers are non-archival; we welcome submissions that have been submitted to or accepted by other venues.
  • All accepted papers will be presented in a poster session.
  • The review process will be double-blind.

Schedule

  • 08:50 - 09:00 Welcome / Opening Remarks
  • 09:00 - 09:30 Invited Talk 1 (including 5 min Q&A)
  • 09:30 - 10:00 Invited Talk 2 (including 5 min Q&A)
  • 10:00 - 10:30 Contributed Talks: 10 min presentations each (one by an award candidate)
  • 10:30 - 11:30 Poster Session with Coffee: we encourage attendees to make new connections and chat!
  • 11:30 - 12:15 Panel Discussion / Debate 1
  • 12:15 - 13:30 Lunch Break
  • 13:30 - 14:00 Invited Talk 3 (including 5 min Q&A)
  • 14:00 - 14:30 Invited Talk 4 (including 5 min Q&A)
  • 14:30 - 15:00 Contributed Talks: 10 min presentations each (one by an award candidate)
  • 15:00 - 16:00 Poster Session with Coffee: we encourage attendees to make new connections and chat!
  • 16:00 - 16:30 Invited Talk 5 (including 5 min Q&A)
  • 16:30 - 17:00 Invited Talk 6 (including 5 min Q&A)
  • 17:00 - 17:45 Panel Discussion / Debate 2

Invited Speakers

Sherry Yang
New York University

Tim Brooks
Google DeepMind

Agrim Gupta
Google DeepMind

Shuran Song
Stanford University

Hao Su
University of California San Diego

Beom Joon Kim
KAIST

Organizers

Homanga Bharadhwaj
CMU

Boyuan Chen
MIT

Yilun Du
Harvard

Hiroki Furuta
UTokyo

Ruiqi Gao
Google DeepMind

Hamidreza Kasaei
University of Groningen

Sean Kirmani
Google DeepMind

Kuang-Huei Lee
Google DeepMind

Ruoshi Liu
Columbia

Zeyi Liu
Stanford

Fei-Fei Li
Stanford

Carl Vondrick
Columbia

Wenhao Yu
Google DeepMind