Awesome World Models for Autonomous Driving

A curated collection of papers on world models for autonomous driving.

If you find papers we have missed, feel free to create a pull request, open an issue, or email me. Contributions of any kind that make this list more comprehensive are welcome. 📣📣📣

If you find this repository useful, please consider citing it and giving us a star 🌟.

Feel free to share this list with others! 🥳🥳🥳

Workshop & Challenge

Papers

World model original paper

  • Using Occupancy Grids for Mobile Robot Perception and Navigation [paper]

Technical blog or video

  • Yann LeCun: A Path Towards Autonomous Machine Intelligence [paper] [Video]

  • CVPR'23 WAD Keynote - Ashok Elluswamy, Tesla [Video]

  • Wayve Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy [blog]

    World models are the basis for the ability to predict what might happen next, which is fundamentally important for autonomous driving. They can act as a learned simulator, or a mental “what if” thought experiment for model-based reinforcement learning (RL) or planning. By incorporating world models into our driving models, we can enable them to understand human decisions better and ultimately generalise to more real-world situations.
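
To make the "what if" idea in the quote above concrete, here is a minimal sketch of planning by imagined rollouts. It is not taken from GAIA-1 or any paper in this list: `toy_world_model`, `rollout_cost`, `plan`, the car-following dynamics, and every constant are hypothetical stand-ins, and a real world model would be a network learned from driving data rather than a hand-written function.

```python
from itertools import product

# Hypothetical stand-in for a learned world model. A real one would be a
# neural network trained on driving data; this toy uses fixed kinematics.
# State = (gap to the lead car in metres, ego speed in m/s).
LEAD_SPEED = 10.0  # assumed constant speed of the lead vehicle (m/s)
DT = 1.0           # length of one imagined time step (s)


def toy_world_model(state, accel):
    """Predict the next state from the current state and an acceleration."""
    gap, speed = state
    new_speed = max(0.0, speed + accel * DT)
    new_gap = gap + (LEAD_SPEED - new_speed) * DT
    return (new_gap, new_speed)


def rollout_cost(state, actions, target_gap=20.0):
    """Imagine the future under `actions` and score it (lower is better)."""
    cost = 0.0
    for accel in actions:
        state = toy_world_model(state, accel)
        gap, _ = state
        if gap <= 0.0:
            return float("inf")  # the model predicts a collision
        # Penalise deviation from the target gap and harsh control inputs.
        cost += (gap - target_gap) ** 2 + 0.1 * accel ** 2
    return cost


def plan(state, horizon=3, candidates=(-2.0, 0.0, 2.0)):
    """Try every action sequence in imagination; return the best first action."""
    best = min(product(candidates, repeat=horizon),
               key=lambda seq: rollout_cost(state, seq))
    return best[0]


if __name__ == "__main__":
    # Ego is close to the lead car and faster than it.
    print(plan((5.0, 15.0)))
    # Ego is far behind the lead car at matching speed.
    print(plan((60.0, 10.0)))
```

The planner never touches the real environment: every candidate is evaluated inside the model, and only the first action of the best imagined sequence is returned. Exhaustive enumeration is used here only because the action set is tiny; sampling-based or gradient-based planners are the usual choice at realistic scale.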

Survey

  • A survey on multimodal large language models for autonomous driving. WACVW 2024 [Paper] [Code]
  • World Models for Autonomous Driving: An Initial Survey. 2024.3, arxiv [Paper]

2024

  • [ViDAR] Visual Point Cloud Forecasting enables Scalable Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [GenAD] Generalized Predictive Model for Autonomous Driving. CVPR 2024 [Paper] [Data]
  • [Cam4DOcc] Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications. CVPR 2024 [Paper] [Code]
  • [Drive-WM] Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [DriveWorld] DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving. CVPR 2024 [Code]
  • [Panacea] Panacea: Panoramic and Controllable Video Generation for Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [MagicDrive] MagicDrive: Street View Generation with Diverse 3D Geometry Control. ICLR 2024 [Paper] [Code]
  • Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion. ICLR 2024 [Paper]
  • [SafeDreamer] SafeDreamer: Safe Reinforcement Learning with World Models. ICLR 2024 [Paper] [Code]
  • [RoboDreamer] RoboDreamer: Learning Compositional World Models for Robot Imagination. 2024.4, arxiv [Paper] [Code]
  • [LidarDM] LidarDM: Generative LiDAR Simulation in a Generated World. 2024.4, arxiv [Paper] [Code]
  • [3D-VLA] 3D-VLA: A 3D Vision-Language-Action Generative World Model. 2024.3, arxiv [Paper]
  • [DriveDreamer-2] DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation. 2024.3, arxiv [Paper] [Code]
  • [Think2Drive] Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving. 2024.2, arxiv [Paper]

2023

  • [TrafficBots] TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction. ICRA 2023 [Paper] [Code]
  • [WoVoGen] WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation. 2023.12, arxiv [Paper] [Code]
  • [CTT] Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent. 2023.11, arxiv [Paper]
  • [OccWorld] OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving. 2023.11, arxiv [Paper] [Code]
  • [MUVO] MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations. 2023.11, arxiv [Paper]
  • [DrivingDiffusion] DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model. 2023.10, arxiv [Paper] [Code]
  • [GAIA-1] GAIA-1: A Generative World Model for Autonomous Driving. 2023.9, arxiv [Paper]
  • [ADriver-I] ADriver-I: A General World Model for Autonomous Driving. 2023.9, arxiv [Paper]
  • [DriveDreamer] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving. 2023.9, arxiv [Paper] [Code]
  • [UniWorld] UniWorld: Autonomous Driving Pre-training via World Models. 2023.8, arxiv [Paper] [Code]

2022

  • [MILE] Model-Based Imitation Learning for Urban Driving. NeurIPS 2022 [Paper] [Code]
  • [Symphony] Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. ICRA 2022 [Paper]
  • Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. IROS 2022 [Paper]

Other World Model Papers

2024

  • [Genie] Genie: Generative Interactive Environments. DeepMind [Paper] [Blog]
  • [Sora] Video generation models as world simulators. OpenAI [Technical report]
  • [IWM] Learning and Leveraging World Models in Visual Representation Learning. Meta AI [Paper]
  • [V-JEPA] V-JEPA: Video Joint Embedding Predictive Architecture. Meta AI [Blog] [Paper] [Code]
  • [MAMBA] MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning. ICLR 2024 [Paper] [Code]
  • [MagicTime] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators. 2024.4, arxiv [Paper] [Code]
  • [Dreaming of Many Worlds] Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization. 2024.3, arxiv [Paper] [Code]
  • [ManiGaussian] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation. 2024.3, arxiv [Paper] [Code]
  • [LWM] World Model on Million-Length Video And Language With RingAttention. 2024.2, arxiv [Paper] [Code]
  • Planning with an Ensemble of World Models. OpenReview [Paper]
  • [WorldDreamer] WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens. 2024.1, arxiv [Paper] [Code]
