Driving with Language

Jingxin Wang (王京新)

✧ Literature Roundup: Large Models for Autonomous Driving

Google Scholar search (title): intitle:(autonomous driving OR driving OR self-driving AND LLM OR Large Language Model OR Vision-Language OR Agent OR Multi-Modal); date range: 2022 to 2025

arXiv Xplorer search (full text): LLM, large language models, vision language models, AND autonomous driving; sorted by similarity; date range: 2022 to 2024
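The title query above mixes OR and AND without grouping, so its reading depends on how the search engine applies operator precedence. The sketch below shows one way to make the intended grouping explicit: one driving-related term AND one model-related term. The helper names `or_group` and `build_title_query` are hypothetical, the grouping is an assumed reading of the original query, and whether Google Scholar honors nested parentheses inside `intitle:` is not confirmed here.

```python
from urllib.parse import quote_plus

# Term lists copied from the title query above; the split into two groups
# is an assumption about what the query was meant to express.
TOPIC_TERMS = ["autonomous driving", "driving", "self-driving"]
MODEL_TERMS = ["LLM", "Large Language Model", "Vision-Language", "Agent", "Multi-Modal"]


def or_group(terms):
    """Join terms with OR, quote multi-word phrases, and wrap in parentheses."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_title_query(topic_terms, model_terms):
    """Build an intitle: query requiring one topic term AND one model term."""
    return f"intitle:({or_group(topic_terms)} AND {or_group(model_terms)})"


query = build_title_query(TOPIC_TERMS, MODEL_TERMS)
print(query)
# URL-encoded form, suitable for pasting into a search URL parameter.
print(quote_plus(query))
```

Quoting the multi-word phrases keeps them together as exact phrases, which is the usual convention in web search engines; single words are left unquoted.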

➢ Surveys (★ = recommendation rating)

★★★★ LLM4Drive: A Survey of Large Language Models for Autonomous Driving paper code

Authors: Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan
Institution: School of AI and Department of CSE, Shanghai Jiao Tong University, OpenDriveLab
Year: 2023
Eprint: [v4] Mon, 12 Aug 2024 11:53:28 UTC (538 KB)

★★★★★ A Survey on Multimodal Large Language Models for Autonomous Driving paper code

Authors: Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Yang Zhou, Kaizhao Liang, Jintai Chen, Juanwu Lu, Zichong Yang, Kuei-Da Liao, and others
Institution: Purdue University, West Lafayette, IN, USA; Tencent T Lab, Beijing, China; University of Illinois Urbana-Champaign, Champaign, IL, USA
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

★★★★ Will Large Language Models be a Panacea to Autonomous Driving? paper

Authors: Yuxuan Zhu, Shiyi Wang, Wenqing Zhong, Nianchen Shen, Yunqi Li, Siqi Wang, Zhiheng Li, Cathy Wu, Zhengbing He, Li Li
Institution: Department of Automation, Tsinghua University, Beijing, China
Year: 2024
Eprint: arXiv

Title: XLM for Autonomous Driving Systems: A Comprehensive Review paper

Authors: Sonda Fourati, Wael Jaafar, Noura Baccar, Safwan Alfattani
Institution: Mediterranean Institute of Technology (MedTech), Tunis, Tunisia
Year: 2024
Publication Title: arXiv preprint arXiv:2409.10484

➢ Research Papers (★ = recommendation rating)

Title: Driving with LLMs: Fusing object-level vector modality for explainable autonomous driving paper code

Authors: Long Chen, Oleg Sinavski, Jan Hünermann, Alice Karnsund, Andrew James Willmott, Danny Birch, Daniel Maund, Jamie Shotton
Institution: Wayve
Year: 2024
Publication Title: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Title: DriveGPT4: Interpretable end-to-end autonomous driving via large language model paper code

Authors: Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao
Institution: The University of Hong Kong, Zhejiang University, Huawei Noah's Ark Lab
Year: 2024
Publication Title: IEEE Robotics and Automation Letters

Title: LaMPilot: An open benchmark dataset for autonomous driving with language model programs paper code

Authors: Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera
Institution: Purdue University, University of Illinois Urbana-Champaign, University of Virginia
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Title: A language agent for autonomous driving paper code

Authors: Jiageng Mao, Junjie Ye, Yuxi Qian, Marco Pavone, Yue Wang
Institution: University of Southern California, Stanford University
Year: 2023
Publication Title: arXiv preprint arXiv:2311.10813

Title: Vision language models in autonomous driving and intelligent transportation systems paper code

Authors: Xingcheng Zhou, Mingyu Liu, Bare Luka Zagar, Ekim Yurtsever, Alois C. Knoll
Institution:
Year: 2023
Publication Title: arXiv preprint arXiv:2310.14414

Title: OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning paper code

Authors: Shihao Wang, Zhiding Yu, Xiaohui Jiang, Shiyi Lan, Min Shi, Nadine Chang, Jan Kautz, Ying Li, Jose M. Alvarez
Institution: Beijing Institute of Technology, NVIDIA
Year: 2024
Publication Title: arXiv preprint arXiv:2405.01533

Title: Drive like a human: Rethinking autonomous driving with large language models paper code

Authors: Daocheng Fu, Xin Li, Licheng Wen, Min Dou, Pinlong Cai, Botian Shi, Yu Qiao
Institution: Shanghai AI Lab
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

Title: DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences paper code

Authors: Yidong Huang, Jacob Sansom, Ziqiao Ma, Felix Gervits, Joyce Chai
Institution: University of Michigan, Army Research Lab
Year: 2024
Publication Title: arXiv preprint arXiv:2406.03008

Title: DiLu: A knowledge-driven approach to autonomous driving with large language models paper code

Authors: Licheng Wen, Daocheng Fu, Xin Li, Xinyu Cai, Tao Ma, Pinlong Cai, Min Dou, Botian Shi, Liang He, Yu Qiao
Institution: Shanghai Artificial Intelligence Laboratory
Year: 2023
Publication Title: arXiv preprint arXiv:2309.16292

Title: Editable scene simulation for autonomous driving via collaborative LLM-agents paper code

Authors: Yuxi Wei, Zi Wang, Yifan Lu, Chenxin Xu, Changxing Liu, Hao Zhao, Siheng Chen, Yanfeng Wang
Institution: Shanghai Jiao Tong University, Shanghai AI Laboratory
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Title: DriveVLM: The convergence of autonomous driving and large vision-language models paper code

Authors: Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Chenxu Hu, Yang Wang, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao
Institution: IIIS, Tsinghua University; Li Auto
Year: 2024
Publication Title: arXiv preprint arXiv:2402.12289

Title: VLAAD: Vision and language assistant for autonomous driving paper code

Authors: SungYeon Park, MinJae Lee, JiHyuk Kang, Hahyeon Choi, Yoonah Park, Juhwan Cho, Adam Lee, DongKyu Kim
Institution: Seoul National University; University of California, Berkeley
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

Title: Asynchronous Large Language Model Enhanced Planner for Autonomous Driving paper code

Authors: Yuan Chen, Zi-han Ding, Ziqin Wang, Yan Wang, Lijun Zhang, Si Liu
Institution: Beihang University; AIR, Tsinghua University
Year: 2024
Publication Title: arXiv preprint arXiv:2406.14556

Title: Empowering Autonomous Driving with Large Language Models: A Safety Perspective paper code

Authors: Yixuan Wang, Ruochen Jiao, Chengtian Lang, Sinong Simon Zhan, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu
Institution: Northwestern University
Year: 2023
Publication Title: arXiv

Title: A Survey of Large Language Models for Autonomous Driving paper code

Authors: Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan
Institution: Shanghai Jiao Tong University
Year: 2023
Publication Title: arXiv

Title: SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving [paper](https://arxiv.org/pdf/2407.21293) code

Authors: Peiru Zheng, Yun Zhao, Zhan Gong, Hong Zhu, Shaohua Wu
Institution: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Year: 2024
Publication Title: arXiv

LLM and Existing AD Challenges paper

• Solution Insight A: LLMs have demonstrated significant capability in solving the corresponding challenge, and a comprehensive solution based on LLMs can be expected.
• Solution Insight B: LLMs have demonstrated capability in solving the corresponding challenge, but the challenge may not be fully solved given current drawbacks of LLMs.
• Solution Insight C: LLMs can improve performance in related tasks, but might not be able to solve the key problems within the challenges.
The performance of LLMs in AD tasks stems mainly from the following aspects:

  1. Common Sense.
  2. Reasoning Capability.
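  3. Communication ability (the ability to interact and communicate with humans, which to some extent addresses the black-box nature of neural networks and the resulting trust problem).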

Reason2Drive
NuPrompt
DriveGPT4
LingoQA
WEDGE
DriveLM-nuScenes
DriveLM-Carla
NuScenes-QA
Rank2Tell
LaMPilot
MAPLM
LMDrive
nuScenes
OpenLane-V2
OmniDrive