Driving with Language

Jingxin Wang (王京新)

✧ A Curated List of Literature on Large Models for Autonomous Driving

Google Scholar search (title): intitle:(autonomous driving OR driving OR self-driving AND LLM OR Large Language Model OR Vision-Language OR Agent OR Multi-Modal). Date range: 2022 to 2025.

arXiv Xplorer search (full text): LLM, large language models, vision language models, AND autonomous driving. Results sorted by similarity. Date range: 2022 to 2024.
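The Google Scholar title query mixes AND and OR without grouping, so its intent is ambiguous. One plausible reading is "(a driving term) AND (a language-model term)". The sketch below applies that assumed reading as a local filter over titles you have already collected. The names `DRIVING_TERMS`, `MODEL_TERMS`, and `title_matches` are hypothetical, and this is not how Google Scholar itself evaluates the query.

```python
import re
from typing import Sequence

# Term groups copied from the title query. Grouping them as
# (driving term) AND (model term) is an assumed reading of the query.
DRIVING_TERMS = ["autonomous driving", "driving", "self-driving"]
MODEL_TERMS = [
    "LLM",
    "Large Language Model",
    "Vision-Language",
    "Agent",
    "Multi-Modal",
]


def _contains_any(title: str, terms: Sequence[str]) -> bool:
    """Return True if any term starts at a word boundary in the title."""
    for term in terms:
        # Leading \b avoids matches inside another word; there is no
        # trailing \b, so plurals such as "LLMs" still match.
        if re.search(r"\b" + re.escape(term), title, flags=re.IGNORECASE):
            return True
    return False


def title_matches(
    title: str, year: int, first_year: int = 2022, last_year: int = 2025
) -> bool:
    """Keep a paper if it is in the year window and hits both term groups."""
    in_range = first_year <= year <= last_year
    return (
        in_range
        and _contains_any(title, DRIVING_TERMS)
        and _contains_any(title, MODEL_TERMS)
    )
```

Under this reading, a title must mention both driving and a language-model term, so a generic driving paper with no model term is filtered out.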

➢ Surveys (★ = recommendation rating)

★★★★ LLM4Drive: A Survey of Large Language Models for Autonomous Driving [paper] [code]
Authors: Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan
Institution: School of AI and Department of CSE, Shanghai Jiao Tong University, OpenDriveLab
Year: 2023
Eprint: [v4] Mon, 12 Aug 2024 11:53:28 UTC

★★★★★ A Survey on Multimodal Large Language Models for Autonomous Driving [paper] [code]

Authors: Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Yang Zhou, Kaizhao Liang, Jintai Chen, Juanwu Lu, Zichong Yang, Kuei-Da Liao, et al.
Institution: Purdue University, West Lafayette, IN, USA; Tencent T Lab, Beijing, China; University of Illinois Urbana-Champaign, Champaign, IL, USA
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

★★★★ Will Large Language Models be a Panacea to Autonomous Driving? [paper]

Authors: Yuxuan Zhu, Shiyi Wang, Wenqing Zhong, Nianchen Shen, Yunqi Li, Siqi Wang, Zhiheng Li, Cathy Wu, Zhengbing He, Li Li
Institution: Department of Automation, Tsinghua University, Beijing, China
Year: 2024
Eprint: arXiv

Title: XLM for Autonomous Driving Systems: A Comprehensive Review [paper]

Authors: Sonda Fourati, Wael Jaafar, Noura Baccar, Safwan Alfattani
Institution: Mediterranean Institute of Technology (MedTech), Tunis, Tunisia
Year: 2024
Publication Title: arXiv preprint arXiv:2409.10484

➢ Research Papers (★ = recommendation rating)

Title: Driving with LLMs: Fusing object-level vector modality for explainable autonomous driving [paper] [code]

Authors: Long Chen, Oleg Sinavski, Jan Hünermann, Alice Karnsund, Andrew James Willmott, Danny Birch, Daniel Maund, Jamie Shotton
Institution: Wayve
Year: 2024
Publication Title: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Title: DriveGPT4: Interpretable end-to-end autonomous driving via large language model [paper] [code]

Authors: Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao
Institution: The University of Hong Kong, Zhejiang University, Huawei Noah’s Ark Lab
Year: 2024
Publication Title: IEEE Robotics and Automation Letters

Title: LaMPilot: An open benchmark dataset for autonomous driving with language model programs [paper] [code]

Authors: Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera
Institution: Purdue University, University of Illinois Urbana-Champaign, University of Virginia
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Title: A language agent for autonomous driving [paper] [code]

Authors: Jiageng Mao, Junjie Ye, Yuxi Qian, Marco Pavone, Yue Wang
Institution: University of Southern California, Stanford University
Year: 2023
Publication Title: arXiv preprint arXiv:2311.10813

Title: Vision language models in autonomous driving and intelligent transportation systems [paper] [code]

Authors: Xingcheng Zhou, Mingyu Liu, Bare Luka Zagar, Ekim Yurtsever, Alois C. Knoll
Institution:
Year: 2023
Publication Title: arXiv preprint arXiv:2310.14414

Title: OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning [paper] [code]

Authors: Shihao Wang, Zhiding Yu, Xiaohui Jiang, Shiyi Lan, Min Shi, Nadine Chang, Jan Kautz, Ying Li, Jose M. Alvarez
Institution: Beijing Institute of Technology, NVIDIA
Year: 2024
Publication Title: arXiv preprint arXiv:2405.01533

Title: Drive like a human: Rethinking autonomous driving with large language models [paper] [code]

Authors: Daocheng Fu, Xin Li, Licheng Wen, Min Dou, Pinlong Cai, Botian Shi, Yu Qiao
Institution: Shanghai AI Lab
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

Title: DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences [paper] [code]

Authors: Yidong Huang, Jacob Sansom, Ziqiao Ma, Felix Gervits, Joyce Chai
Institution: University of Michigan, Army Research Lab
Year: 2024
Publication Title: arXiv preprint arXiv:2406.03008

Title: DiLu: A knowledge-driven approach to autonomous driving with large language models [paper] [code]

Authors: Licheng Wen, Daocheng Fu, Xin Li, Xinyu Cai, Tao Ma, Pinlong Cai, Min Dou, Botian Shi, Liang He, Yu Qiao
Institution: Shanghai Artificial Intelligence Laboratory
Year: 2023
Publication Title: arXiv preprint arXiv:2309.16292

Title: Editable scene simulation for autonomous driving via collaborative LLM-agents [paper] [code]

Authors: Yuxi Wei, Zi Wang, Yifan Lu, Chenxin Xu, Changxing Liu, Hao Zhao, Siheng Chen, Yanfeng Wang
Institution: Shanghai Jiao Tong University, Shanghai AI Laboratory
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Title: DriveVLM: The convergence of autonomous driving and large vision-language models [paper] [code]

Authors: Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Chenxu Hu, Yang Wang, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao
Institution: IIIS, Tsinghua University; Li Auto
Year: 2024
Publication Title: arXiv preprint arXiv:2402.12289

Title: VLAAD: Vision and language assistant for autonomous driving [paper] [code]

Authors: SungYeon Park, MinJae Lee, JiHyuk Kang, Hahyeon Choi, Yoonah Park, Juhwan Cho, Adam Lee, DongKyu Kim
Institution: Seoul National University, University of California, Berkeley
Year: 2024
Publication Title: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision

Title: Asynchronous Large Language Model Enhanced Planner for Autonomous Driving [paper] [code]

Authors: Yuan Chen, Zi-han Ding, Ziqin Wang, Yan Wang, Lijun Zhang, Si Liu
Institution: Beihang University; AIR, Tsinghua University
Year: 2024
Publication Title: arXiv preprint arXiv:2406.14556

Title: Empowering Autonomous Driving with Large Language Models: A Safety Perspective [paper] [code]

Authors: Yixuan Wang, Ruochen Jiao, Chengtian Lang, Sinong Simon Zhan, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu
Institution: Northwestern University
Year: 2023
Publication Title: arXiv

Title: A Survey of Large Language Models for Autonomous Driving [paper] [code]

Authors: Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan
Institution: Shanghai Jiao Tong University
Year: 2023
Publication Title: arXiv

Title: SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving [paper](https://arxiv.org/pdf/2407.21293) [code]

Authors: Peiru Zheng, Yun Zhao, Zhan Gong, Hong Zhu, Shaohua Wu
Institution: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Year: 2024
Publication Title: arXiv

LLM and Existing AD Challenges [paper]

• Solution Insight A: LLMs have demonstrated significant capability in solving the corresponding challenge, and a comprehensive solution based on LLMs can be expected.
• Solution Insight B: LLMs have demonstrated capability in solving the corresponding challenge, but the challenge may not be fully solved given the current drawbacks of LLMs.
• Solution Insight C: LLMs can improve performance in related tasks, but might not be able to solve the key problems within the challenges.

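The performance of LLMs on AD tasks stems mainly from the following aspects:

Common sense.
Reasoning capability.
Communication ability. (That is, the ability to interact and communicate with humans, which to some extent addresses the black-box nature of neural networks and the resulting trust problem.)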

➢ LLM Related AD Datasets.

Reason2Drive
NuPrompt
DriveGPT4
LingoQA
WEDGE
DriveLM-nuScenes
DriveLM-Carla
NuScenes-QA
Rank2Tell
LaMPilot
MAPLM
LMDrive
nuScenes
OpenLane-V2
OmniDrive