IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Vision & Language for
Autonomous AI
Advancing the theoretical foundations and practical applications of computer vision and machine learning for embodied agents.
About VL4AI Research
Towards Intelligent Embodied AI
VL4AI advances intelligent embodied AI for real-world robotic and autonomous systems. We study how multi-sensor and multimodal models can perceive, understand, reason, and act in complex dynamic environments, with a strong emphasis on interpretable, trustworthy, auditable, and safe embodied intelligence.
- Multimodal and multi-sensor perception, including 2D vision, 3D vision, sensor fusion, and embodied scene understanding.
- 3D reconstruction, semantic mapping, and world modelling for dynamic embodied environments.
- Compositional and neuro-symbolic reasoning for visual grounding, multimodal understanding, and decision making.
- Agentic embodied systems for planning, navigation, search, tracking, and mission-level autonomy.
Our goal is to build embodied AI systems that do more than perceive the world: they can reason over it, act within it, and support reliable real-world deployment through interpretable, trustworthy, and safe intelligence.
Research highlight
2D Visual Perception
Impact
Seminal Works
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
GMFlow: Learning Optical Flow via Global Matching
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
JRDB: A Dataset and Benchmark of Egocentric Robot Visual Perception of Humans in Built Environments
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints
Advances in Neural Information Processing Systems (NeurIPS), 2019
Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks
Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017
Online multi-target tracking using recurrent neural networks
IEEE international conference on computer vision (ICCV), 2015
Joint probabilistic data association revisited
Updates
Latest News
Our work MATA published in ICLR
Congrats to Zhixi. Read paper here
Our work Aeroseg++ published in TOMM
Congrats to Saikat. Read paper here
Our work MGIoU published in AAAI
Congrats to Tho. Read paper here
Our work on JRDB published in AAAI
Congrats to Simin. Read paper here
Our work NEUSIS published in RA-L
Congrats to Zhixi. Read paper here
Our work DWIM published in ICCV
Congrats to Fucai. Read paper here