VL4AI Research

Vision & Language for Autonomous AI

Advancing the theoretical foundations and practical applications of computer vision and machine learning for embodied agents.

About VL4AI Research

Towards Intelligent Embodied AI

VL4AI advances intelligent embodied AI for real-world robotic and autonomous systems. We study how multi-sensor and multimodal models can perceive, understand, reason, and act in complex dynamic environments, with a strong emphasis on interpretable, trustworthy, auditable, and safe embodied intelligence.

  • Multimodal and multi-sensor perception, including 2D vision, 3D vision, sensor fusion, and embodied scene understanding.
  • 3D reconstruction, semantic mapping, and world modelling for dynamic embodied environments.
  • Compositional and neuro-symbolic reasoning for visual grounding, multimodal understanding, and decision making.
  • Agentic embodied systems for planning, navigation, search, tracking, and mission-level autonomy.

Our goal is to build embodied AI systems that do more than perceive the world: they can reason over it, act within it, and support reliable real-world deployment through interpretable, trustworthy, and safe intelligence.

2D Visual Perception

Research highlight

Chief Scientist
A/Prof. Hamid Rezatofighi

Associate Professor, Department of Data Science & AI

Hamid Rezatofighi leads VL4AI at Monash University with a research profile spanning computer vision, robotics, machine learning, and neuro-symbolic AI. His work combines strong theoretical foundations with high-stakes deployment in defense, surveillance, healthcare, and embodied intelligence.

Research interests: computer vision, robot vision, and deep learning

Research leadership

Chief Scientist Profile

19,000

Scholar citations

42

h-index

10

PhD completions

Positioning

Neuro-symbolic AI research lead

Leads a research agenda that connects computer vision, embodied reasoning, and robust autonomy, with particular strength in high-stakes environments where interpretability and reliability matter.

Editorial & program service

  • Area Chair at CVPR (2020–2026), NeurIPS (2023–2025), IJCAI (2023–2025), ICCV 2025, ECCV (2024–2026), and WACV 2021.
  • Senior Associate Editor at IEEE Transactions on Image Processing since 2024.
  • Associate Editor at Artificial Intelligence Journal, IET Computer Vision, and IROS; Guest Editor at Journal of Field Robotics and IEEE TCSVT.
  • Publication Chair at ACCV 2018 and organizer of MOTChallenge since 2018.

Awards, fellowships & recognition

  • Excellence in Research by Early Career Researcher award, Monash FIT, 2023.
  • Endeavour Research Fellowship supporting postdoctoral research at Stanford University and ETH Zurich, 2018–2019.
  • Invited speaker across academia and industry, including Vanderbilt, Salesforce, NVIDIA, Carnegie Mellon, University of Washington, and ETH Zurich.
  • ARC grant assessor and PhD thesis examiner across leading universities including CMU, Adelaide, UWA, QUT, and RMIT.

Career trajectory

Selected appointments & fellowships

A concise view of prior roles that shaped the lab’s current leadership.

2025–now

Associate Professor

Department of Data Science & AI, Monash University

2024–2025

Senior Lecturer

Department of Data Science & AI, Monash University

2020–2024

Lecturer

Department of Data Science & AI, Monash University

2018–2020

Endeavour Research Fellow

Stanford Vision Learning Lab, Stanford University

2014–2020

Senior Research Fellow

Australian Institute for Machine Learning, University of Adelaide

Impact

Seminal Works

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Unifying Flow, Stereo and Depth Estimation

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

GMFlow: Learning Optical Flow via Global Matching

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

JRDB: A Dataset and Benchmark of Egocentric Robot Visual Perception of Humans in Built Environments

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints

Advances in Neural Information Processing Systems (NeurIPS), 2019

Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks

Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017

Online Multi-Target Tracking Using Recurrent Neural Networks

IEEE International Conference on Computer Vision (ICCV), 2015

Joint Probabilistic Data Association Revisited
