Ph.D. Project in Computer Science and Artificial Intelligence | IIT Delhi - Abu Dhabi

Reinforcement Learning-Based Vision-Language-Action Models for Robotic Manipulation

Computer Science and Artificial Intelligence

Supervisors

Dr. Alap Kshirsagar
Dr. Kaushal Kumar Maurya
Prof. Rohan Paul

Project Description

Robotic manipulation in unstructured environments requires both high-level semantic understanding and adaptive control under contact-rich, dynamic conditions. While vision-language models provide strong priors for task understanding, their direct deployment in robotic systems is limited by poor grounding in physical interaction and a lack of online adaptability. In particular, Vision-Language-Action (VLA) policies often struggle with distribution shift, sparse feedback, and sequential task learning in real-world settings.

This Ph.D. project aims to develop reinforcement learning-based methods for the continual adaptation of VLA models through online interaction. The central focus is on leveraging language not only as a task specification but also as a structured signal for guiding policy improvement and reducing reliance on manually designed rewards. The project will address key challenges of sample efficiency and catastrophic forgetting by developing data-efficient learning strategies and mechanisms to retain previously acquired skills during manipulation tasks.

The research will be validated on robotic platforms in contact-rich scenarios such as grasping, insertion, tool use, and object reconfiguration, with an emphasis on sustained adaptation and generalization. The expected outcome is a unified framework for reinforcement learning in VLA systems that enables efficient online learning while preserving prior knowledge, advancing the deployment of robust robotic manipulators in real-world environments.

Background Required

Bachelor's and/or Master's degree in Robotics, Computer Science, Artificial Intelligence, or related fields. A strong interest in machine learning and reinforcement learning is desirable. Experience with deep learning frameworks or robotic systems is beneficial.