Ph.D. Project in Computer Science and Artificial Intelligence | IIT Delhi - Abu Dhabi

Multimodal Perception and Control for Robust Contact-Rich Robotic Manipulation

Computer Science and Artificial Intelligence

Supervisors

Dr. Alap Kshirsagar
Prof. Rohan Paul (IIT Delhi)

Project Description

This Ph.D. project aims to develop integrated perception and control methods that enable robots to operate reliably in dynamic, contact-rich environments under uncertainty. While most robotic systems rely primarily on vision, tasks such as grasping in clutter, pouring, insertion, and handling deformable objects require continuous feedback from tactile and force sensing. This research will focus on combining multiple perception modalities, such as vision, tactile sensing, and force feedback, both to estimate interaction states and to enable responsive, contact-aware control.

The project will develop multimodal learning approaches that fuse RGB-D perception with tactile and force signals into unified representations, and will investigate how these representations can be used within feedback control policies for manipulation. The research will explore data-driven and reinforcement learning methods that enable robots to adapt online to variations in object properties, contact conditions, and partial observability. A strong emphasis will be placed on real-world validation on a robotic platform. The project will progressively scale from single-arm to bimanual manipulation and to human-robot interaction scenarios involving physical contact. The expected outcomes include novel multimodal perception and control algorithms and experimentally validated robotic systems capable of robust, adaptive manipulation in uncertain environments, with applications in household, industrial, and service robotics.

Background Required

Applicants should hold a Bachelor's or Master's degree in Robotics, Mechanical/Electrical Engineering, Computer Science, or a related field. A strong interest in robot learning, control, or perception is desirable. Experience with programming, machine learning, or robotics frameworks (e.g., ROS) is beneficial.