My research explores the interplay between control, data availability, and learning in order to design provably safe, successful, and efficient planning strategies for systems operating in complex or unknown environments. Conversely, I am interested in identifying control policies and agent intent from data, and communicating the findings to a human supervisor in an explainable manner. Some of my recent areas of interest are described below. If you want to learn more, please look at my recent publications or contact me.
My work is currently supported by the DRILLAWAY: aDaptive, ResIllient Learning-enabLed oceAn World AutonomY and the Safety-Constrained and Efficient Learning for Resilient Autonomous Space Systems projects funded by NASA, the Net Zero Transportation Infrastructure project funded by the Discovery Partners Institute, as well as the Enhancing Opportunities for Research and Training in Space Engineering project funded by the US Department of Education. We also recently received notification that the Robust and Resilient Autonomy for Advanced Air Mobility project has been selected for NASA’s University Leadership Initiative funding.
We concluded the System for Avoidance and Flight-path Execution based on Risk Reduction (SAFERR) project funded by the United States Air Force’s AFWERX program, the Learning of Time-Varying Dynamics project funded by Sandia National Laboratories, the Seedling: Synthesis of Control Protocols for Integrated Mission Planning, Resource Management and Information Acquisition project funded by the Defense Advanced Research Projects Agency, and the intensive summer research project LEONA: Logic-Based Context-Aware Activity Interpreter for Geospatial Intelligence funded by the University of Illinois’ New Frontiers Initiative, aligning with the mission and needs of the National Geospatial-Intelligence Agency.
Some Areas of Interest
Certifiable Real-Time Planning and Control for Systems with Unknown Dynamics
Faced with a mid-mission catastrophe or a significantly different environment from the one originally expected, it is often impossible for a system to complete its original task; hence, classical methods of adaptation and robustness — focused on adapting the system’s control capabilities to meet its original objective — will fail. My research develops methods that recognize whether the original task can be completed, choose an alternative task if it cannot, and ensure that the system completes the chosen task.
My group’s recent work introduces a guaranteed reachable set of a partially unknown system as the set of all states reachable for every system consistent with partial knowledge about new system dynamics. Underapproximating the guaranteed reachable set by exploiting the knowledge of the system’s reachable set prior to the adverse event and/or partial knowledge of the new system dynamics allows us to quickly determine tasks that the system can certifiably reach, even before we know how to reach them. In order to determine how to reach them, the system necessarily needs to learn at least a model of the local system dynamics. Our previous work on myopic control aims to continually relearn local system dynamics by applying short-time “test inputs”. Using this method, we have shown that even a significantly damaged aircraft can remain in the air. The following video, made by Steven Carr from UT Austin, shows a high-fidelity simulation of a Boeing 747-200 which lost 33% of its right wing, controlled using myopic control.
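As a rough formalization of the definition above, the guaranteed reachable set can be written as an intersection of ordinary reachable sets. The notation here is assumed for illustration and is not taken from our papers: \mathcal{F} denotes the set of all dynamics consistent with the available partial knowledge, \mathcal{R}_f(T, x_0) the set of states reachable from the initial state x_0 over a time horizon T under dynamics f, and \underline{\mathcal{R}} any underapproximation.

```latex
\mathcal{R}^{G}(T, x_0) \;=\; \bigcap_{f \in \mathcal{F}} \mathcal{R}_{f}(T, x_0),
\qquad
\underline{\mathcal{R}}(T, x_0) \;\subseteq\; \mathcal{R}^{G}(T, x_0).
```

Under this reading, any target state inside the underapproximation is reachable no matter which consistent dynamics turn out to be the true ones, which is why a task can be certified before a control law for reaching it is known.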
Role of Side Information in Learning
Controlling a system in an unknown environment inevitably requires learning about the environment and the system’s interaction with it. Learning and operating in an entirely unknown environment is time-consuming and risky. However, systems rarely operate in environments that are entirely unknown. They often have access to some information collected by previous missions in the same or a similar environment, information collected by agents on parallel and complementary missions, or information about physical laws of the environment. My group’s work exploits this information to learn more quickly and subsequently plan better.
Our work in this area spans several domains: one recent line of work seeks to optimally plan fast and reliable routes for vehicles or public transit passengers, given a priori joint probability distributions on travel times between stops and online data from all transit vehicles. Another focuses on optimally planning a control strategy for an extraterrestrial rover based on information collected from an orbiter prior to the mission and real-time sensor data. The video below, made by LEADCAT’s Pranay Thangeda, illustrates the difference in online learning of system dynamics between an agent that learns solely from the outcomes of its actions and an agent that also collects and uses side information about similarity between dynamics in different areas.
Behavior Inference, Supervision, and Deception
To ensure success in a hostile environment, an agent being observed may want to hide its true intentions from the observer for as long as possible. Conversely, an agent observing an adversary needs to infer its important behavioral features and uncover possible deception. My current work deals with both sides of this problem. On one hand, I have been exploring methods that agents can use to seem as unpredictable as possible while ultimately satisfying their objective or, alternatively, seem to follow a particular “decoy” policy as closely as possible while also proceeding to their objective. The resulting algorithms often match human intuition about deceptive behavior, and have been shown to successfully fool autonomous adversaries about the agent’s intention.
Along with work on (counter)deceptive policies, we introduced the notion of counterdeceptive environment design, i.e., placement of environmental features in a way that makes it easy to uncover the adversary’s true intentions. Investigating optimal environment design spawns interesting optimization questions: motivated by a geometry-inspired simplification, one of our recent papers interprets the challenge as a classical p-dispersion optimization problem. Our novel numerical solution, motivated by physics, is the first to systematically deal with nonconvex environments in a computationally feasible manner, exceeding the best known results from previous work.
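For readers unfamiliar with the p-dispersion problem, it asks for p locations, chosen from a candidate set, whose smallest pairwise distance is as large as possible. The sketch below is a textbook greedy farthest-point heuristic, not our physics-motivated method. It assumes a finite list of candidate points and plain Euclidean distance, so it ignores obstacles and nonconvexity entirely. The function names and the sample coordinates are hypothetical.

```python
import itertools
import math


def min_pairwise_distance(points):
    """Return the smallest Euclidean distance between any two selected points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))


def greedy_p_dispersion(candidates, p):
    """Greedy heuristic for p-dispersion (illustrative only, not optimal in general).

    Start from the farthest pair of candidates, then repeatedly add the
    candidate whose distance to its nearest already-chosen point is largest.
    """
    if p < 2 or p > len(candidates):
        raise ValueError("p must satisfy 2 <= p <= number of candidates")

    # Seed the selection with the two candidates that are farthest apart.
    first, second = max(
        itertools.combinations(candidates, 2),
        key=lambda pair: math.dist(pair[0], pair[1]),
    )
    chosen = [first, second]

    while len(chosen) < p:
        remaining = [c for c in candidates if c not in chosen]
        # Pick the candidate that keeps the selection most spread out.
        best = max(remaining, key=lambda c: min(math.dist(c, s) for s in chosen))
        chosen.append(best)
    return chosen


# Hypothetical candidate locations in a 4-by-3 rectangle.
candidates = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0), (2.0, 1.5), (1.0, 0.0)]
selection = greedy_p_dispersion(candidates, 3)
print(selection, min_pairwise_distance(selection))
```

In a nonconvex environment the Euclidean distance used here would have to be replaced by a distance that respects the environment's geometry, which is where the problem becomes considerably harder.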