The vision of our research is to enable robots to function in dynamic human environments by allowing them to flexibly adapt their skill set via learning interactions with end-users. We call this Socially Guided Machine Learning (SG-ML), exploring the ways in which Machine Learning agents can exploit principles of human social learning. To date, our work in SG-ML has focused on two research thrusts: (1) Interactive Machine Learning, and (2) Natural Interaction Patterns for HRI. Here you will find recent examples of projects in each of these two thrusts.
Interactive Machine Learning
Simulation-Inspired Active Learning
A. Allevato, E.S. Short, A.L. Thomaz
Robots in real-world environments may need to adapt context-specific behaviors learned in one environment to new environments with new constraints. In many cases, co-present humans can provide the robot with information, but it may not be safe for them to provide hands-on demonstrations and there may not be a dedicated supervisor to provide constant feedback. In this work we present the SAIL (Simulation-Informed Active In-the-Wild Learning) algorithm for learning new approaches to manipulation skills starting from a single demonstration. In this three-step algorithm, the robot simulates task execution to choose new potential approaches; collects unsupervised data on task execution in the target environment; and finally, chooses informative actions to show to co-present humans and obtain labels. Our approach enables a robot to learn new ways of executing two different tasks by using success/failure labels obtained from naïve users in a public space, performing 496 manipulation actions and collecting 163 labels from users in the wild over six 45-minute to 1-hour deployments. We show that classifiers based on low-level sensor data can be used to accurately distinguish between successful and unsuccessful motions in a multi-step task, even when trained in the wild. We also show that using the sensor data to choose which actions to sample is more effective than choosing the least-sampled action.
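The final step, choosing which action to show to people for labeling, can be illustrated with a small sketch. The code below is not the SAIL implementation. It only contrasts a least-sampled baseline with a simple uncertainty-based rule, assuming a success/failure classifier has already produced a predicted success probability for each candidate action. The action names, counts, and probabilities are hypothetical.

```python
from typing import Dict


def least_sampled(counts: Dict[str, int]) -> str:
    """Baseline: pick the action that has been executed the fewest times."""
    return min(counts, key=lambda action: counts[action])


def most_uncertain(success_prob: Dict[str, float]) -> str:
    """Pick the action whose predicted success probability is closest to 0.5,
    i.e. the one the (assumed) classifier is least sure about."""
    return min(success_prob, key=lambda action: abs(success_prob[action] - 0.5))


# Hypothetical values, for illustration only.
counts = {"approach_left": 3, "approach_top": 1, "approach_right": 5}
success_prob = {"approach_left": 0.55, "approach_top": 0.95, "approach_right": 0.10}

print(least_sampled(counts))         # approach_top
print(most_uncertain(success_prob))  # approach_left
```

In this toy case the baseline would spend a label on an action the classifier already predicts confidently, while the uncertainty rule asks about the action whose outcome is least clear.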
E.S. Short, A. Allevato and A.L. Thomaz, "SAIL: Simulation-Informed Active In-the-Wild Learning." HRI 2019.
A. Allevato, E.S. Short and A.L. Thomaz, "SAIL: Simulation-Informed Active In-the-Wild Learning." CoRL 2019.
Human-guided Task Transfer
T. Fitzgerald, E.S. Short, A. Goel, A.L. Thomaz
As robots become more commonplace, they will be situated in a wide variety of environments and tasks. Since a robot cannot be programmed to complete every task, it is necessary for robots to adapt their task models to various environment and task constraints.
When transferring a learned task to an environment containing new objects, a core problem is identifying the mapping between objects in the old and new environments. This object mapping is dependent on the task being performed and the roles objects play in that task. We introduce an approach that is not constrained by either assumption, but rather, uses structured interaction with a human teacher to infer an object mapping for task transfer. Our results indicate that human-guided object mapping provides a balance between mapping performance and autonomy.
An object replacement may also introduce new constraints to the task. We introduce "transfer by correction": a method for transferring a robot's tool-based task models to use unfamiliar tools. By having the robot receive corrections from a human teacher when repeating a known task with a new tool, it can learn the relationship between the two tools, allowing it to transfer additional tasks learned with the original tool to the new tool. We demonstrate how the tool transform models learned from one episode of task corrections can be used to perform that task with ≥85% of maximum performance in 83% of tool/task combinations. Furthermore, these transformations generalize to unseen tool/task combinations in 27.8% of our transfer evaluations, and up to 41% of transfer problems when the source and replacement tool share tooltip similarities.
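The idea of learning a tool relationship from corrections and reusing it on another task can be sketched in a heavily simplified form. The example below assumes the relationship between the two tools is a constant 3D translation of the end-effector waypoints. That is an illustrative assumption, not the transform model used in the papers, and all function names and waypoints are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def estimate_offset(original: List[Point], corrected: List[Point]) -> Point:
    """Average displacement from the original waypoints to the corrected ones."""
    if not original or len(original) != len(corrected):
        raise ValueError("need two non-empty waypoint lists of equal length")
    n = len(original)
    return tuple(
        sum(c[i] - o[i] for o, c in zip(original, corrected)) / n
        for i in range(3)
    )


def apply_offset(waypoints: List[Point], offset: Point) -> List[Point]:
    """Shift every waypoint of another task by the learned offset."""
    return [tuple(w[i] + offset[i] for i in range(3)) for w in waypoints]


# Task A with the original tool, and the same task after human corrections
# made while the robot held the new tool (hypothetical numbers).
task_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
task_a_corrected = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]

offset = estimate_offset(task_a, task_a_corrected)

# Reuse the learned offset on a different task learned with the original tool.
task_b = [(2.0, 2.0, 0.0)]
print(apply_offset(task_b, offset))  # [(2.0, 2.0, 0.5)]
```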
T. Fitzgerald, E.S. Short, A. Goel, A.L. Thomaz. Human-guided Trajectory Adaptation for Tool Transfer. AAMAS 2019.
T. Fitzgerald, A. Goel, A.L. Thomaz. Human-guided Object Mapping for Task Transfer. THRI, 2019.
Learning from Partially Attentive or Inaccurate Humans
T. Kessler Faulkner, R. A. Gutierrez, E. S. Short, G. Hoffman, A.L. Thomaz
Interactive reinforcement learning allows robots to learn both from exploring their environment and from human feedback. Robots can use one of these sources to confirm the performance of the other. Human feedback has limitations, however: human teachers may not be constantly available, or may give the robot incorrect feedback. In this work, we propose interactive reinforcement learning algorithms that take the presence or absence of human attention into account, or learn patterns of incorrect feedback to improve performance.
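As background for the policy shaping papers listed below, here is a minimal sketch of the basic policy shaping combination as it is commonly described in the literature: feedback counts are turned into a probability that each action is optimal, then multiplied with the agent's own action distribution. It does not model attention or learned error patterns, and the `consistency` parameter (the assumed probability that a piece of feedback is correct) and all values are hypothetical.

```python
from typing import Dict


def feedback_policy(delta: Dict[str, int], consistency: float) -> Dict[str, float]:
    """P(action is optimal | feedback), where delta[a] is the number of
    positive labels minus the number of negative labels for action a."""
    if not 0.0 < consistency < 1.0:
        raise ValueError("consistency must be strictly between 0 and 1")
    c = consistency
    return {a: c ** d / (c ** d + (1.0 - c) ** d) for a, d in delta.items()}


def combine(agent: Dict[str, float], feedback: Dict[str, float]) -> Dict[str, float]:
    """Multiply the two distributions action by action and renormalize."""
    product = {a: agent[a] * feedback[a] for a in agent}
    total = sum(product.values())
    return {a: p / total for a, p in product.items()}


# Hypothetical state: the agent has no preference, the teacher has given
# two net positive labels for "left" and none for "right".
agent = {"left": 0.5, "right": 0.5}
feedback = feedback_policy({"left": 2, "right": 0}, consistency=0.8)
print(combine(agent, feedback))
```

An action with no feedback gets a feedback probability of 0.5, so it neither helps nor hurts; this is one place where information about whether the teacher was paying attention could plausibly change the update.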
T. Kessler Faulkner, E. S. Short, A. L. Thomaz. Interactive Reinforcement Learning with Inaccurate Feedback. ICRA 2020.
T. Kessler Faulkner, R. A. Gutierrez, E. S. Short, and A.L. Thomaz, Policy Shaping with Supervisory Attention Driven Exploration. IROS 2018.
T. Kessler Faulkner, R. A. Gutierrez, E. S. Short, G. Hoffman, and A.L. Thomaz, "Active Attention-Modified Policy Shaping." AAMAS 2019.
Learning from Human Corrections
R.A. Gutierrez, V. Chu, A.L. Thomaz, S. Niekum
In realistic environments, fully specifying a task model such that a robot can perform a task in all situations is impractical. In this work, we present Incremental Task Modification via Corrective Demonstrations (ITMCD), a novel algorithm that allows a robot to update a learned model by making use of corrective demonstrations from an end-user in its environment.
R.A. Gutierrez, V. Chu, A.L. Thomaz and S. Niekum, "Incremental Task Modification via Corrective Demonstrations." ICRA 2018.
Embodied Active Learning Queries
M. Cakmak, A.L. Thomaz
Programming new skills on a robot should take minimal time and effort. One approach to achieve this goal is to allow the robot to ask questions (called Active Learning). In this work, we identify three types of questions (label, demonstration and feature queries) and show how a robot can use these "Embodied Queries" while learning new skills from demonstration.
M. Cakmak, "Guided teaching interactions with robots." PhD Thesis, Georgia Tech, 2012.
M. Cakmak and A.L. Thomaz, "Designing Robot Learners that Ask Good Questions." HRI 2012.
Keyframe-based Learning from Demonstration
B. Akgun, M. Cakmak, K. Jiang, and A.L. Thomaz
Kinesthetic teaching is an approach to LfD where a human physically guides a robot to perform a skill. In the common usage, the robot’s trajectory during a demonstration is recorded from start to end. We propose an alternative, keyframe demonstrations, in which the human provides a sparse set of consecutive keyframes that can be connected to perform the skill. We have presented a user study comparing the two approaches and highlighting their complementary nature. Building on this, we introduce a hybrid method that combines trajectories and keyframes in a single demonstration, and present a learning framework that can handle all three types of input.
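To make the notion of "connecting" keyframes concrete, the sketch below turns a sparse list of keyframes into a dense path by linear interpolation. Linear interpolation is an assumption chosen for brevity; the text above does not say how keyframes are connected, and the function name and the two-joint keyframes are hypothetical.

```python
from typing import List, Sequence


def interpolate_keyframes(
    keyframes: Sequence[Sequence[float]], steps: int
) -> List[List[float]]:
    """Return a path that passes through every keyframe, with `steps`
    equally spaced segments between each consecutive pair."""
    if len(keyframes) < 2 or steps < 1:
        raise ValueError("need at least two keyframes and steps >= 1")
    path = [list(keyframes[0])]
    for start, end in zip(keyframes, keyframes[1:]):
        for s in range(1, steps + 1):
            t = s / steps
            path.append([a + t * (b - a) for a, b in zip(start, end)])
    return path


# Three hypothetical keyframes for a two-joint arm.
keyframes = [[0.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
for pose in interpolate_keyframes(keyframes, steps=2):
    print(pose)
# [0.0, 0.0]
# [0.5, 1.0]
# [1.0, 2.0]
# [1.0, 3.0]
# [1.0, 4.0]
```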
B. Akgun, et al., "Trajectories and Keyframes for Kinesthetic Teaching: A Human-Robot Interaction Perspective." HRI 2012 -- Best paper nominee.
B. Akgun, et al., "Keyframe-based learning from demonstration." International Journal of Social Robotics, 2012.
Mixed-Initiative Active Learning for HRI
C. Chao, M. Cakmak, A.L. Thomaz
We are investigating some of the problems that arise when using active learning in the context of human–robot interaction (HRI). In experiments with human subjects we have explored three different versions of mixed-initiative active learning, and shown they are all preferable to passive supervised learning. But issues arise around balance of control, compliance with queries, and perceived utility of the questions.
M. Cakmak et al., "Designing Interactions for Robot Active Learners." in IEEE Transactions on Autonomous Mental Development, 2010.
C. Chao et al., "Transparent active learning for robots." HRI 2010.
Learning Task Goals from Demonstration
C. Chao, M. Cakmak, A.L. Thomaz
In this project a social robot learns task goals from human demonstrations without prior knowledge of high-level concepts. New concepts are grounded from low-level continuous sensor data through unsupervised learning, and task goals are subsequently learned using a Bayesian approach. These concepts can be used to transfer knowledge to future tasks, resulting in faster learning of those tasks.
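The Bayesian step can be pictured as maintaining a belief over candidate goals and updating it after each demonstration. The sketch below is a generic Bayes update over a discrete hypothesis set, not the model from the paper; the goal names and likelihood values are hypothetical.

```python
from typing import Dict


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Posterior over goal hypotheses after one demonstration, where
    likelihood[h] is P(observed demonstration | goal h)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("demonstration has zero likelihood under every hypothesis")
    return {h: v / total for h, v in unnormalized.items()}


# Two hypothetical candidate goals with a uniform prior.
belief = {"box_is_closed": 0.5, "box_is_red": 0.5}

# Hypothetical likelihoods of two demonstrations under each goal.
demonstrations = [
    {"box_is_closed": 0.9, "box_is_red": 0.3},
    {"box_is_closed": 0.8, "box_is_red": 0.2},
]
for likelihood in demonstrations:
    belief = bayes_update(belief, likelihood)

print(max(belief, key=belief.get))  # box_is_closed
```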
C. Chao et al., "Towards Grounding Concepts for Transfer in Goal Learning from Demonstration." ICDL 2011.
Learning about Objects from Humans and Self Exploration
V. Chu, T. Fitzgerald, B. Akgun, M. Cakmak, A.L. Thomaz
Our work focuses on robots to be deployed in human environments. These robots will need specialized object manipulation skills. A general learning task for a robot in a new environment is to learn about objects and what actions/effects they afford. To approach this, we look at ways that a human partner can intuitively help the robot learn (socially guided machine learning), leveraging end-users to efficiently learn the affordances (e.g. pull-able, open-able, push-able) of objects in their environment. This approach is promising because people naturally focus on showing salient aspects of the objects. We conducted experiments and made six observations characterizing how people approached teaching about objects. We showed that the robot successfully used transparency to mitigate errors. Furthermore, our work characterizes the benefits of self and supervised affordance learning and shows that a combined approach is the most efficient and successful.
V. Chu, A.L. Thomaz, "Analyzing Differences between Teachers when Learning Object Affordances via Guided-Exploration." IJRR 2017.
V. Chu, R.A. Gutierrez, S. Chernova, A.L. Thomaz. Real-time Multisensory Affordance-based Control for Adaptive Object Manipulation. ICRA 2019.
V. Chu, T. Fitzgerald, A.L. Thomaz, "Learning Object Affordances by Leveraging the Combination of Human-Guidance and Self-Exploration." HRI 2016 -- Nominated for Best Technical Advance in HRI Paper Award.
V. Chu, B. Akgun, and A.L. Thomaz. "Learning haptic affordances from demonstration and human-guided exploration." HAPTICS, 2016.
A.L. Thomaz and M. Cakmak, "Learning about objects with human teachers." HRI 2009.
Biologically Inspired Social Learning
M. Cakmak, N. DePalma, R.I. Arriaga, A.L. Thomaz
"Social" learning in robotics has focused on imitation learning, but we take a broader view and are interested in the multifaceted ways that a social partner can influence the learning process. We implement stimulus enhancement, emulation, mimicking and imitation on a robot, and illustrate the computational benefits of social learning over self exploration. Additionally, we characterize the differences between strategies, showing that the preferred strategy is dependent on the environment and the behavior of the social partner.
M. Cakmak et al., "Exploiting social partners in robot learning." Autonomous Robots, 2010.
M. Cakmak et al., "Computational benefits of social learning mechanisms: Stimulus enhancement and emulation." ICDL 2009 -- Best paper award.
A.L. Thomaz et al., "Effects of social exploration mechanisms on robot learning." RO-MAN 2009.
Webgames for Interactive Learning Agents
L. Cobo, K. Subramanian, P. Zang, C. Isbell, A.L. Thomaz
We are interested in machines that can learn from everyday people. To study this, we are building a suite of short computer games, with interactive learning agents. These serve as a testbed for experiments with various algorithms and interface techniques, looking at how to allow the average person to successfully teach machine learning agents.
L. Cobo et al., "Automatic task decomposition and state abstraction from demonstration." AAMAS 2012.
L. Cobo et al., "Automatic state abstraction from demonstration." IJCAI 2011.
P. Zang et al., "Batch versus Interactive LbD." ICDL 2010.
Sophie's Kitchen: Interactive Reinforcement Learning
A.L. Thomaz, C. Breazeal
Sophie's Kitchen is work from Prof. Thomaz' PhD thesis at MIT with Cynthia Breazeal. This is an environment to experiment with Interactive Reinforcement Learning. You can find out more about the Sophie project, and teach Sophie to bake a cake, at the Sophie's Kitchen demo page.
Natural Interaction Patterns for HRI
Algorithms for Effective Human-Robot Teamwork
M. L. Chang, T. Kessler Faulkner, E. S. Short, Z. H. Pope, R. A. Gutierrez, T. Wei, G. Anandaraman, P. Khante, A. L. Thomaz
Robotic teammates must be able to reason about the multi-dimensional aspects of an effective team. Motivated by the concept of shared cooperative activity, developed in prior work by Bratman, we propose algorithms that enable robots to become full-fledged team members. These algorithms allow robots to reason about mutual responsiveness, commitment to the joint activity, and commitment to mutual support to improve teamwork.
M. L. Chang, G. Trafton, J. M. McCurry, A. L. Thomaz. Unfair! Perceptions of Fairness in Human-Robot Teams. IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2021.
M. L. Chang, T. Kessler Faulkner, T. Wei, E. S. Short, G. Anandaraman, A. L. Thomaz. TASC: Teammate Algorithm for Shared Cooperation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.
M. L. Chang, Z. H. Pope, E. S. Short, A. L. Thomaz. Defining Fairness in Human-Robot Teams. IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), 2020.
M. L. Chang, R. A. Gutierrez, P. Khante, E. S. Short, and A.L. Thomaz. Effects of Integrated Intent Recognition and Communication on Human-Robot Collaboration. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
Contingency Detection for HRI
E.S. Short, M.L. Chang, V. Chu, K. Bullard, T. Fitzgerald, C. Chao, J. Lee, J.F. Kieser, M. Begum, A.F. Bobick, A.L. Thomaz
We are developing novel methods for detecting a contingent response by a human to the stimulus of a robot action. Contingency is defined as a change in an agent’s behavior within a specific time window in direct response to a signal from another agent; detection of such responses is essential to assess the willingness and interest of a human in interacting with the robot.
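The definition above can be turned into a very small test: compare a behavior signal just before the robot's action with the same signal inside the time window after it. The sketch below uses a mean-shift threshold on a single scalar signal, which is an illustrative simplification rather than the multimodal methods of the cited papers. The signal, window length, and threshold are hypothetical.

```python
from statistics import mean
from typing import List, Tuple


def is_contingent(
    samples: List[Tuple[float, float]],
    signal_time: float,
    window: float,
    threshold: float,
) -> bool:
    """samples holds (timestamp, value) pairs of some behavior measure.
    Report a contingent response if the mean value inside the window after
    the robot's signal differs from the mean just before it by at least
    `threshold`."""
    before = [v for t, v in samples if signal_time - window <= t < signal_time]
    after = [v for t, v in samples if signal_time <= t <= signal_time + window]
    if not before or not after:
        return False  # not enough data to decide
    return abs(mean(after) - mean(before)) >= threshold


# Hypothetical "motion energy" of a person; the robot waves at t = 3.
responsive = [(1, 0.1), (2, 0.1), (3, 0.9), (4, 0.8)]
unresponsive = [(1, 0.1), (2, 0.1), (3, 0.1), (4, 0.2)]

print(is_contingent(responsive, signal_time=3, window=2, threshold=0.5))    # True
print(is_contingent(unresponsive, signal_time=3, window=2, threshold=0.5))  # False
```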
E.S. Short, M.L. Chang, A. Thomaz. "Detecting Contingency for HRI in Open-World Environments." HRI, 2018.
V. Chu, K. Bullard, and A.L. Thomaz. "Multimodal Real-Time Contingency Detection for HRI." IROS, 2014.
J. Lee, et al., "Multi-cue Contingency Detection." International Journal of Social Robotics, 2012.
J. Lee, et al., "Vision-based Contingency Detection." HRI 2011.
Human Gaze Patterns for Human-Robot Interaction
A. Saran, S. Majumdar, E. S. Short, A.L. Thomaz, S. Niekum
Human gaze is known to be a strong indicator of underlying human intentions and goals during manipulation tasks. This work studies gaze patterns of human teachers demonstrating tasks to robots and proposes ways in which such patterns can be used to enhance robot learning. Using both kinesthetic teaching and video demonstrations, we identify novel intention-revealing gaze behaviors during teaching. These prove to be informative in a variety of problems ranging from reference frame inference to segmentation of multi-step tasks. Based on our findings, we propose two proof-of-concept algorithms which show that gaze data can enhance subtask classification for a multi-step task by up to 6% and reward inference and policy learning for a single-step task by up to 67%. Our findings provide a foundation for a model of natural human gaze in robot learning from demonstration settings and present open problems for utilizing human gaze to enhance robot learning.
A. Saran, S. Majumdar, E. S. Short, A. L. Thomaz, S. Niekum. Human Gaze Following for Human-Robot Interaction. IROS 2018.
A. Saran, E. S. Short, A. L. Thomaz, S. Niekum. Understanding Teacher Gaze Patterns for Robot Learning. CoRL 2019.
Multimodal Turn-taking for HRI
C. Chao, A. L. Thomaz
If we want robots to engage effectively with humans on a daily basis in service applications or in collaborative work scenarios, then it will become increasingly important for them to achieve the type of interaction fluency that comes naturally between humans. In this work we are developing an autonomous robot controller for multi-modal reciprocal turn-taking interactions, allowing a robot to better manage how it times its actions with a human partner.
C. Chao and A. L. Thomaz. "Timing in multimodal reciprocal interactions: control and analysis using timed Petri nets." Journal of Human-Robot Interaction, 2012.
C. Chao, A. L. Thomaz, "Turn-Taking for Human-Robot Interaction." AAAI Fall Symposium, 2010.
C. Chao et al., "Simon plays Simon says." RO-MAN 2011.
Life-like Robot Motion
M. Gielniak, C.K. Liu, A.L. Thomaz
We hypothesize that believable "human-like" motion increases communication, improves interaction, and advances task completion for social robots interacting with human partners. In this work we explore the interaction benefits gained when robots communicate with their partners in a familiar way: robot motion that is human-like. This has two concrete goals: (1) synthesize robot motion that is more human-like, and (2) add communication to benefit interaction.
One contribution of our research has been showing motor coordination (i.e. spatiotemporal correspondence) to be a metric for believable motion; we use this to develop a real-time, dynamic, autonomous motion algorithm, which systematically composes communicative signals to robot motion using minimal prior information.
Additionally we have introduced algorithms for three specific methods of communicating via motion (i.e. secondary motion, exaggeration, and anticipation).
M.J. Gielniak and A.L. Thomaz, "Anticipation in Robot Motion." RO-MAN 2011.
M.J. Gielniak, C.K. Liu and A.L. Thomaz, "Task-aware Variations in Robot Motion." ICRA 2011.
M.J. Gielniak and A.L. Thomaz, "Spatiotemporal Correspondence as a Metric for Human-like Robot Motion." HRI 2011 -- Best paper award.
M.J. Gielniak, C.K. Liu and A.L. Thomaz, "Stylized Motion Generalization Through Adaptation of Velocity Profiles." RO-MAN 2010.
M.J. Gielniak, C.K. Liu and A.L. Thomaz, "Secondary Action in Robot Motion." RO-MAN 2010.