### Abstract

Learning by imitation in humanoids is challenging because of the unpredictable environments these robots face during reproduction. Two sets of tools are relevant for this purpose: 1) probabilistic machine learning methods that can extract and exploit the regularities and important features of the task; and 2) dynamical systems that can cope with perturbations in real time without having to replan the whole movement. We present a learning-by-imitation approach that combines the benefits of both. It is based on a superposition of virtual spring-damper systems that drive a humanoid robot's movement. The method relies on a statistical description of the springs' attractor points acting in different candidate frames of reference. It extends dynamic movement primitives (DMP) models by formulating the estimation of the dynamical system's parameters as a Gaussian mixture regression problem with projection in different coordinate systems. The robot exploits local variability information extracted from multiple demonstrations of movements to determine which frames are relevant for the task, and how the movement should be modulated with respect to these frames. The approach is tested on the new prototype of the COMAN compliant humanoid with time-based and time-invariant movements, including bimanual coordination skills.

### Bibtex reference

@inproceedings{Calinon12Hum,
author="Calinon, S. and Li, Z. and Alizadeh, T. and Tsagarakis, N. G. and Caldwell, D. G.",
title="Statistical dynamical systems for skills acquisition in humanoids",
booktitle="Proc. {IEEE} Intl Conf. on Humanoid Robots ({H}umanoids)",
year="2012",
pages="323--329"
}

### Video

Imitation is not simply recording and replaying movements: the learned skills need to generalize to new situations. For example, if someone grasps a bottle of orange juice, shakes it and pours its contents into a glass, the robot should be able to reproduce the task even if the positions of the bottle and the glass differ from those in the demonstrations. The robot should also be able to shake the bottle even though its body does not have exactly the same shape and joint configuration as the demonstrator's (this is known as the correspondence, retargeting, or mapping problem).

In contrast to robots in standard industrial settings, humanoids and compliant robots can work in unpredictable environments and in close proximity to users. Two sets of tools are relevant for learning and reproducing skills in such environments:

• Probabilistic machine learning tools: they can extract and exploit the regularities and relevant characteristics of the task.
• Dynamical systems: they are able to cope with perturbations in real-time without having to replan the whole trajectory.

We study how to make these two sets of tools work together. The problem can be illustrated as follows. We assume that the motion of the robot is driven by a set of virtual springs, each connected to one of a set of candidate objects or body parts of the robot. The learning problem consists of estimating when and where to activate these springs. This can be learned from demonstrations by exploiting the invariant characteristics of the task (the parts of the movement that stay the same across the demonstrations). Consistent characteristics result in strong springs, and irrelevant ones result in soft springs.
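
The spring picture above can be illustrated with a minimal one-dimensional sketch. It is not the implementation used on the robot: the function names (`learn_springs`, `reproduce`), the rule "stiffness proportional to inverse variance", the unit mass, and all numerical values are assumptions chosen for illustration. Each candidate frame gets an attractor (the mean hand position observed relative to that frame) and a stiffness that grows when the demonstrations agree.

```python
from statistics import mean, pvariance

def learn_springs(demos, total_stiffness=100.0, min_var=1e-6):
    """demos maps a frame name to the hand positions observed relative to
    that frame (one value per demonstration). Returns, for each frame, the
    local attractor and a stiffness proportional to the inverse variance
    (an assumed rule, used here only for illustration)."""
    precision = {f: 1.0 / max(pvariance(xs), min_var) for f, xs in demos.items()}
    norm = sum(precision.values())
    return {f: (mean(xs), total_stiffness * precision[f] / norm)
            for f, xs in demos.items()}

def reproduce(springs, origins, x0=0.0, dt=0.01, steps=1000):
    """Integrate a unit-mass point pulled by all springs at once."""
    total_k = sum(k for _, k in springs.values())
    damping = 2.0 * total_k ** 0.5  # critical damping for the summed stiffness
    x, dx = x0, 0.0
    for _ in range(steps):
        # Each spring pulls towards its attractor expressed in world coordinates.
        force = sum(k * (origins[f] + mu - x) for f, (mu, k) in springs.items())
        dx += dt * (force - damping * dx)
        x += dt * dx
    return x

# Hypothetical demonstrations: consistent w.r.t. the other hand, scattered w.r.t. the object.
demos = {"other_hand": [0.10, 0.11, 0.09], "object": [0.5, -0.3, 0.1]}
springs = learn_springs(demos)
near = reproduce(springs, {"other_hand": 1.0, "object": 0.0})
far = reproduce(springs, {"other_hand": 1.0, "object": 1.0})
print(springs, near, far)
```

In this toy setting, moving the object by one metre barely changes the final hand position, because the spring attached to the object frame is much softer than the one attached to the other hand.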

What does the video show?

By using the same software, we can teach different skills to the robot. Demonstrations were recorded by placing visual markers on an object and on the hands of the user, tracked by the Optitrack system.

When the model is tested with hand clapping movements, the robot infers that bimanual coordination is required, namely that the position of one hand is regulated with respect to the other hand. The clapping motion is not perturbed when the object is moved around the robot, because the robot learned that the object is not relevant for this skill (i.e., the springs attached to the object frame are too weak to influence the movement).

When the model is tested with reaching movements, the robot learns that it should use one hand or the other depending on the position of the object. If the object is in the center, the robot can try to grasp it with both hands, following the behavior previously demonstrated by the user. The hand that is not used for tracking the object automatically comes back to a neutral pose.

### Source code

#### Usage

For the Matlab version, unzip the file and run 'demo1' in Matlab.
For the C++ version, unzip the file and follow the instructions in the ReadMe.txt file.

#### Reference

• Calinon, S., Li, Z., Alizadeh, T., Tsagarakis, N.G. and Caldwell, D.G. (2012) Statistical dynamical systems for skills acquisition in humanoids. Proc. of the IEEE Intl Conf. on Humanoid Robots (Humanoids).

#### Demo 1 - Simple example of DMP learning with GMR

Simple example of estimating the parameters of a DMP (dynamic movement primitive) through GMR (Gaussian mixture regression).
A DMP is composed of a virtual spring-damper system modulated by a non-linear force. The standard method to train a DMP is to predefine a set of activation functions and estimate a set of force components through a weighted least-squares (WLS) approach. The weighted sum of the force components forms a non-linear force that perturbs the system, moving it away from the linear point-to-point motion so that it follows the desired trajectory.
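
For reference, the standard WLS training described above can be sketched as follows. This is a simplified one-dimensional variant and not necessarily the formulation used in the demo: the gains `K`, `D`, `ALPHA`, the basis placement in `make_basis`, the synthetic minimum-jerk demonstration, and all function names are assumptions made for illustration. Each force component is the activation-weighted average of the target force (a local model of degree 0).

```python
import math

K, D, ALPHA = 100.0, 20.0, 3.0  # assumed spring, damper and decay gains

def phase(t):
    """Decay term s of the canonical system, s = exp(-ALPHA * t)."""
    return math.exp(-ALPHA * t)

def make_basis(n):
    """Predefined Gaussian activation functions, equally spaced in time."""
    centers = [phase(i / (n - 1)) for i in range(n)]
    widths = [centers[i] - centers[i + 1] for i in range(n - 1)]
    widths.append(widths[-1])
    return centers, widths

def activations(s, centers, widths):
    return [math.exp(-0.5 * ((s - c) / w) ** 2) for c, w in zip(centers, widths)]

def derivative(y, dt):
    """Finite differences (central inside, one-sided at both ends)."""
    out = []
    for i in range(len(y)):
        lo, hi = max(i - 1, 0), min(i + 1, len(y) - 1)
        out.append((y[hi] - y[lo]) / ((hi - lo) * dt))
    return out

def fit_wls(x, dt, centers, widths):
    """One force component per activation function (weighted average of the target force)."""
    goal, dx = x[-1], derivative(x, dt)
    ddx = derivative(dx, dt)
    num, den = [0.0] * len(centers), [1e-12] * len(centers)
    for i in range(len(x)):
        target = ddx[i] - K * (goal - x[i]) + D * dx[i]  # force the spring-damper lacks
        for j, h in enumerate(activations(phase(i * dt), centers, widths)):
            num[j] += h * target
            den[j] += h
    return [a / b for a, b in zip(num, den)]

def rollout(weights, x0, goal, dt, steps, centers, widths):
    """Integrate the spring-damper system perturbed by the learned force."""
    x, dx, path = x0, 0.0, [x0]
    for i in range(steps):
        h = activations(phase(i * dt), centers, widths)
        force = sum(hj * wj for hj, wj in zip(h, weights)) / (sum(h) + 1e-12)
        dx += dt * (K * (goal - x) - D * dx + force)
        x += dt * dx
        path.append(x)
    return path

def rms(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

dt = 0.001
demo = [10 * u**3 - 15 * u**4 + 6 * u**5 for u in (i * dt for i in range(1001))]
centers, widths = make_basis(10)
weights = fit_wls(demo, dt, centers, widths)
repro = rollout(weights, demo[0], demo[-1], dt, len(demo) - 1, centers, widths)
plain = rollout([0.0] * 10, demo[0], demo[-1], dt, len(demo) - 1, centers, widths)
print(rms(repro, demo), rms(plain, demo))
```

The last two rollouts compare the reproduction with and without the learned force; with all force components set to zero the system reduces to the point-to-point spring-damper motion.
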
GMR is used here to learn the joint distribution between the decay term s (determined by a canonical dynamical system) and the non-linear force variable to estimate.
Replacing WLS with GMR has the following advantages:

• It provides a probabilistic formulation of DMP (e.g., to allow the exploitation of correlation and variation information, and to make the DMP approach compatible with other statistical machine learning tools).
• It simultaneously learns the non-linear force together with the activation functions. Namely, the Gaussian kernels do not need to be equally spaced in time (or at predefined values of the decay term 's'), and the bandwidths (variances of the Gaussians) are automatically estimated from the data instead of being hand-tuned.
• It provides a more accurate approximation of the non-linear perturbing force by using locally linear models (degree 1) instead of locally constant ones (degree 0), exploiting the conditional probability properties of Gaussian distributions.
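
The points above can be made concrete with a small pure-Python sketch of GMR on the joint data (s, force). It is an illustration under stated assumptions, not the demo code: the toy force profile `sin(3 s)`, the number of states, the regularization term, the equal-count initialization, and all function names are hypothetical. A Gaussian mixture is fitted to the joint data with EM, and the force is then retrieved by conditioning each Gaussian on `s`, which yields one linear model per state.

```python
import math

def weighted_gaussian(points, weights, reg=1e-6):
    """Prior, mean and covariance (a, b, d) = [[a, b], [b, d]] of weighted (s, f) points."""
    total = sum(weights) + 1e-12
    mu_s = sum(w * s for w, (s, _) in zip(weights, points)) / total
    mu_f = sum(w * f for w, (_, f) in zip(weights, points)) / total
    a = sum(w * (s - mu_s) ** 2 for w, (s, _) in zip(weights, points)) / total + reg
    b = sum(w * (s - mu_s) * (f - mu_f) for w, (s, f) in zip(weights, points)) / total
    d = sum(w * (f - mu_f) ** 2 for w, (_, f) in zip(weights, points)) / total + reg
    return total / len(points), (mu_s, mu_f), (a, b, d)

def joint_pdf(point, mu, cov):
    """Density of a 2D Gaussian at point = (s, f)."""
    (a, b, d), u, v = cov, point[0] - mu[0], point[1] - mu[1]
    det = a * d - b * b
    maha = (d * u * u - 2.0 * b * u * v + a * v * v) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))

def fit_gmm(points, n_states, n_iter=30):
    """EM for a 2D Gaussian mixture, initialized with equal-count bins along s."""
    points = sorted(points)
    n = len(points)
    resp = [[1.0 if t * n_states // n == i else 0.0 for t in range(n)]
            for i in range(n_states)]
    for _ in range(n_iter):
        model = [weighted_gaussian(points, r) for r in resp]  # M-step
        for t, p in enumerate(points):                        # E-step
            lik = [prior * joint_pdf(p, mu, cov) for prior, mu, cov in model]
            norm = sum(lik) + 1e-300
            for i, value in enumerate(lik):
                resp[i][t] = value / norm
    return model

def gmr(s, model, linear=True):
    """Conditional mean of f given s. With linear=False each state contributes
    only its mean (degree 0); with linear=True it contributes a line (degree 1)."""
    # The constant 1/sqrt(2*pi) is dropped because the activations are normalized.
    h = [prior * math.exp(-0.5 * (s - mu[0]) ** 2 / cov[0]) / math.sqrt(cov[0])
         for prior, mu, cov in model]
    norm = sum(h) + 1e-300
    estimate = 0.0
    for hi, (_, mu, cov) in zip(h, model):
        slope = cov[1] / cov[0] if linear else 0.0
        estimate += hi / norm * (mu[1] + slope * (s - mu[0]))
    return estimate

# Toy data: decay term s = exp(-3 t) and a hypothetical force profile f(s) = sin(3 s).
data = [(math.exp(-3.0 * i / 199), math.sin(3.0 * math.exp(-3.0 * i / 199)))
        for i in range(200)]
model = fit_gmm(data, n_states=5)

def rms_error(linear):
    return math.sqrt(sum((gmr(s, model, linear) - f) ** 2 for s, f in data) / len(data))

err_degree1, err_degree0 = rms_error(True), rms_error(False)
print(err_degree1, err_degree0)
```

The `linear=False` branch is only a rough stand-in for degree-0 models, included so the two approximations can be compared on the same mixture; the centers and variances of the activation functions come out of EM rather than being fixed in advance.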