### Robot Learning & Interaction

Our work focuses on human-centered robotic applications in which robots can learn new skills by interacting with end-users. From a machine learning perspective, the challenge is to acquire skills from only a few demonstrations and interactions, with strong generalization demands. This requires intuitive active learning interfaces to acquire meaningful demonstrations, models that can efficiently exploit the structure and geometry of the acquired data, and adaptive control techniques that can exploit the learned task variations and coordination patterns.

The developed models typically need to serve several purposes (recognition, prediction, online synthesis), and be compatible with different learning strategies (imitation, emulation, incremental refinement or exploration). For the reproduction of skills, these models need to be enriched with force and impedance information to enable human-robot collaboration and to generate safe and natural movements.

These models and algorithms can be applied to a wide range of robotic applications, with robots that are either close to us (assistive robots in I-DRESS), parts of us (prosthetic hands in TACT-HAND), or far away from us (manipulation skills in deep water with DexROV). These projects are supported by the European Commission, the Swiss National Science Foundation, and the Swiss Innovation Agency.

**Research Topics:**

- Learning in a handful of trials
- Learning with structures
- A broader view of model predictive control
- Objects-centered models of movements and skills
- Gaussian mixture regression (GMR)
- Tensor-variate regression
- Geometry-aware learning and control

**Applications:**

- Assistive teleoperation of a bimanual underwater robot (DexROV)
- Personalized assistance in dressing (I-DRESS)
- Control of prosthetic hands from myography data (TACT-HAND)
- Intuitive user interfaces for the generation of natural movements
- Robot skills acquisition through active learning and social interaction strategies (ROSALIS)

### Research Topics

#### LEARNING IN A HANDFUL OF TRIALS

The impressive results of deep learning in vision and speech processing have substantially influenced research in robotics. Like several other researchers, we question this shift of attention. While some elements of deep learning can play key roles in robotics, we find it harmful for all research efforts to be spent in this direction.

In many robot applications, data generation and collection are interactive, which goes beyond the standard training/testing data paradigm. In this respect, many problems in robotics are closer to small data problems than big data problems. Instead of focusing only on techniques working for big data, robotics would likely benefit from techniques working with wide-ranging data. This includes models that can start learning from small datasets, and that are rich enough to exploit more data if such data become available during the robot's lifespan.

**A Wisdom Winter...**

With the spread of deep learning, small data problems are unjustifiably set aside, wrongly characterized as unimportant problems, or as relying on techniques that are not yet mature. Worse, some of the key challenges in robotics are sometimes artificially transformed into big data problems instead of the initial issues being solved.

**Reference:**

Chatzilygeroudis, K., Vassiliades, A., Stulp, F., Calinon, S. and Mouret, J.-B. (2018). **A Survey on Policy Search Algorithms for Learning Robot Controllers in a Handful of Trials**. arXiv:1807.02303.

#### LEARNING WITH STRUCTURES

**Nature vs Nurture: the shallowness of deep learning**

It is well known that animals rely on a combination of innate and learned mechanisms. We are born with highly structured brain connectivity, which is transmitted through the genome in the form of rules to wire up our brains. This combination is subtle: if our genome encoded all wirings explicitly, we would not be able to adapt to changing environments; if it did not pre-wire our brains, it would take us too long to adapt and we would not survive. Such structure allows us to acquire skills fast, while still enabling us to adapt to new environments (see Zador's 2019 article in Nature Communications for an excellent perspective on this).

When it comes to machine learning, current trends obscure this picture by attributing the successes and the spotlight to the learning algorithms instead of the underlying structures and representations. The field of deep learning (and in particular, deep reinforcement learning) tends to attribute the success of a skill acquisition problem to the learning of network parameters, most often overshadowing the fact that this success also comes in large part from the structure of the network and from a careful selection of the training procedure. Smart structures and representations exist, such as convolutional neural networks enabling patterns to be detected with shift-invariance properties, but they are often not brought forward. The preference is often to highlight the rather naive "learning from scratch" aspect, sometimes with erroneous claims (a famous example is AlphaGo Zero's publication in Nature).

The set of existing structures remains quite small compared to the number of tasks we would like our learning systems to solve. This is particularly true in robotics, due to the wide variety of problems a robot has to face in the real world. These structures are either too low-level or difficult to apply to existing network-based structures. Our work puts a strong emphasis on model structures that enable robots to learn skills from small datasets, for tasks demanding high generalization capability. We believe that such a setting currently characterizes a broad range of problems in robotics. Typically, we do not have the same formats and amounts of data available as in an image classification problem, and the data stream does not only cover perception, but often also includes highly structured planning and control pathways with robustness and safety guarantees, which greatly limits the extension of existing successful deep learning techniques to robotics.

Similarly to the joint role of innate and learned mechanisms, we believe that it is important to investigate which structures our robots need in order to learn from a small number of demonstrations and exploration trials, while being able to generalize within the range of variations required by the task to be useful in real-world environments. It means finding model structures allowing our robots to learn just what is needed for adaptation, but not more, because it would otherwise require too much data, which would be ineffective for real-world applications. It also means finding bidirectional interaction structures that allow skills to be transferred efficiently (including iterative learning, feedback or scaffolding the environment).

To devise these structures, we take inspiration from diverse research fields, including sensorimotor control, biomechanics, human motion science, Riemannian geometry and tensor methods. We adopt a biological perspective only at a very high level, by preferring structures that can specifically exploit current robot capabilities and the associated computational capabilities. Such structures (and associated structural rules) include object-centered movement primitives, model predictive control and geometrical aspects. A few of those are detailed below.

#### A BROADER VIEW OF MODEL PREDICTIVE CONTROL

Model predictive control (MPC) is ubiquitous in robot control, but the core formulation of this control problem and its associated algorithms can be extended to a wider range of problems, which has often been overlooked in robot learning. In particular, the simplest form of MPC (unconstrained and linear, with a homogeneous double integrator system) already has advantages for motion synthesis and planning problems, where it can be combined elegantly with probabilistic representations of movements.

This method allows the retrieval of smooth and natural trajectories analytically, by taking into account variation and coordination constraints. Instead of learning trajectories directly, the approach allows the learning of the underlying controllers to drive the robot. Namely, it learns to reject perturbations only in the directions that would affect task performance (minimal intervention control). This can typically be exploited with torque-controlled robots to regulate the tracking gain and compliance required to reproduce a task in an adaptive manner.

Interestingly, when combined with mixture models, this basic form of MPC also has links with Bézier curves and the use of dynamic features in generative models such as hidden Markov models (trajectory-HMM).
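
The simplest form mentioned above (unconstrained, linear, double integrator) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the group's released code: the function names, the horizon, the time step and the weights are all hypothetical choices. The precision matrix `Q` is nonzero only at the last time step, so the controller is free everywhere except where the task demands accuracy.

```python
import numpy as np

def double_integrator(dt):
    """1-D double integrator: state = [position, velocity], control = acceleration."""
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.5 * dt**2], [dt]])
    return A, B

def transfer_matrices(A, B, T):
    """Build Sx, Su such that [x_1; ...; x_T] = Sx @ x0 + Su @ [u_0; ...; u_{T-1}]."""
    n, m = B.shape
    Sx = np.zeros((T * n, n))
    Su = np.zeros((T * n, T * m))
    for t in range(T):
        Sx[t * n:(t + 1) * n] = np.linalg.matrix_power(A, t + 1)
        for k in range(t + 1):
            Su[t * n:(t + 1) * n, k * m:(k + 1) * m] = np.linalg.matrix_power(A, t - k) @ B
    return Sx, Su

def batch_tracking(x0, mu, Q, R, Sx, Su):
    """Minimize (x - mu)^T Q (x - mu) + u^T R u analytically (regularized least squares)."""
    u = np.linalg.solve(Su.T @ Q @ Su + R, Su.T @ Q @ (mu - Sx @ x0))
    return u, Sx @ x0 + Su @ u

# Hypothetical example: reach position 1 with zero velocity after 1 second.
T, dt = 100, 0.01
A, B = double_integrator(dt)
Sx, Su = transfer_matrices(A, B, T)
mu = np.zeros(2 * T)
mu[-2:] = [1.0, 0.0]            # target state at the last time step
Q = np.zeros((2 * T, 2 * T))
Q[-2:, -2:] = np.eye(2)         # precision only where the task requires it
R = 1e-6 * np.eye(T)            # small penalty on accelerations
u, x = batch_tracking(np.zeros(2), mu, Q, R, Sx, Su)
final_position, final_velocity = x[-2], x[-1]
```

In a learning setting, `mu` and `Q` would typically come from a probabilistic model of the demonstrations (means and precision matrices), so that directions with high observed variability are tracked with low gains.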

**References:**

Calinon, S. and Lee, D. (2018). **Learning Control**. Vadakkepat, P. and Goswami, A. (eds.). Humanoid Robotics: a Reference. Springer.

Calinon, S. (2016). **Stochastic learning and control in multiple coordinate systems**. Intl Workshop on Human-Friendly Robotics (HFR).

#### OBJECTS-CENTERED MODELS OF MOVEMENTS AND SKILLS

**One point of view does not show the whole picture...**

In many robotics applications, demonstrations or experiences are sparse. In such situations, it is important to get as much information as possible from each demonstration. We explore approaches encoding demonstrations from the perspective of multiple coordinate systems. This is achieved by providing a list of observers that could potentially be relevant for the movement or the task to transfer. A statistical learning approach is then used to determine the variability and coordination patterns in the movement by considering the different coordinate systems simultaneously, which then allows the orchestration of the different coordinate systems to reproduce the movement in new situations (typically, to adapt a movement to new positions of objects).

This approach provides better generalization capability than the conventional approach of associating task parameters (describing the situation/context) with policy parameters (describing the motion/skill), which would consist of modeling the two as a joint distribution, and then using regression to retrieve a new motion for a given new situation. The proposed task-parameterized models instead exploit the structure of the task parameters, which can in many robotics problems be expressed in the form of coordinate systems or local projections (including nullspace projection operators). It was shown that such an approach can provide extrapolation capability that could not be achieved by treating the problem as standard regression.
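
One way to picture the orchestration of coordinate systems is as a product of Gaussians: each frame contributes a local Gaussian, which is mapped to the global frame and fused with the others. The sketch below is a minimal illustration assuming each frame is described by a linear map `A` and an offset `b`; the function names and the toy numbers are hypothetical and only meant to show the mechanics.

```python
import numpy as np

def to_global(mu_local, sigma_local, A, b):
    """Express a Gaussian defined in a local frame {A, b} in the global frame."""
    return A @ mu_local + b, A @ sigma_local @ A.T

def product_of_gaussians(mus, sigmas):
    """Fuse Gaussians by summing precisions and precision-weighted means."""
    precisions = [np.linalg.inv(s) for s in sigmas]
    sigma = np.linalg.inv(sum(precisions))
    mu = sigma @ sum(P @ m for P, m in zip(precisions, mus))
    return mu, sigma

# Frame 1 (e.g., start pose): precise along x, loose along y.
mu1, sigma1 = to_global(np.array([1.0, 0.0]), np.diag([0.01, 1.0]),
                        np.eye(2), np.zeros(2))
# Frame 2 (e.g., an object placed at [3, 2]): loose along x, precise along y.
mu2, sigma2 = to_global(np.array([0.0, 0.0]), np.diag([1.0, 0.01]),
                        np.eye(2), np.array([3.0, 2.0]))

mu, sigma = product_of_gaussians([mu1, mu2], [sigma1, sigma2])
```

The fused mean follows frame 1 along x and frame 2 along y, because each frame dominates in the direction where it is precise. Moving the object (changing `b` for frame 2) shifts the result without retraining, which is the kind of adaptation to new object positions described above.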

**References:**

Calinon, S. (2016). **A Tutorial on Task-Parameterized Movement Learning and Retrieval**. Intelligent Service Robotics (Springer), 9:1, 1-29.

Calinon, S. (2015). **Robot learning with task-parameterized generative models**. Intl Symp. on Robotics Research (ISRR).

#### GAUSSIAN MIXTURE REGRESSION

Gaussian mixture regression (GMR) is a simple nonlinear regression technique that does not model the regression function directly, but instead first models the joint probability density of input-output data in the form of a Gaussian mixture model (GMM), which can for example be estimated by an expectation-maximization (EM) procedure.

Its computation relies on linear transformation and conditional distribution properties of multivariate normal distributions. GMR provides a fast regression approach in which multivariate output distributions can be computed in an online manner, with a computation time independent of the number of datapoints used to train the model, by exploiting the learned joint density model. In GMR, both input and output variables can be multivariate, and after learning, any subset of input-output dimensions can be selected for regression. This can for example be exploited to handle different sources of missing data, where expectations on the remaining dimensions can be computed as a multivariate distribution. These properties make GMR an attractive tool for robotics: it can be used in a wide range of problems, combined easily with other techniques, or used as a basis for new developments.
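
The conditioning step described above can be sketched as follows, assuming the GMM parameters (priors, means, covariances) have already been estimated, for example by EM. The function names and the toy parameters are illustrative assumptions. The output covariance is obtained by moment matching of the conditional mixture.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a multivariate normal distribution evaluated at x."""
    d = x - mu
    norm = np.sqrt((2.0 * np.pi) ** len(mu) * np.linalg.det(sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(sigma, d)) / norm

def gmr(x, priors, means, covs, in_idx, out_idx):
    """Return mean and covariance of p(output | input = x) under a joint GMM."""
    x = np.atleast_1d(x).astype(float)
    # Responsibility of each component for the observed input.
    h = np.array([p * gaussian_pdf(x, m[in_idx], c[np.ix_(in_idx, in_idx)])
                  for p, m, c in zip(priors, means, covs)])
    h /= h.sum()
    mu_out = np.zeros(len(out_idx))
    second_moment = np.zeros((len(out_idx), len(out_idx)))
    for hk, m, c in zip(h, means, covs):
        # Linear-Gaussian conditioning within each component.
        gain = c[np.ix_(out_idx, in_idx)] @ np.linalg.inv(c[np.ix_(in_idx, in_idx)])
        mu_k = m[out_idx] + gain @ (x - m[in_idx])
        sigma_k = c[np.ix_(out_idx, out_idx)] - gain @ c[np.ix_(in_idx, out_idx)]
        mu_out += hk * mu_k
        second_moment += hk * (sigma_k + np.outer(mu_k, mu_k))
    return mu_out, second_moment - np.outer(mu_out, mu_out)

# Toy 2-component GMM over the joint variable [x, y] (hypothetical parameters).
priors = np.array([0.5, 0.5])
means = [np.array([0.0, 0.0]), np.array([4.0, 2.0])]
covs = [np.array([[1.0, 0.8], [0.8, 1.0]]),
        np.array([[1.0, -0.5], [-0.5, 1.0]])]
y_mean, y_cov = gmr(1.0, priors, means, covs, in_idx=[0], out_idx=[1])
```

The cost of a query depends only on the number of components and dimensions, not on the size of the training set. Swapping `in_idx` and `out_idx` gives regression in the other direction from the same model. An input far from all components would make the responsibilities underflow; a practical implementation would compute them in the log domain.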

**References:**

Calinon, S. (2016). **A Tutorial on Task-Parameterized Movement Learning and Retrieval**. Intelligent Service Robotics (Springer), 9:1, 1-29.

Calinon, S. and Lee, D. (2018). **Learning Control**. Vadakkepat, P. and Goswami, A. (eds.). Humanoid Robotics: a Reference. Springer.

#### TENSOR-VARIATE REGRESSION

Sensory data in robotics are typically organized as multidimensional arrays (arrays of sensors, multiple channels, time evolution of data, multiple coordinate systems, etc.). This leads our group to investigate the field of tensor methods, also called multilinear algebra. Tensors are generalizations of matrices to arrays of higher dimensions, where vectors and matrices correspond to 1st and 2nd-order tensors. When data are organized in matrices or arrays of higher dimensions (tensors), classical regression methods first transform these data into vectors, therefore ignoring the underlying structure of the data and increasing the dimensionality of the problem. This flattening operation typically leads to overfitting when only a few training data are available.

We investigate the use of expert models (product of experts, mixture of experts) relying on tensorial representations. The goal is to devise models and algorithms that can take into account the underlying structure of the data, and that remain efficient even when only a few training data are available.
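
The cost of flattening can be made concrete with a small sketch. It compares a linear model on the vectorized input with a model whose weight matrix is constrained to rank `R`, a generic construction used in tensor regression. This is an illustration only, not the mixture-of-experts model of the reference below; the dimensions and names are hypothetical.

```python
import numpy as np

def predict_vectorized(w, X):
    """Linear model on the flattened input: one weight per array entry."""
    return w @ X.reshape(-1)

def predict_low_rank(U, V, X):
    """Linear model with weights W = sum_r u_r v_r^T, evaluated without forming W."""
    return np.einsum('ir,ij,jr->', U, X, V)

d1, d2, R = 16, 32, 2            # e.g., 16 sensors x 32 time steps, rank 2
rng = np.random.default_rng(0)
U = rng.standard_normal((d1, R))
V = rng.standard_normal((d2, R))
X = rng.standard_normal((d1, d2))

n_params_vectorized = d1 * d2    # 512 weights to estimate
n_params_low_rank = R * (d1 + d2)  # 96 weights to estimate

# Both models compute the same inner product <W, X> when W = U V^T.
y_low_rank = predict_low_rank(U, V, X)
y_vectorized = predict_vectorized((U @ V.T).reshape(-1), X)
```

With few training samples, estimating 96 parameters that respect the row/column structure is a much better-posed problem than estimating 512 unstructured ones.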

**Reference:**

Jaquier, N., Haschke, R. and Calinon, S. (2019). **Tensor-variate Mixture of Experts**. arXiv:1902.11104.

#### GEOMETRY-AWARE LEARNING AND CONTROL

The data encountered in robotics are characterized by simple but varied geometries, which are often underexploited when developing learning and control algorithms. Such data include joint angles in revolving articulations, rigid body motions, orientations represented as unit quaternions, sensory data processed as spatial covariance features, and other forms of symmetric positive definite matrices such as inertia or manipulability ellipsoids. Moreover, many applications require these data to be handled altogether.

We exploit Riemannian manifold techniques to extend algorithms initially developed for Euclidean data, by efficiently taking into account prior knowledge about these manifolds and by modeling joint distributions among these heterogeneous data. The use of these differential geometry techniques allows us to treat data of various forms in a unified manner (including data in standard Euclidean spaces). It can typically be used to revisit common optimization problems in robotics formulated in standard Euclidean spaces, by treating them as unconstrained problems inherently taking into account the geometry of the data.
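
As a minimal sketch of how a Euclidean algorithm is extended to a manifold, consider computing a mean on the unit sphere (unit quaternions live on the sphere in four dimensions). The Euclidean average is replaced by an iteration that maps the points to the tangent space with the logarithmic map, averages there, and maps back with the exponential map. Function names are illustrative; the sketch ignores antipodal points and the sign ambiguity of quaternions (q and -q encode the same orientation).

```python
import numpy as np

def exp_map(base, v):
    """Move from `base` on the unit sphere along the tangent vector `v`."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return base
    return np.cos(norm) * base + np.sin(norm) * v / norm

def log_map(base, x):
    """Tangent vector at `base` pointing toward `x`, with geodesic length."""
    cos_theta = np.clip(base @ x, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros_like(base)
    direction = x - cos_theta * base
    return theta * direction / np.linalg.norm(direction)

def frechet_mean(points, iters=20):
    """Iteratively average in the tangent space and project back to the sphere."""
    mean = points[0]
    for _ in range(iters):
        tangent = np.mean([log_map(mean, p) for p in points], axis=0)
        mean = exp_map(mean, tangent)
    return mean

# Two unit quaternions (w, x, y, z): identity and a 90-degree rotation about x.
q1 = np.array([1.0, 0.0, 0.0, 0.0])
q2 = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4), 0.0, 0.0])
q_mean = frechet_mean([q1, q2])
```

The result stays on the manifold by construction, so no renormalization or constraint handling is needed, which is the sense in which such problems can be treated as unconstrained.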

**References:**

Zeestraten, M.J.A., Havoutis, I., Silvério, J., Calinon, S. and Caldwell, D.G. (2017). **An Approach for Imitation Learning on Riemannian Manifolds**. IEEE Robotics and Automation Letters (RA-L), 2:3, 1240-1247.

Jaquier, N., Rozo, L., Caldwell, D.G. and Calinon, S. (2018). **Geometry-aware Tracking of Manipulability Ellipsoids**. In Proc. Robotics: Science and Systems (RSS).

### Applications

#### ASSISTIVE TELEOPERATION OF A BIMANUAL UNDERWATER ROBOT (DEXROV)

The DexROV project aims at controlling a bimanual underwater robot from a teleoperation center, with a user wearing an exoskeleton, controlling the robot in a virtual reality environment.

In DexROV, we develop approaches recasting teleoperation as collaborative human-robot teamwork. The combination of model predictive control and task-parameterized probabilistic models is explored as a way to cope with teleoperation with long transmission delays (up to a second), and/or the simultaneous teleoperation of many degrees of freedom.

In this approach, the model parameters are first transmitted to both the teleoperator side and the robot side. The task parameters on the robot side can then update the model of the skill at fast pace by local sensing, without requiring the transmission of this change to the teleoperator.

As an example, if the robot has observed that the task of drilling requires the drill to be perpendicular when it approaches a surface, the robot will then automatically orient the drill when it approaches another surface, letting the teleoperator concentrate on the position to drill by delegating the orientation tracking aspect to the robot. The robot will automatically react to perturbations transparently to the user (e.g., when the surface is reoriented). The aim is to reduce the cognitive load of the teleoperator for repetitive or well-structured tasks.

**Project website:** http://www.dexrov.eu/

**References:**

Havoutis, I. and Calinon, S. (2018). **Learning from demonstration for semi-autonomous teleoperation**. Autonomous Robots.

Havoutis, I. and Calinon, S. (2017). **Supervisory teleoperation with online learning and optimal control**. In Proc. of the IEEE Intl Conf. on Robotics and Automation (ICRA).

Tanwani, A.K. and Calinon, S. (2016). **Learning Robot Manipulation Tasks with Task-Parameterized Semi-Tied Hidden Semi-Markov Model**. IEEE Robotics and Automation Letters (RA-L), 1:1, 235-242.

#### PERSONALIZED ASSISTANCE IN DRESSING (I-DRESS)

The I-DRESS project aims at providing dressing assistance to persons with limited mobility. Two case studies are considered: putting on a jacket and putting on shoes.

In I-DRESS, we tackle the problem of transferring assistive skills to robots, by letting the robots acquire not only movements, but also force and compliance behaviors. We explore extensions of movement primitives frequently employed in robotics to a wider repertoire of skills composed of reactive, time-dependent and time-independent behaviors based on force, impedance, position and orientation information.

Assisting a person to dress is a challenging case study, in which the robot needs to adapt to different morphologies, pathologies or stages of recovery, with varied requirements for movement generation and physical interaction. Since this assistance is person-dependent, with preferences and requirements that can evolve with time, it cannot be pre-programmed. Dressing assistance is typically provided by healthcare workers, which is not always convenient. From the worker perspective, the activity takes time and is not particularly gratifying. From the patient perspective, such assistance is often viewed negatively because it reduces the sense of independence.

In this context, learning from demonstration can provide a solution to transfer dressing assistance skills to the robot. We explore whether this could be achieved by means of kinesthetic teaching, where demonstrations are used to let the robot acquire person-specific requirements and preferences.

**Project website:** https://www.i-dress-project.eu/

**References:**

Pignat, E. and Calinon, S. (2017). **Learning adaptive dressing assistance from human demonstration**. Robotics and Autonomous Systems, 93, 61-75.

Canal, G., Pignat, E., Alenya, G., Calinon, S. and Torras, C. (2018). **Joining high-level symbolic planning with low-level motion primitives in adaptive HRI: application to dressing assistance**. In Proc. of the IEEE Intl Conf. on Robotics and Automation (ICRA).

#### CONTROL OF PROSTHETIC HANDS FROM MYOGRAPHY DATA (TACT-HAND)

We apply Riemannian geometry approaches and tensor methods to the control of prosthetic hands by exploiting several sensor modalities. A tactile bracelet provides sensory data organized as a cylindrical grid, which are augmented with noisy surface electromyography (sEMG) signals typically processed as spatial covariance features. Instead of flattening or vectorizing the above data, we aim at treating the different sources statistically as a joint distribution, by keeping the original geometry and structure of the data.
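
To make the notion of spatial covariance features concrete, the sketch below computes the channel covariance of a signal window, which is a symmetric positive definite (SPD) matrix, and compares two such matrices with the affine-invariant Riemannian distance. That metric is one common choice for SPD matrices and is an assumption here; the random windows merely stand in for real sEMG recordings, and all names are hypothetical.

```python
import numpy as np

def spatial_covariance(window):
    """Map a (n_samples, n_channels) window to a (n_channels, n_channels) SPD matrix."""
    centered = window - window.mean(axis=0)
    cov = centered.T @ centered / (len(window) - 1)
    return cov + 1e-9 * np.eye(cov.shape[0])   # small regularizer keeps it SPD

def spd_power(S, p):
    """Matrix power of an SPD matrix through its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T

def spd_log(S):
    """Matrix logarithm of an SPD matrix through its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def affine_invariant_distance(A, B):
    """Geodesic distance between SPD matrices under the affine-invariant metric."""
    A_inv_sqrt = spd_power(A, -0.5)
    return np.linalg.norm(spd_log(A_inv_sqrt @ B @ A_inv_sqrt), 'fro')

rng = np.random.default_rng(0)
window_a = rng.standard_normal((200, 4))        # stand-in for a 4-channel window
window_b = 2.0 * rng.standard_normal((200, 4))  # same channels, larger amplitude
C_a = spatial_covariance(window_a)
C_b = spatial_covariance(window_b)
d_ab = affine_invariant_distance(C_a, C_b)
```

This distance is unchanged when both matrices undergo the same invertible linear mixing of the channels, a property that the Euclidean distance between flattened covariance matrices does not have.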

**Project website:** http://www.idiap.ch/project/tact-hand/

**References:**

Jaquier, N. and Calinon, S. (2017). **Gaussian Mixture Regression on Symmetric Positive Definite Matrices Manifolds: Application to Wrist Motion Estimation with sEMG**. In Proc. of the IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS).

Jaquier, N., Castellini, C. and Calinon, S. (2017). **Improving Hand and Wrist Activity Detection Using Tactile Sensors and Tensor Regression Methods on Riemannian Manifolds**. Myoelectric Controls Symposium (MEC).

#### INTUITIVE USER INTERFACES FOR THE GENERATION OF NATURAL MOVEMENTS

Many computer-aided design applications require the generation of continuous traces, where the most common interface is to edit the control points of some form of interpolating spline. In this research, conducted in collaboration with Daniel Berio and Frederic Fol Leymarie at Goldsmiths, University of London, we study how tools from computational motor control can be used to extend such methods to encapsulate movement dynamics, precision and coordination.

Our approach relies on a probabilistic formulation of model predictive control, by providing users with a simple interactive interface to generate multiple movements and traces at once by defining a distribution of trajectories rather than a single one. A dynamical system is then used to generate natural-looking curves stochastically.

The method is applied to the generation of traces and patterns that are similar to the ones that can be seen in art forms such as calligraphy and graffiti.

**Reference:**

Berio, D., Calinon, S. and Fol Leymarie, F. (2017). **Generating Calligraphic Trajectories with Model Predictive Control**. In Proc. of the 43rd Conf. on Graphics Interface.

#### ROBOT SKILLS ACQUISITION THROUGH ACTIVE LEARNING AND SOCIAL INTERACTION STRATEGIES (ROSALIS)

Most efforts in robot learning from demonstration are turned toward developing algorithms for the acquisition of specific skills from training data. While such developments are important, they often do not take into account the social structure of the process, in particular, that the interaction with the user and the selection of the different interaction steps can directly influence the quality of the collected data.

Similarly, skills acquisition encompasses a wide range of social and self-refinement learning strategies, including mimicking (without understanding the objective), goal-level emulation (discovering the objectives by discarding the specific way in which a task is achieved), and exploration with self-assessed rewards or feedback from the users. Each of these strategies requires the design of dedicated algorithms, but the ways in which they can be organized have been overlooked so far. We address this challenge in the ROSALIS project, by exploiting social interaction to transfer skills efficiently to robots.