Robot Learning & Interaction

Our work focuses on human-centered robotic applications in which robots can learn new skills by interacting with end-users. From a machine learning perspective, the challenge is to acquire skills from only a few demonstrations and interactions, under strong generalization demands. This requires the development of intuitive active learning interfaces to acquire meaningful demonstrations, of models that can exploit the structure and geometry of the acquired data in an efficient way, and of adaptive control techniques that can exploit the learned task variations and coordination patterns.

The developed models typically need to serve several purposes (recognition, prediction, online synthesis), and be compatible with different learning strategies (imitation, emulation, incremental refinement or exploration). For the reproduction of skills, these models need to be enriched with force and impedance information to enable human-robot collaboration and to generate safe and natural movements.

These models and algorithms can be applied to a wide range of robotic applications, with robots that are either close to us (assistive robots in I-DRESS), parts of us (prosthetic hands in TACT-HAND), or far away from us (manipulation skills in deep water with DexROV). These projects are supported by the European Commission, the Swiss National Science Foundation, and the Swiss Innovation Agency.


Research Topics



LEARNING IN A HANDFUL OF TRIALS


The impressive results of deep learning in vision and speech processing have substantially influenced research in robotics. Like several other researchers, we question this shift of attention. While some elements of deep learning can play key roles in robotics, we consider it harmful when all research efforts are channeled in this direction.

In many robot applications, data generation and collection are interactive processes that go beyond the standard training/testing data paradigm. In this respect, many problems in robotics are closer to small data problems than big data problems. Instead of focusing only on techniques that work for big data, robotics would likely benefit from techniques that work with wide-ranging data. This includes models that can start learning from small datasets, yet are rich enough to exploit more data if such data become available during the robot's lifespan.

A Wisdom Winter...

With the spread of deep learning, small data problems are unjustifiably set aside, wrongly characterized as unimportant, or dismissed on the grounds that the associated techniques are not yet mature. Worse, some of the key challenges in robotics are sometimes artificially transformed into big data problems instead of addressing the initial issues.

Reference:

Chatzilygeroudis, K., Vassiliades, V., Stulp, F., Calinon, S. and Mouret, J.-B. (2019). A Survey on Policy Search Algorithms for Learning Robot Controllers in a Handful of Trials. IEEE Trans. on Robotics. info pdf



LEARNING WITH STRUCTURES


Nature vs Nurture: the shallowness of deep learning

It is well known that animals rely on a combination of innate and learned mechanisms. We are born with highly structured brain connectivity, transmitted through a genome in the form of rules to wire up our brains. This combination is subtle: if our genome encoded all wiring explicitly, we would not be able to adapt to changing environments; if it did not pre-wire our brains, adaptation would take too long and we would not survive. Such structure allows us to acquire skills fast, while still enabling us to adapt to new environments (see Zador's 2019 article in Nature Communications for an excellent perspective on this).

When it comes to machine learning, current trends blur this picture by attributing the successes and the spotlight to the learning algorithms instead of the underlying structures and representations. The field of deep learning (and in particular, deep reinforcement learning) tends to attribute the success of a skill acquisition problem to the learning of network parameters, most often overlooking that this success also comes in large part from the structure of the network and from a careful selection of the training procedure. Smart structures and representations exist, such as convolutional neural networks enabling patterns to be detected with shift-invariance properties, but they are often not brought forward. The preference is often to highlight the rather naive "learning from scratch" aspect, sometimes with erroneous claims (a famous example is AlphaGo Zero's publication in Nature).

Existing structures remain quite limited compared to the number of tasks we would like our learning systems to solve. This is particularly true in robotics, due to the high variety of problems a robot has to face in the real world. These structures are either too low-level or difficult to incorporate into existing network-based architectures. Our work puts a strong emphasis on model structures that enable robots to learn skills from small datasets, for tasks demanding high generalization capability. We believe that this setting currently characterizes a broad range of problems in robotics. Typically, we do not have datasets of the format and size available in an image classification problem, and the data stream not only covers perception, but often includes highly structured planning and control pathways with robustness and safety guarantees, which greatly limits the extension of existing successful deep learning techniques to robotics.

Similarly to the joint role of innate and learned mechanisms, we believe that it is important to investigate which structures our robots need in order to learn from a small number of demonstrations and exploration trials, while generalizing within the range of variations required for the task to be useful in real-world environments. It means finding model structures that allow our robots to learn just what is needed for adaptation, but not more, since learning more would require too much data and be ineffective for real-world applications. It also means finding bidirectional interaction structures that allow skills to be transferred efficiently (including iterative learning, feedback, or scaffolding the environment).

To devise these structures, we take inspiration from diverse research fields, including sensorimotor control, biomechanics, human motion science, Riemannian geometry and tensor methods. We adopt a biological perspective only at a very high level, preferring structures that can specifically exploit current robot capabilities and the associated computation capabilities. Such structures (and associated structural rules) include object-centered movement primitives, model predictive control and geometrical aspects. A few of those are detailed below.



A BROADER VIEW OF MODEL PREDICTIVE CONTROL


Model predictive control (MPC) is ubiquitous in robot control, but the core formulation of this control problem and its associated algorithms can be extended to a wider range of problems, which has often been overlooked in robot learning. In particular, the simplest form of MPC (unconstrained and linear, with a homogeneous double integrator system) already has advantages for motion synthesis and planning problems, where it can be combined elegantly with probabilistic representations of movements.

This method allows smooth and natural trajectories to be retrieved analytically, by taking into account variation and coordination constraints. Instead of learning trajectories directly, the approach learns the underlying controllers driving the robot. Namely, it learns to reject perturbations only in the directions that would affect task performance (minimal intervention control). This can typically be exploited with torque-controlled robots to regulate the tracking gains and compliance required to reproduce a task in an adaptive manner.

Interestingly, when combined with mixture models, this basic form of MPC also has links with Bézier curves and the use of dynamic features in generative models such as hidden Markov models (trajectory-HMM).
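As a minimal illustration of this batch formulation, the sketch below (in MATLAB, with our own variable names rather than the exact pbdlib API) computes the analytical MPC solution for a 1D double integrator reaching a target at the end of the horizon, by expressing the stacked states as a linear function of the stacked control commands:

```matlab
% Minimal batch MPC sketch with a 1D double integrator: reach a target
% at the end of the horizon (see demo_batchLQR01.m for the full version).
dt = 1E-2; T = 100;                    % time step and horizon length
A = [1 dt; 0 1]; B = [dt^2/2; dt];     % discrete double integrator
Sx = zeros(2*T, 2); Su = zeros(2*T, T);
for t = 1:T                            % x_t = A^t x0 + sum_s A^(t-s) B u_(s-1)
    Sx(2*t-1:2*t, :) = A^t;
    for s = 1:t
        Su(2*t-1:2*t, s) = A^(t-s) * B;
    end
end
x0 = [0; 0];                           % initial position and velocity
MuQ = zeros(2*T, 1); MuQ(end-1) = 1;   % target: position 1 at the final step
Q = zeros(2*T); Q(end-1:end, end-1:end) = eye(2) * 1E3;  % weight final state only
R = eye(T) * 1E-2;                     % control cost
u = (Su'*Q*Su + R) \ (Su'*Q*(MuQ - Sx*x0));  % analytical solution of the MPC problem
x = reshape(Sx*x0 + Su*u, 2, T);
plot((1:T)*dt, x(1,:));                % smooth reach toward the target
```

Replacing the single final target with stepwise references and full precision matrices learned from demonstrations yields the minimal intervention behavior described above, with the same analytical solution.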

References:

Calinon, S. and Lee, D. (2018). Learning Control. Vadakkepat, P. and Goswami, A. (eds.). Humanoid Robotics: a Reference. Springer. info pdf

Calinon, S. (2016). Stochastic learning and control in multiple coordinate systems. Intl Workshop on Human-Friendly Robotics (HFR). info pdf

Code examples:

demo_batchLQR01.m from pbdlib-matlab

demo_MPC_batch01.cpp from pbdlib-cpp

Standard MPC with the objective of reaching a target at the end of the movement.
MPC combined with a statistical representation of the task to achieve. An MPC problem is typically composed of a tracking cost and a control cost (top part of the image), which is minimized to find a set of control commands or a set of tracking gains. The weighting terms and the target references in the cost function can be learned from demonstrations, with a compact generative model of the task (bottom part of the image). Solving the MPC problem with a double integrator system (center of the image) results in an analytical solution that is fast to compute, corresponding to a controller smoothly tracking the stepwise reference generated by the model, anticipating the step changes and modulating the tracking gains in accordance with the extracted precision and coordination patterns.
Hidden semi-Markov model (HSMM) combined with model predictive control (MPC) for the learning and synthesis of movements.


OBJECT-CENTERED MODELS OF MOVEMENTS AND SKILLS


Statistical learning in multiple coordinate systems, and retrieval in a new situation.
Task-parameterized Gaussian mixture model (TP-GMM) combined with model predictive control (MPC).
The proposed task-parameterized approach can generalize the five demonstrations (in semi-transparent color) to new situations, by providing a trajectory distribution adapted to the situation (new position and orientation of the two objects).
The solution of MPC with a quadratic cost corresponds to a product of Gaussian distributions over a vector representation of the control commands and states (i.e., stacking the control commands and states for each time step of the time horizon). The solution has a direct interpretation as a fusion of controllers, with the control cost corresponding to a Gaussian centered on 0 depicted in blue, and the tracking costs for the different coordinate systems depicted in green. The solution of the problem is given by the gray Gaussian. In MPC, the center of this Gaussian would typically be used as the solution. Our work explores the use of the retrieved covariance, which contains additional information about the solution. A distribution over the control commands provides a mechanism to adapt to other constraints or to generate stochastic reproduction attempts.

One point of view does not show the whole picture...

In many robotics applications, demonstrations or experiences are sparse. In such situations, it is important to extract as much information as possible from each demonstration. We explore approaches encoding demonstrations from the perspective of multiple coordinate systems. This is achieved by providing a list of observers that could potentially be relevant for the movement or the task to transfer. A statistical learning approach is then used to determine the variability and coordination patterns in the movement by considering the different coordinate systems simultaneously, which then allows the different coordinate systems to be orchestrated to reproduce the movement in new situations (typically, to adapt a movement to new positions of objects).

This approach provides better generalization capability than the conventional approach of associating task parameters (describing the situation/context) with policy parameters (describing the motion/skill), which consists of modeling the two as a joint distribution and then using regression to retrieve a new motion for a given new situation. The proposed task-parameterized models instead exploit the structure of the task parameters, which in many robotics problems can be expressed in the form of coordinate systems or local projections (including nullspace projection operators). It has been shown that such an approach can provide extrapolation capabilities that cannot be achieved by treating the problem as standard regression (see the sketch of the reproduction step below).
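The reproduction step can be sketched compactly. Assuming each coordinate system j is described by an affine transformation (A{j}, b{j}) observed in the new situation and carries a Gaussian (Mu(:,j), Sigma(:,:,j)) learned from demonstrations, the adapted distribution is a product of linearly transformed Gaussians. The MATLAB sketch below uses our own variable names, with toy values standing in for learned quantities:

```matlab
% Minimal sketch of the reproduction step in a task-parameterized model:
% Gaussians learned in P coordinate systems are mapped to the global frame
% and fused as a product of Gaussians.
P = 2;                                           % number of coordinate systems
A = {eye(2), [0 -1; 1 0]}; b = {[0; 0], [2; 1]}; % new situation: frame poses
Mu = [1 0.5; 0.2 0.3];                           % per-frame Gaussian centers (2xP)
Sigma = cat(3, diag([0.01 1]), diag([1 0.02]));  % per-frame covariances
Lambda = zeros(2); eta = zeros(2, 1);
for j = 1:P
    MuG = A{j} * Mu(:,j) + b{j};                 % center mapped to global frame
    SigmaG = A{j} * Sigma(:,:,j) * A{j}';        % covariance mapped to global frame
    Lambda = Lambda + inv(SigmaG);               % accumulate precision matrices
    eta = eta + SigmaG \ MuG;
end
SigmaP = inv(Lambda); MuP = Lambda \ eta;        % fused Gaussian for this situation
```

A coordinate system with low variance along some direction dominates the fused estimate in that direction, which is how the relevance of each observer is orchestrated along the movement.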

References:

Calinon, S. (2016). A Tutorial on Task-Parameterized Movement Learning and Retrieval. Intelligent Service Robotics (Springer), 9:1, 1-29. info pdf

Calinon, S. (2015). Robot learning with task-parameterized generative models. Intl Symp. on Robotics Research (ISRR). info pdf

Code examples:

demo_TPbatchLQR01.m from pbdlib-matlab

demo_TPbatchLQR01.cpp from pbdlib-cpp

The conventional method would be to treat the problem as regression. It consists of jointly learning the movement parameters and the task parameters, corresponding to the position and orientation of the objects (left), and then retrieving a movement based on new task parameters used as inputs of a regression problem. Such an approach is generic, since the task parameters can take any form, and it usually interpolates well (top-left). However, it fails to extrapolate to situations that are further away (bottom-left). It is for this reason that we prefer a task-parameterized approach exploiting candidate coordinate systems, providing extrapolation capability in problems that can be structured in the form of coordinate systems, which is most often the case in robotics.


GAUSSIAN MIXTURE REGRESSION


Gaussian estimate of a mixture of Gaussians (law of total covariance). The three red distributions depict the density functions of three Gaussians in a GMM weighted by their respective priors. The red dashed line depicts the density function of the resulting sum. The green distribution represents the density function as a single Gaussian estimate of this mixture of Gaussians.
Gaussian mixture regression (GMR) with a 2D example (1D input, 1D output), based on a mixture of 2 Gaussians representing the joint distribution of the data.

Gaussian mixture regression (GMR) is a simple nonlinear regression technique that does not model the regression function directly, but instead first models the joint probability density of input-output data in the form of a Gaussian mixture model (GMM), which can for example be estimated by an expectation-maximization (EM) procedure.

Its computation relies on the linear transformation and conditioning properties of multivariate normal distributions. GMR provides a fast regression approach in which multivariate output distributions can be computed in an online manner, with a computation time independent of the number of datapoints used to train the model, by exploiting the learned joint density. In GMR, both input and output variables can be multivariate, and after learning, any subset of input-output dimensions can be selected for regression. This can for example be exploited to handle different sources of missing data, where expectations on the remaining dimensions can be computed as a multivariate distribution. These properties make GMR an attractive tool for robotics: it can be used in a wide range of problems, combined fluently with other techniques, or serve as a basis for new developments.
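The conditioning step can be written in a few lines. The following MATLAB function file is a minimal sketch with our own notation (see demo_GMR01.m in pbdlib-matlab for the full version): it conditions a joint GMM with parameters (Priors, Mu, Sigma) on the input dimensions listed in 'in' to obtain a Gaussian estimate of the output dimensions listed in 'out'.

```matlab
function [MuOut, SigmaOut] = gmr_sketch(Priors, Mu, Sigma, xIn, in, out)
% Minimal GMR sketch: condition a joint GMM on the input dimensions 'in'
% to get a Gaussian estimate on the output dimensions 'out'.
K = length(Priors);
h = zeros(K, 1);
for k = 1:K                                   % responsibility of each component
    h(k) = Priors(k) * gaussPDF(xIn, Mu(in,k), Sigma(in,in,k));
end
h = h / sum(h);
nbOut = length(out);
MuOut = zeros(nbOut, 1); SigmaOut = zeros(nbOut);
MuTmp = zeros(nbOut, K);
for k = 1:K                                   % conditional of each component
    MuTmp(:,k) = Mu(out,k) + Sigma(out,in,k) / Sigma(in,in,k) * (xIn - Mu(in,k));
    SigmaTmp = Sigma(out,out,k) - Sigma(out,in,k) / Sigma(in,in,k) * Sigma(in,out,k);
    MuOut = MuOut + h(k) * MuTmp(:,k);
    SigmaOut = SigmaOut + h(k) * (SigmaTmp + MuTmp(:,k) * MuTmp(:,k)');
end
SigmaOut = SigmaOut - MuOut * MuOut';         % law of total covariance
end

function p = gaussPDF(x, Mu, Sigma)           % multivariate normal density
d = x - Mu;
p = exp(-0.5 * d' / Sigma * d) / sqrt((2*pi)^length(Mu) * det(Sigma));
end
```

The final subtraction implements the law of total covariance, the same property illustrated in the figure above when summarizing a mixture of Gaussians by a single Gaussian.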

References:

Calinon, S. (2016). A Tutorial on Task-Parameterized Movement Learning and Retrieval. Intelligent Service Robotics (Springer), 9:1, 1-29. info pdf

Calinon, S. and Lee, D. (2018). Learning Control. Vadakkepat, P. and Goswami, A. (eds.). Humanoid Robotics: a Reference. Springer. info pdf

Code examples:

demo_GMR01.m in pbdlib-matlab

demo_GMR01.cpp in pbdlib-cpp

Gaussian conditioning in standard GMR (left) can directly be extended to conditional distribution with uncertainty on the input (right).


TENSOR-VARIATE REGRESSION


Sensory data in robotics are typically organized as multidimensional arrays (arrays of sensors, multiple channels, time evolution of data, multiple coordinate systems, etc.). This leads our group to investigate the field of tensor methods, also called multilinear algebra. Tensors are generalizations of matrices to arrays of higher dimensions, where vectors and matrices correspond to 1st-order and 2nd-order tensors. When data are organized in matrices or higher-order arrays (tensors), classical regression methods first flatten these data into vectors, thereby ignoring the underlying structure of the data and increasing the dimensionality of the problem. This flattening operation typically leads to overfitting when only few training data are available.

We investigate the use of expert models (products of experts, mixtures of experts) relying on tensorial representations. The goal is to devise models and algorithms that take into account the underlying structure of the data, and that remain efficient even when only few training data are available. A toy sketch of the underlying idea is given below.
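As a toy sketch of this idea (restricted here to matrix-variate inputs and a rank-1 weight, which is a simplification of the models in the reference below; all variable names are ours), a regression weight W = u*v' can be fitted by alternating ridge solves, requiring D1+D2 parameters instead of the D1*D2 parameters of the flattened problem:

```matlab
% Toy sketch of tensor (here matrix-variate) ridge regression with a
% rank-1 weight W = u*v', fitted by alternating least squares.
D1 = 8; D2 = 6; N = 40; lambda = 1E-2;
uTrue = randn(D1, 1); vTrue = randn(D2, 1);       % ground-truth rank-1 weight
X = randn(D1, D2, N); y = zeros(N, 1);
for n = 1:N, y(n) = uTrue' * X(:,:,n) * vTrue + 1E-2 * randn; end
u = randn(D1, 1); v = randn(D2, 1);               % random initialization
for it = 1:50
    Zu = zeros(N, D1); Zv = zeros(N, D2);
    for n = 1:N, Zu(n,:) = (X(:,:,n) * v)'; end   % fix v, ridge solve for u
    u = (Zu'*Zu + lambda*eye(D1)) \ (Zu'*y);
    for n = 1:N, Zv(n,:) = u' * X(:,:,n); end     % fix u, ridge solve for v
    v = (Zv'*Zv + lambda*eye(D2)) \ (Zv'*y);
end
yHat = zeros(N, 1);
for n = 1:N, yHat(n) = u' * X(:,:,n) * v; end
disp(norm(y - yHat) / norm(y));                   % relative training error
```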

Reference:

Jaquier, N., Haschke, R. and Calinon, S. (2019). Tensor-variate Mixture of Experts. arXiv:1902.11104. info pdf

Extension of ridge regression and logistic regression to tensor-variate data, where $\circ$ are outer products and $\langle\cdot,\cdot\rangle$ are inner products.
Tensor-variate mixture of experts, with tensor regression as experts and tensor logistic regression as gating functions.


GEOMETRY-AWARE LEARNING AND CONTROL


The data encountered in robotics are characterized by simple but varied geometries, which are often underexploited when developing learning and control algorithms. Such data include joint angles of revolute articulations, rigid body motions, orientations represented as unit quaternions, sensory data processed as spatial covariance features, and other forms of symmetric positive definite matrices such as inertia or manipulability ellipsoids. Moreover, many applications require these data to be handled altogether.

We exploit Riemannian manifold techniques to extend algorithms initially developed for Euclidean data, by efficiently taking into account prior knowledge about these manifolds and by modeling joint distributions among these heterogeneous data. The use of these differential geometry techniques allows us to treat data of various forms in a unified manner (including data in standard Euclidean spaces). It can typically be used to revisit common optimization problems in robotics formulated in standard Euclidean spaces, treating them as unconstrained problems that inherently take into account the geometry of the data.
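Concretely, on the sphere $\mathcal{S}^2$, the exponential and logarithmic maps take simple closed forms, and iterating between the manifold and the tangent space yields the geodesic (Karcher) mean that serves as the center of a Gaussian on the manifold. The MATLAB sketch below is our own minimal version (see demo_Riemannian_sphere_GMM01.m for the full implementation):

```matlab
function demo_sphere_mean_sketch
% Minimal sketch of exponential/logarithmic maps on the sphere S^2,
% used to compute a geodesic (Karcher) mean of a set of points.
X = randn(3, 20); X = X ./ sqrt(sum(X.^2, 1));   % random points on S^2
m = X(:,1);                                      % initialize mean at first point
for it = 1:20
    U = zeros(3, size(X, 2));
    for n = 1:size(X, 2)
        U(:,n) = logmap(X(:,n), m);              % map points to tangent space of m
    end
    u = mean(U, 2);                              % Euclidean mean in tangent space
    m = expmap(u, m);                            % map back onto the manifold
    if norm(u) < 1E-6, break; end                % converged
end
disp(m');
end

function x = expmap(u, mu)                       % Exp_mu: tangent space -> S^2
th = norm(u);
if th < 1E-12, x = mu; else, x = mu*cos(th) + u/th*sin(th); end
end

function u = logmap(x, mu)                       % Log_mu: S^2 -> tangent space
th = acos(max(min(mu'*x, 1), -1));               % geodesic distance
if th < 1E-12, u = zeros(3, 1);
else, p = x - mu*cos(th); u = th * p / norm(p); end
end
```

A covariance can then be estimated from the tangent-space vectors at the converged mean, matching the representation used in the figures below (centers on the manifold, covariances in the tangent spaces of the centers).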

References:

Zeestraten, M.J.A., Havoutis, I., Silvério, J., Calinon, S. and Caldwell, D.G. (2017). An Approach for Imitation Learning on Riemannian Manifolds. IEEE Robotics and Automation Letters (RA-L), 2:3, 1240-1247. info pdf

Jaquier, N., Rozo, L., Caldwell, D.G. and Calinon, S. (2018). Geometry-aware Tracking of Manipulability Ellipsoids. In Proc. Robotics: Science and Systems (RSS). info pdf

Code examples:

demo_Riemannian_sphere_GMM01.m from pbdlib-matlab

demo_Riemannian_sphere_GMM01.cpp from pbdlib-cpp

Examples of problems in robotics that can leverage Gaussian-based representations on Riemannian manifolds. Such approach can be used to extend clustering, regression, fusion, control and planning problems to non-Euclidean data. In these examples, Gaussians are defined with centers on the manifolds and covariances in the tangent spaces of the centers.
Applications in robotics exploiting statistics on Riemannian manifolds rely on two well-known principles of Riemannian geometry: exponential/logarithmic mapping and parallel transport.
Structured manifolds in robotics. $\mathcal{S}^3$ can be used to represent the orientation of a robot's end-effector (unit quaternions). $\mathcal{S}^6_{++}$ can be used to represent manipulability ellipsoids (manipulability capability in translation and rotation), corresponding to a symmetric positive definite (SPD) matrix manifold. $\mathcal{H}^d$ can be used to represent graphs and roadmaps. $\mathcal{G}^{d,p}$ can be used to represent subspaces. For these four manifolds, the bottom graphs depict $\mathcal{S}^2$, $\mathcal{S}^2_{++}$, $\mathcal{H}^2$ and $\mathcal{G}^{3,2}$, with a clustering problem in which the datapoints (black dots/planes) are segmented into two classes, each represented by a center (red and blue dots/planes).

Interpolation on various manifolds.

Clustering on various manifolds with Gaussian mixture models.
Task-parameterized Gaussian mixture model (TP-GMM) extended to $\mathcal{S}^d$ manifolds.
Gaussian mixture regression (GMR) on SPD manifold. Top: Classical use of GMR to encode trajectories with time as input and spatial variables as output. Bottom: Extension to Riemannian manifolds with outputs on the SPD manifold. This nonlinear regression approach provides a conditional estimate of the output expressed in the form of matrix-variate Gaussians.

Applications



ASSISTIVE TELEOPERATION OF A BIMANUAL UNDERWATER ROBOT (DEXROV)


Assistive teleoperation by exploiting probabilistic models for both classification and synthesis purposes.
Minimal intervention controller able to exploit the task variations extracted from the set of demonstrations.
Dry trials with the DexROV robot and test panel.

The DexROV project aims at controlling a bimanual underwater robot from a remote teleoperation center, where a user wearing an exoskeleton operates the robot in a virtual reality environment.

In DexROV, we develop approaches that recast teleoperation as collaborative human-robot teamwork. The combination of model predictive control and task-parameterized probabilistic models is explored as a way to cope with long transmission delays (up to a second) and/or the simultaneous teleoperation of many degrees of freedom.

In this approach, the model parameters are first transmitted to both the teleoperator side and the robot side. The task parameters on the robot side can then update the model of the skill at a fast pace through local sensing, without requiring the transmission of this change to the teleoperator.

As an example, if the robot has observed that a drilling task requires the drill to be perpendicular when it approaches a surface, the robot will automatically orient the drill when approaching another surface, letting the teleoperator concentrate on the position to drill by delegating the orientation tracking to the robot. The robot reacts to perturbations (e.g., when the surface is reoriented) transparently to the user. The aim is to reduce the cognitive load of the teleoperator for repetitive or well-structured tasks.

Project website: http://www.dexrov.eu/

References:

Havoutis, I. and Calinon, S. (2018). Learning from demonstration for semi-autonomous teleoperation. Autonomous Robots. info pdf

Havoutis, I. and Calinon, S. (2017). Supervisory teleoperation with online learning and optimal control. In Proc. of the IEEE Intl Conf. on Robotics and Automation (ICRA). info pdf

Tanwani, A.K. and Calinon, S. (2016). Learning Robot Manipulation Tasks with Task-Parameterized Semi-Tied Hidden Semi-Markov Model. IEEE Robotics and Automation Letters (RA-L), 1:1, 235-242. info pdf



PERSONALIZED ASSISTANCE IN DRESSING (I-DRESS)


The I-DRESS project aims at providing dressing assistance to persons with limited mobility. Two case studies are considered: putting on a jacket and putting on shoes.

In I-DRESS, we tackle the problem of transferring assistive skills to robots, by letting the robots acquire not only movements, but also force and compliance behaviors. We explore extensions of movement primitives frequently employed in robotics to a wider repertoire of skills composed of reactive, time-dependent and time-independent behaviors based on force, impedance, position and orientation information.

Assisting a person to dress is a challenging case study, in which the robot needs to adapt to different morphologies, pathologies or stages of recovery, with varied requirements for movement generation and physical interaction. Since this assistance is person-dependent, with preferences and requirements that can evolve over time, it cannot be pre-programmed. Dressing assistance is typically provided by healthcare workers, which is not always convenient. From the worker's perspective, the activity takes time and is not particularly gratifying. From the patient's perspective, such assistance is often viewed negatively because it reduces the sense of independence.

In this context, learning from demonstration can provide a solution to transfer dressing assistance skills to the robot. We explore whether this can be achieved by means of kinesthetic teaching, where demonstrations are used to let the robot acquire person-specific requirements and preferences.

Project website: https://www.i-dress-project.eu/

References:

Pignat, E. and Calinon, S. (2017). Learning adaptive dressing assistance from human demonstration. Robotics and Autonomous Systems, 93, 61-75. info pdf

Canal, G., Pignat, E., Alenya, G., Calinon, S. and Torras, C. (2018). Joining high-level symbolic planning with low-level motion primitives in adaptive HRI: application to dressing assistance. In Proc. of the IEEE Intl Conf. on Robotics and Automation (ICRA). info pdf

Adaptive robotic assistance to put on a jacket (left) and shoes (right).
For the underlying representation of the assistive skills, I-DRESS uses a task-parameterized hidden semi-Markov model (TP-HSMM), combined with linear quadratic tracking (LQT). The red and blue Gaussians show the movement learned in the robot and shoe coordinate systems, respectively. The yellow Gaussians, computed as products of Gaussians, are used to generate the movement, effectively fusing the controllers associated with the two coordinate systems.
Images from [Canal, Pignat et al, ICRA'2018].


CONTROL OF PROSTHETIC HANDS FROM MYOGRAPHY DATA (TACT-HAND)


Fusion of sEMG and tactile sensing information for the control of a prosthetic hand.
GMR for the control of prosthetic hands, with SPD signals used as input (spatial covariances computed from sEMG sensors). Activation signals corresponding to different hand poses are used as outputs. Discarding the geometry of the data (treating datapoints as if they were in the Euclidean space) results in poor discrimination between hand poses (bottom graphs, in green). In this application, it is important to take the geometry of the data into account in GMR (bottom graphs, in blue).

We apply Riemannian geometry approaches and tensor methods to the control of prosthetic hands by exploiting several sensor modalities. A tactile bracelet provides sensory data organized as a cylindrical grid, which are augmented with noisy surface electromyography (sEMG) signals typically processed as spatial covariance features. Instead of flattening or vectorizing these data, we aim at treating the different sources statistically as a joint distribution, keeping the original geometry and structure of the data, as sketched below.
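The sketch below illustrates this pipeline in MATLAB, with our own variable names and synthetic signals standing in for real recordings: spatial covariance features are computed from windows of multichannel sEMG, then compared with the affine-invariant Riemannian distance rather than a Euclidean distance between flattened matrices.

```matlab
% Minimal sketch: spatial covariance features from multichannel sEMG,
% compared on the SPD manifold with the affine-invariant distance.
C = 8; W = 200;                                 % channels and window length
e1 = randn(C, W); e2 = randn(C, W);             % two windows of sEMG samples
S1 = cov(e1') + eye(C) * 1E-6;                  % spatial covariance features
S2 = cov(e2') + eye(C) * 1E-6;                  % (jitter keeps them SPD)
M = sqrtm(S1) \ S2 / sqrtm(S1);                 % whiten S2 by S1
d = norm(real(logm(M)), 'fro');                 % affine-invariant distance
disp(d);
```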

Project website: http://www.idiap.ch/project/tact-hand/

References:

Jaquier, N. and Calinon, S. (2017). Gaussian Mixture Regression on Symmetric Positive Definite Matrices Manifolds: Application to Wrist Motion Estimation with sEMG. In Proc. of the IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS). info pdf

Jaquier, N., Castellini, C. and Calinon, S. (2017). Improving Hand and Wrist Activity Detection Using Tactile Sensors and Tensor Regression Methods on Riemannian Manifolds. Myoelectric Controls Symposium (MEC). info pdf



INTUITIVE USER INTERFACES FOR THE GENERATION OF NATURAL MOVEMENTS


Many computer aided design applications require the generation of continuous traces, where the most common interface is to edit the control points of some form of interpolating spline. In this research, conducted in collaboration with Daniel Berio and Frederic Fol Leymarie at Goldsmiths University of London, we study how tools from computational motor control can be used to extend such methods to encapsulate movement dynamics, precision and coordination.

Our approach relies on a probabilistic formulation of model predictive control, providing users with a simple interactive interface to generate multiple movements and traces at once, by defining a distribution of trajectories rather than a single one. A dynamical system is then used to generate natural-looking curves stochastically.

The method is applied to the generation of traces and patterns that are similar to the ones that can be seen in art forms such as calligraphy and graffiti.
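The sketch below illustrates the principle in MATLAB (our own minimal version, not the exact implementation of the reference below): the MPC solution under a quadratic cost defines a Gaussian over the stacked control commands, and sampling from it produces a family of smooth variants of the same trace.

```matlab
% Minimal sketch of stochastic trajectory generation around an MPC solution.
dt = 1E-2; T = 50;
A = kron([1 dt; 0 1], eye(2));                    % 2D double integrator,
B = kron([dt^2/2; dt], eye(2));                   % state [px; py; vx; vy]
Sx = zeros(4*T, 4); Su = zeros(4*T, 2*T);
for t = 1:T                                       % stacked transfer matrices
    Sx((t-1)*4+1:t*4, :) = A^t;
    for s = 1:t
        Su((t-1)*4+1:t*4, (s-1)*2+1:s*2) = A^(t-s) * B;
    end
end
x0 = zeros(4, 1);
Mu = kron([0 0; 1 0; 1 1]', ones(1, ceil(T/3)));  % stepwise 2D position references
Mu = Mu(:, 1:T);
MuQ = reshape([Mu; zeros(2, T)], [], 1);          % per-step [px; py; vx; vy] targets
Q = kron(eye(T), blkdiag(eye(2)*1E3, zeros(2)));  % track positions only
R = eye(2*T) * 1E-2;                              % control cost
uHat = (Su'*Q*Su + R) \ (Su'*Q*(MuQ - Sx*x0));    % deterministic MPC solution
SigmaU = inv(Su'*Q*Su + R);                       % distribution over commands
figure; hold on;
for n = 1:5                                       % stochastic variants of the trace
    u = uHat + chol(SigmaU + eye(2*T)*1E-10, 'lower') * randn(2*T, 1);
    x = reshape(Sx*x0 + Su*u, 4, T);
    plot(x(1,:), x(2,:));
end
```

Shaping the position references and the precision weights then controls where the variants agree (high precision) and where they are allowed to vary.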

Reference:

Berio, D., Calinon, S. and Fol Leymarie, F. (2017). Generating Calligraphic Trajectories with Model Predictive Control. In Proc. of the 43rd Conf. on Graphics Interface. info pdf

Alphabet letter edited with the proposed interface, enabling the definition of a path together with additional variation and coordination information to generate natural-looking variants of the path.


ROBOT SKILLS ACQUISITION THROUGH ACTIVE LEARNING AND SOCIAL INTERACTION STRATEGIES (ROSALIS)


ROSALIS considers an active and adaptive selection of the learning modalities that best benefit the skill acquisition process.

Most efforts in robot learning from demonstration are directed toward developing algorithms for the acquisition of specific skills from training data. While such developments are important, they often do not take into account the social structure of the process, in particular that the interaction with the user and the selection of the different interaction steps can directly influence the quality of the collected data.

Similarly, skill acquisition encompasses a wide range of social and self-refinement learning strategies, including mimicking (reproducing a movement without understanding its objective), goal-level emulation (discovering the objectives while discarding the specific way in which the task is achieved), and exploration with self-assessed rewards or feedback from users. Each of these strategies requires the design of dedicated algorithms, but the ways in which they can be organized have been overlooked so far. We address this challenge in the ROSALIS project by exploiting social interaction to transfer skills to robots efficiently.