Learning collaborative manipulation tasks by demonstration using a haptic interface

Sylvain Calinon, Paul Evrard, Elena Gribovskaya, Aude Billard, Abderrahmane Kheddar

Abstract— This paper presents a method by which a robot can learn through observation to perform a collaborative manipulation task, namely lifting an object. The task is first demonstrated by a user controlling the robot's hand via a haptic interface. Learning extracts statistical redundancies in the examples provided during training by using Gaussian Mixture Regression and a Hidden Markov Model. Haptic communication reflects more than pure dynamic information on the task: it also includes communication patterns, which result from the two users constantly adapting their hand motions to coordinate their respective motions in time and space. We show that the proposed statistical model can efficiently encapsulate typical communication patterns across different dyads of users that are stereotypical of collaborative behaviours between humans and robots. The proposed learning approach is generative and can be used to drive the robot's retrieval of the task by ensuring a faithful reproduction of the overall dynamics of the task, namely by reproducing the force patterns both for lifting the object and for adapting to the human user's hand motion. This work shows the potential that teleoperation holds for transmitting both dynamic and communicative information on the task, which classical methods for programming by demonstration have traditionally overlooked.

I. INTRODUCTION

Recent research in robotics reveals a constantly increasing interest in the problem of physical robot-human interaction, where a robot and a human are in contact with each other either directly or through an object.
Endowing robots with the ability to collaborate with human partners in a smooth and natural way will greatly promote the use of robots in user-centered applications, which assume dynamic environments and haptic contact between agents.

Various conventional methods have been proposed in recent decades to address robot-human interaction [1]–[4]. A common approach consists of implementing impedance or admittance controllers following desired dynamics imposed by a user [1]. Such schemes have been extended with varying impedance parameters, or with identification of the jerk parameters best fitting a trajectory followed by a human operator, with subsequent tracking of a reference trajectory generated with the identified parameters [5], [6].

This work was supported by the European Commission as part of the Robot@CWE project (http://www.robot-at-cwe.eu) under contract FP6-2005-IST-5, and as part of the FEELIX GROWING project (http://www.feelix-growing.org) under contract FP6 IST-045169. S. Calinon, E. Gribovskaya and A. Billard are with the Learning Algorithms and Systems Laboratory (LASA), Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland {sylvain.calinon,elena.gribovskaya,aude.billard}@epfl.ch. P. Evrard and A. Kheddar are with the CNRS-AIST, JRL, UMI3218/CRT, Tsukuba, Japan {evrard.paul,abderrahmane.kheddar}@aist.go.jp.

Fig. 1. Experimental setup. A user (on the left) teleoperates a humanoid robot to demonstrate how to perform a lifting task collaboratively with another user (on the right).

Although the above-mentioned approach has been successfully implemented in certain applications, it is rather restrictive, as the robot is always assigned a follower role with respect to the human partner.
To transmit both dynamic and communicative information on the task, we propose here to learn a controller for the robot by imitation [7], where a user provides demonstrations of the collaborative skill through a haptic device (see Fig. 1).

We adopt a conventional terminology [8]–[10] and consider as "leader" the partner planning the motion, and as "follower" the partner whose behavior is reactive with respect to the task plan imposed by the leader. A robot that is constantly a follower puts more pressure on the human partner: besides concentrating on the task goals, he/she has to be constantly aware of the robot's physical capabilities and constraints (e.g. to avoid bringing it into an unfavorable posture). Furthermore, advanced robotic platforms are equipped with various sensors that allow gathering more detailed and complete information about the environment. Therefore, in certain conditions, the robot should be able to switch to a leading behavior and guide the human partner (e.g., if an obstacle reduces the field of view of the operator or a task requires precise positioning). Developing such controllers requires the understanding and interpretation of the underlying processes of haptic communication. Reed et al. reported experiments showing that dyads of human users exhibit specialization during collaboration, which can be seen as role switching between the partners during the task [11]. In a previous work of ours, a generic model has been proposed


that encompasses this phenomenon and may encapsulate different patterns of role switching [12]. However, this model requires knowledge about strategies of role attribution during collaborative tasks in order to implement advanced collaborative behaviors on robotic platforms.

We present here preliminary results towards building a statistical framework based on a Hidden Markov Model (HMM) and Gaussian Mixture Regression (GMR) [13], [14] that allows extracting both leading and following behaviors from task demonstrations. We take the perspective that, by probabilistically encoding the correlations between the dynamical signals (forces) and the kinematic parameters of the task in a continuous manner, the robot can autonomously select a controller to reproduce the collaborative skill with an appropriate behavior.

Wang et al. recently suggested the use of discrete HMMs to automatically detect whether the human partner was acting actively or passively during handshaking between a robotic system and a human operator [15]. The behavior of the robot was then modified accordingly. In our work, continuous HMMs are used to represent both the task motion and the user's haptic communication signals in the same framework.

II. EXPERIMENTAL SETUP

A. Teaching scenario

We propose to teach a robot how to perform a collaborative lifting task. The task consists of lifting a rigid beam collaboratively while keeping the beam horizontal. In the first set of recordings, the teacher is asked to close his eyes while moving the robot's hand, and the other user has the role of initiating and terminating the motion (i.e. the robot is following). The second set of recordings is the symmetric case, where the teacher has the role of initiating and terminating the motion while the other user is blindfolded (i.e. the robot is leading).

B. Hardware setup

The collaborative lifting task is demonstrated to an HRP-2 humanoid robot. HRP-2 is a full-size humanoid with 30 degrees of freedom (DOF): 6 DOF for each leg, 6 DOF for each arm, 2 DOF for the chest, 2 DOF for the head and 1 DOF for each gripper. Only the right arm is used to perform the task, while the robot is standing. We assume that the hand holds the object firmly enough that the object can only translate vertically with the hand of the robot. The motion of the wrist is constrained to move only along a vertical direction during the whole task, while its orientation is constrained to remain constant.

In order to perform the demonstrations, the teacher teleoperates the robot using a PHANToM Premium device with 6-DOF force/torque feedback. Hence, the teacher has full feedback of the interaction wrench measured at the gripper of the robot. PHANToM devices are impedance-type devices, meaning that they are low-friction, low-inertia mechanisms that accept force and torque references. The haptic device used for the demonstrations is a Premium 1.5 model with a workspace of 381 mm x 267 mm x 191 mm. A slight rescaling was necessary to map the workspace of the PHANToM device to the workspace of the robot.

C. Controller

As PHANToM devices accept force/torque references, a natural coupling scheme is a bilateral 2-channel Velocity-Force coupling. Hence, the velocity of the tip of the PHANToM device is measured and sent as a velocity reference to the robot. Forces are measured at the wrist of the robot, and the corresponding wrench at the gripper is sent as a reference to the PHANToM device.
The control law for the haptic device is thus

F_m = U K_f F_s ,    (1)

where F_m is the reference force sent to the PHANToM device, F_s is the wrench at the gripper of the robot, K_f is a diagonal gain matrix, and U is a transformation matrix from the coordinate frame in which the sensor force is measured to the PHANToM coordinate frame.

The robot is teleoperated using the following law

\dot{q} = J^{\dagger} U^{-1} v_h ,    (2)

where J^{\dagger} is the Jacobian pseudoinverse of the robot's gripper position with respect to the angular velocities of the right arm, v_h is the velocity of the handle of the PHANToM device, and \dot{q} is a joint velocity reference that is integrated and sent to the lower-level proportional-derivative joint position controller of the robot.

III. PROBABILISTIC MODEL

A. Learning

Data gathered during the demonstrations consist of the position x and velocity \dot{x} of the robot's hand, as well as the force F_s sensed at the level of its wrist, as recorded by the force sensor. Considering that the object is horizontal and held symmetrically by the two partners,^1 i.e. that the weight of the object is shared equally between the two partners, the force component due to the mass of the object is eliminated by computing the interaction force F as

F = F_s - (m/2) (\ddot{x} - g) ,    (3)

where g is the standard gravity constant.^2 Only motions in the vertical plane are considered in the experiment (x, \dot{x}, F ∈ R), but the framework can be used with multiple degrees of freedom (see illustrative examples in Figs. 3 and 4). The frame of reference is pointing upward (i.e. if the user lifts the object while the hand does not move, F becomes positive).
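As a sketch of how the two coupling laws could be implemented, the following Python fragment computes Eqs. (1)-(2); the matrices J, U, K_f and the signal values below are toy assumptions, not those of the real HRP-2 / PHANToM setup.

```python
import numpy as np

# Sketch of the bilateral 2-channel Velocity-Force coupling of Eqs. (1)-(2).
# J, U, K_f and the signals below are illustrative toy values.

def haptic_force_reference(F_s, K_f, U):
    """Eq. (1): F_m = U K_f F_s, the force reference sent to the haptic device."""
    return U @ K_f @ F_s

def robot_joint_velocity(J, U, v_h):
    """Eq. (2): qdot = J^+ U^-1 v_h, the joint velocity reference for the arm."""
    return np.linalg.pinv(J) @ np.linalg.inv(U) @ v_h

J = np.random.default_rng(0).standard_normal((3, 6))  # 3D task space, 6-DOF arm
U = np.eye(3)                                         # aligned frames for simplicity
K_f = np.diag([0.5, 0.5, 0.5])

F_s = np.array([0.0, 0.0, 4.0])       # wrench measured at the gripper
v_h = np.array([0.0, 0.0, 0.05])      # velocity of the PHANToM handle
F_m = haptic_force_reference(F_s, K_f, U)
dq = robot_joint_velocity(J, U, v_h)  # integrated, then sent to the PD controller
```

Since the pseudoinverse gives the least-squares joint velocity, the gripper velocity J \dot{q} reproduces v_h exactly whenever the Jacobian has full row rank.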
The dataset is thus composed of a set of datapoints ξ = [x, \dot{x}, F]. The joint distribution P(x, \dot{x}, F) is encoded in a Hidden Markov Model (HMM) of K states, where the output

^1 This remains reasonable, as the contact with the object is achieved through two similar handles placed on both sides of the object.
^2 The mass of the object was here known in advance, but it can also be estimated at the beginning of the interaction, when the object is already handled by the two partners but does not move yet (null acceleration).
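A minimal sketch of how the datapoints ξ = [x, \dot{x}, F] could be assembled from recorded signals by applying Eq. (3); the trajectory, mass, sampling period and the sign convention chosen for g are illustrative assumptions.

```python
import numpy as np

# Sketch of assembling datapoints xi = [x, xdot, F] from a recorded trajectory,
# applying Eq. (3). Trajectory, mass and sampling period are illustrative;
# g is taken as -9.81 in the upward-pointing frame (a sign-convention
# assumption) so that the static weight m/2 * 9.81 cancels at rest.
dt, m, g = 0.01, 2.0, -9.81

t = np.arange(0.0, 1.0, dt)
x = 0.1 * (1.0 - np.cos(np.pi * t))       # synthetic vertical hand motion
F_s = (m / 2.0) * 9.81 * np.ones_like(t)  # synthetic static sensor reading

xdot = np.gradient(x, dt)                 # finite-difference velocity
xddot = np.gradient(xdot, dt)             # finite-difference acceleration

F = F_s - (m / 2.0) * (xddot - g)         # Eq. (3): interaction force
xi = np.column_stack([x, xdot, F])        # one datapoint per row
```

With the static sensor reading used here, the interaction force vanishes whenever the hand does not accelerate, which is the intent of Eq. (3).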


distribution of each state is represented by a Gaussian distribution representing locally the correlations between the different variables. The parameters of the model {Π, a, µ, Σ} are learned through the Baum-Welch algorithm, a variant of Expectation-Maximization (EM) for HMMs [16]. Π_i is the initial probability of being in state i, and a_{ij} is the probability of transiting from state i to state j. µ_i and Σ_i represent the center and covariance matrix of the i-th Gaussian distribution of the HMM with K states. The different variables of the dataset and associated model are labelled separately as

[ξ^I; ξ^O] = [x; F; \dot{x}], with
µ_i^I = [µ_i^x; µ_i^F], µ_i^O = µ_i^{\dot{x}},
[[Σ_i^I, Σ_i^{IO}]; [Σ_i^{OI}, Σ_i^O]] = [[Σ_i^x, Σ_i^{xF}, Σ_i^{x\dot{x}}]; [Σ_i^{Fx}, Σ_i^F, Σ_i^{F\dot{x}}]; [Σ_i^{\dot{x}x}, Σ_i^{\dot{x}F}, Σ_i^{\dot{x}}]],

[ξ^{O'}; ξ^{I'}] = [x; F; \dot{x}], with
µ_i^{O'} = µ_i^x, µ_i^{I'} = [µ_i^F; µ_i^{\dot{x}}],
[[Σ_i^{O'}, Σ_i^{OI'}]; [Σ_i^{IO'}, Σ_i^{I'}]] = [[Σ_i^x, Σ_i^{xF}, Σ_i^{x\dot{x}}]; [Σ_i^{Fx}, Σ_i^F, Σ_i^{F\dot{x}}]; [Σ_i^{\dot{x}x}, Σ_i^{\dot{x}F}, Σ_i^{\dot{x}}]],

where the superscript indices x, F and \dot{x} refer respectively to the position, force and velocity components.^3

^3 Note that this process can similarly be used to encode trajectories defined by position and velocity recordings (i.e. without force), where we simply have ξ^I = x and ξ^O = \dot{x}.

Fig. 2. Schematic of the Gaussian Mixture Regression (GMR) process. Top: by considering a single Gaussian distribution. Bottom: by considering a GMM composed of two Gaussian distributions.

Fig. 3. Example of a dynamical system used to reproduce a demonstrated motion by starting from a different initial position. A 2D motion is considered here as an illustrative example. The first row shows the HMM in velocity and position space encoding the {\dot{x}, x} relationships.
In the second and third rows, the initial positions are represented by points and the retrieved trajectories are represented by bold lines.

Fig. 4. Influence of the variability observed during the demonstrations on the reproduction of the skill. To illustrate the influence of consistency across the different observations, two datasets have first been generated from one reference 2-dimensional trajectory (in dashed line), and an HMM of 5 states is trained on each dataset. The first dataset consists of 10 trajectories with strong consistency among the different demonstrations (first graph). The second dataset presents more variability (second graph). For each of the two models, two reproductions are then computed by starting from new initial positions (third graph). The two trajectories in black line are retrieved with the HMM represented in the first graph, while the two trajectories in grey line are retrieved with the HMM represented in the second graph.
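The partitioned Gaussian parameters defined above are used by GMR through standard Gaussian conditioning. A minimal sketch for a single component, with hand-picked illustrative numbers:

```python
import numpy as np

# Sketch of the Gaussian conditioning step used by GMR for one component:
# muO + Sigma_OI inv(Sigma_I) (xiI - muI). The joint distribution over
# [x, F, xdot] below is an illustrative, hand-picked example.

def gaussian_condition(mu, Sigma, idx_in, idx_out, xi_in):
    """Conditional mean of the output dimensions given the input dimensions."""
    mu_I, mu_O = mu[idx_in], mu[idx_out]
    S_I = Sigma[np.ix_(idx_in, idx_in)]
    S_OI = Sigma[np.ix_(idx_out, idx_in)]
    return mu_O + S_OI @ np.linalg.inv(S_I) @ (xi_in - mu_I)

mu = np.array([0.10, 2.00, 0.05])      # means of [x, F, xdot]
Sigma = np.array([[0.04, 0.01, 0.01],
                  [0.01, 1.00, 0.20],
                  [0.01, 0.20, 0.09]])

# Predict xdot from (x, F), as in Eq. (5): at the mean input, the estimate
# is simply the mean output.
xdot_hat = gaussian_condition(mu, Sigma, [0, 1], [2], np.array([0.10, 2.00]))
# A larger sensed force pulls the velocity estimate up (positive correlation).
xdot_hat_pull = gaussian_condition(mu, Sigma, [0, 1], [2], np.array([0.10, 3.00]))
```

The same function serves both regressions: swapping `idx_in`/`idx_out` to ([1, 2], [0]) yields the position estimate of Eq. (6).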


B. Reproduction

During reproduction, at each time step the current observation ξ = [x, \dot{x}, F] is used to define a weight factor h_i representing the influence of the i-th state

h_i(ξ_t) = α_{i,t} / Σ_{k=1}^{K} α_{k,t} ,    (4)

with α_{i,t} = ( Σ_{k=1}^{K} α_{k,t-1} a_{ki} ) N(ξ_t; µ_i, Σ_i),

where α_{i,t} is the forward variable (defined recursively through the HMM representation) corresponding to the probability of partially observing the sequence {ξ_1, ξ_2, ..., ξ_t} of length t and of being in state i at time t; see [16].

A target position \hat{x} and target velocity \hat{\dot{x}} to attain are then estimated through Gaussian Mixture Regression (GMR) as

\hat{\dot{x}} = Σ_{i=1}^{K} h_i(ξ) ( µ_i^O + Σ_i^{OI} (Σ_i^I)^{-1} (ξ^I - µ_i^I) ) ,    (5)

\hat{x} = Σ_{i=1}^{K} h_i(ξ) ( µ_i^{O'} + Σ_i^{OI'} (Σ_i^{I'})^{-1} (ξ^{I'} - µ_i^{I'}) ) .    (6)

One advantage of GMR over other regression approaches is that it does not learn a model for a predetermined set of input and output variables. Instead, the joint distribution of the data is first learned by the model through a compact representation encapsulating locally the correlations across the different variables. Regression is then performed by specifying on the fly which are the input and output variables; see [13], [14], [17] for details. In the original version of GMR, the weight (4) is computed based on position information only:

h_i(ξ) = N(ξ; µ_i, Σ_i) / Σ_{k=1}^{K} N(ξ; µ_k, Σ_k) .    (7)

We extend here the GMR approach by replacing the weight originally computed through the Gaussian Mixture Model (GMM) representation (7) with its analogous Hidden Markov Model (HMM) representation (4), which robustly encapsulates the sequential nature of the task.

Fig. 2 illustrates the principle of the regression process. From the current position and velocity of the system, a task-level proportional-derivative controller similar to a mass-spring-damper system is computed to reach the desired velocity \hat{\dot{x}} and the desired position \hat{x}.
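The HMM-based weights of Eq. (4) can be sketched with a forward recursion over a toy 2-state, 1-D model (all parameters below are illustrative, not learned ones):

```python
import numpy as np

# Sketch of the HMM-based mixing weights of Eq. (4), which replace the GMM
# weights of Eq. (7). The 2-state, 1-D model below is purely illustrative.

def gauss_pdf(xi, mu, Sigma):
    d = len(mu)
    diff = xi - mu
    expo = -0.5 * diff @ np.linalg.solve(Sigma, diff)
    return np.exp(expo) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma))

def hmm_weights(xi_seq, Pi, A, mus, Sigmas):
    """Forward recursion; returns the normalized h_i for the last observation."""
    K = len(Pi)
    alpha = np.array([Pi[i] * gauss_pdf(xi_seq[0], mus[i], Sigmas[i])
                      for i in range(K)])
    for xi in xi_seq[1:]:
        alpha = np.array([(alpha @ A[:, i]) * gauss_pdf(xi, mus[i], Sigmas[i])
                          for i in range(K)])
        alpha /= alpha.sum()          # rescaling avoids numerical underflow
    return alpha / alpha.sum()

Pi = np.array([0.9, 0.1])             # initial state probabilities Pi_i
A = np.array([[0.9, 0.1],             # transition probabilities a_ij
              [0.1, 0.9]])
mus = [np.array([0.0]), np.array([5.0])]
Sigmas = [np.eye(1), np.eye(1)]

# A sequence that starts near the first state and ends near the second:
# the forward variable makes the weights shift smoothly to the second state.
seq = [np.array([v]) for v in (0.0, 0.1, 4.8, 5.1, 5.0)]
h = hmm_weights(seq, Pi, A, mus, Sigmas)
```

Unlike the memoryless weights of Eq. (7), these weights depend on the whole observed prefix, which is what encodes the sequential nature of the task.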
The acceleration command in task space is determined by

\ddot{x} = \ddot{x}^V + \ddot{x}^P = (\hat{\dot{x}} - \dot{x}) κ^V + (\hat{x} - x) κ^P ,    (8)

where κ^V and κ^P are gains defined as

κ^V = 1/∆t ,    κ^P(x) = κ^P_max (L_max - L(x)) / (L_max - L_min) ,    (9)

with

L_max = max_{i∈{1,...,K}} log ( N(µ_i^x; µ_i^x, Σ_i^x) ) ,
L_min = min_{i∈{1,...,K}, x∈W} log ( N(x; µ_i^x, Σ_i^x) ) .

In the above equations, the notation L is used to define log-likelihoods (which correspond to weighted distance measures). κ^P_max is the maximum gain to attain a target position (κ^P_max = 0.08 has been fixed empirically). W defines the robot's workspace, or a predetermined range of situations fixed a priori for the reproduction attempts. ∆t is the duration of an iteration step (a constant ∆t = 0.01 is considered here). Note that this controller also shares similarities with the second-order differential equation defined by a Vector Integration To Endpoint (VITE) system [18].

At each iteration, κ^P(x) is thus close to zero if x is within the boundary determined by the Gaussian distributions (i.e. the confidence bounds defined by the centers and covariance matrices). If x is far away from the positions that have been demonstrated, the system comes back towards the closest Gaussian distribution (in a likelihood sense) with a maximum gain of κ^P_max, while still following the trend of motion in this region (determined by \hat{\dot{x}}).

At each iteration, velocity and position are updated through Euler numerical integration

\dot{x}_t = \dot{x}_{t-∆t} + ∆t \ddot{x} ,    x_t = x_{t-∆t} + ∆t \dot{x}_t .    (10)

Note, however, that other numerical methods for ordinary differential equations can similarly be used here [19].
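The controller of Eqs. (8)-(10) can be sketched as follows; here the GMR targets are replaced by a synthetic lifting profile, and κ^P is held at κ^P_max (the likelihood-based modulation of Eq. (9) is omitted as a simplifying assumption):

```python
import math

# Sketch of the reaching controller of Eqs. (8)-(10). The GMR targets are
# replaced by a synthetic lifting profile; kappa_P is held at its maximum
# value 0.08 (the likelihood-based modulation of Eq. (9) is omitted).
dt, T = 0.01, 2.0
kV, kP = 1.0 / dt, 0.08        # Eq. (9): kappa_V = 1/dt, empirical kappa_P_max
x, xdot = 0.0, 0.0             # initial state

for n in range(int(T / dt)):
    t = n * dt
    x_hat = 0.15 * (1.0 - math.cos(math.pi * t / T))             # position target
    xdot_hat = 0.15 * (math.pi / T) * math.sin(math.pi * t / T)  # velocity target
    xddot = (xdot_hat - xdot) * kV + (x_hat - x) * kP            # Eq. (8)
    xdot += dt * xddot                                           # Eq. (10)
    x += dt * xdot
```

With κ^V = 1/∆t, the velocity snaps to the target in one step, so the integrated position tracks the 0.3 m synthetic lift closely; the small κ^P term only gently corrects positional drift, consistent with the roles of \ddot{x}^V and \ddot{x}^P described below.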
An inverse kinematics solution that solves a main task while simultaneously taking supplementary constraints into account is then used to control the robot in joint space, as described in [20].

In (8), \ddot{x}^V allows the robot to follow the demonstrated dynamics, while \ddot{x}^P prevents it from moving far into an unlearned situation and brings it back to an already encountered context if a perturbation occurs. By using both terms concurrently, the robot follows the learned non-linear dynamics while coming back to a known position if it deviates from the demonstrated motion and arrives in a portion of the workspace that remains unexplored. An illustration of the complete process is presented in Fig. 3. The first row illustrates the process of determining the components to reach a desired velocity \hat{\dot{x}} and a desired position \hat{x} through a second-order dynamical system. The vector fields in the second row show the influence of the two commands when used separately. On the one hand, \ddot{x}^V follows the learned motion but tends to move away from the demonstrations after a few iterations or when starting from an unexplored position. On the other hand, \ddot{x}^P moves toward the closest point of the generalized trajectory. The vector field in the last row shows the reproduction behavior when considering both commands simultaneously. The final controller follows the demonstrated dynamics and prevents the robot from moving away into an unlearned situation by coming back to an already encountered position if it deviates from the demonstrated motion.

Fig. 4 illustrates the influence of the variability observed during the demonstrations on the reproduction attempts. It also


Fig. 6. Illustration of the changes of correlations between the force F and the velocity \dot{x} along the task. The trajectories in black and grey represent respectively the cases where the robot is leading and following.

Fig. 5. Demonstrations of the collaborative task to the robot (first two rows) and associated HMM models (last row). The 5 trajectories in black and the 5 trajectories in grey represent respectively the demonstrations with the robot acting as a leader and as a follower. The points represent the beginning of the motions.

shows the evolution of the adaptive gain κ^P defined in (9) along the task.

IV. EXPERIMENTAL RESULTS

This experiment aims at validating that the proposed model can distinguish stereotypical following and leading behaviors (i.e. where the user is explicitly told to follow one or the other behavior along the task) and that the model can lead to different controllers during reproduction. This is a first step to determine whether the proposed model could, in further work, address more complex types of behaviors (and switching across those). We assessed the robustness of the proposed system in a series of simulations where different possible input force profiles are fed into the system to modulate the kinematic behavior of the robot.

Fig. 5 shows the demonstrations provided to the robot (first and second rows) and the associated HMM models (last row). The dataset and model of the robot acting as a leader (conversely, the user acting as a follower) are represented in black line. The dataset and model of the robot acting as a follower (conversely, the user acting as a leader) are represented in grey line. In the fourth graph, we see that the correlations between \dot{x} and F change along the motion.
In the two situations (leading and following), the correlations can be roughly decomposed into three parts corresponding to the beginning of the motion (user/robot initiating the task), the middle of the motion (user and robot lifting the object together) and the end of the motion (user/robot notifying the end of the task); see also Fig. 6. The first and last datapoints are characterized by a force and a velocity close to zero (or moving towards zero). The non-linearities observed along the task show that approximating the collaborative behavior with a system of constant damping factor (i.e. a linear relation between force and velocity) would be insufficient to model the collaborative behaviors. We see in the last graph that HMMs can encapsulate compactly and efficiently these different correlations along the motion (two HMMs with 5 states have been used here for the leading and following cases).

Fig. 7. Reproduction attempts in the case of perturbed force signals (first three graphs) and by starting from different initial positions (last graph).

Fig. 7 shows reproduction attempts highlighting the robustness of the system to temporal and spatial variability. To highlight the generalization capabilities of the system in terms of temporal variations, the force signal recorded during one of the demonstrations (when the robot acts as a follower) is used to simulate the force input during a reproduction attempt. These results are represented in solid


line for the generated force input (first graph), the retrieved velocity (second graph) and the position (third graph). Three different perturbations of this force signal are then simulated by distorting the original force signal non-homogeneously in time (shown in dashed, dotted, and dash-dotted lines). Distorting the signal in such a way simulates situations where the force applied by the user may appear with some temporal variability across the different reproductions. We see that the system successfully adapts to these changes by modifying the motion behaviors accordingly. For example, the force input drawn in dash-dotted line can represent the behavior of a user who is first very slow at initiating the task (e.g. the user may not be ready yet), which is reflected by a quasi-constant force of 4 N sensed by the robot (first graph). Then the user suddenly tries to lift the object; that is, after the simulated steady phase, a force signal similar to the demonstration but stretched in time is applied. We see in the second and third graphs that the robot correctly handles this perturbation by staying still first, and then helping the user lift the beam as soon as the user is ready (null velocity for the first third of the motion). The task is then collaboratively achieved.

The last graph of Fig. 7 presents reproduction attempts (in black line) where the spatial generalization capabilities of the system are highlighted in the case of the robot acting as a leader. By starting from different initial positions, we see that the system is able to retrieve an appropriate motion to lift the object.

V. CONCLUSIONS AND FURTHER WORK

In this paper we proposed an approach to teach a robot collaborative tasks through a probabilistic model based on a Hidden Markov Model and Gaussian Mixture Regression. We emphasized the role that teleoperation holds for transmitting both dynamic and communicative information on the collaborative task, which classical methods of Programming by Demonstration have so far overlooked. We then showed, through an experiment consisting of lifting an object collaboratively with a humanoid robot, that the proposed model can efficiently encapsulate typical communication patterns between different dyads of users acting with stereotypical collaboration behaviors. Reproduction attempts in simulation were finally presented.

We used here only stereotypical behaviors that were predefined before the interaction between the two users, in order to validate the use of the haptic interface and the proposed probabilistic model to record and encode physical collaborative skills. Further work will first concentrate on extending the framework to more natural interactions where the users are not told explicitly to behave with predetermined roles, thus extending the complexity of the haptic communication cues to transfer to the robot. We are currently working on reproducing the learned skill on the real robot, which will be used to evaluate the performance of the proposed controller. We finally plan to contrast the data collected in this experiment with direct recordings of the same dyads of users performing a similar task, by endowing the object to lift with force sensors at both extremities, i.e. without passing through the robot and the haptic interface to record haptic/kinematic information.

REFERENCES

[1] K. Kosuge, H. Yoshida, and T. Fukuda, "Dynamic control for robot-human collaboration," in Proceedings of the 2nd IEEE International Workshop on Robot and Human Communication, Tokyo, Japan, 1993, pp. 389–401.
[2] H. Arai, T. Takubo, Y. Hayashibara, and K. Tanie, "Human-robot cooperative manipulation using a virtual nonholonomic constraint," in Proc. IEEE Intl Conf. on Robotics and Automation (ICRA), April 2000, pp. 4064–4070.
[3] Y. Hirata, Y. Kume, Z.-D. Wang, and K. Kosuge, "Decentralized control of multiple mobile manipulators based on virtual 3D caster motion of handling an object in cooperation with a human," in Proc. IEEE Intl Conf. on Robotics and Automation (ICRA), September 2003, pp. 938–943.
[4] S. Yigit, C. Burghart, and H. Worn, "Cooperative carrying using pump-like constraints," in Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS), September–October 2004, pp. 3877–3882.
[5] T. Tsumugiwa, R. Yokogawa, and K. Hara, "Variable impedance control with regard to working process for man-machine cooperation work system," in Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS), October–November 2001, pp. 1564–1569.
[6] V. Duchaine and C. Gosselin, "General model of human-robot cooperation using a novel velocity based variable impedance control," in Joint EuroHaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2007, pp. 446–451.
[7] A. Billard, S. Calinon, R. Dillmann, and S. Schaal, "Robot programming by demonstration," in Handbook of Robotics, B. Siciliano and O. Khatib, Eds. Secaucus, NJ, USA: Springer, 2008, pp. 1371–1394.
[8] Z.-D. Wang, Y. Takano, Y. Hirata, and K. Kosuge, "A pushing leader based decentralized control method for cooperative object transportation," in Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS), September–October 2004, pp. 1035–1040.
[9] M. Rahman, R. Ikeura, and K. Mizutani, "Control characteristics of two humans in cooperative task," in IEEE Intl Conf. on Systems, Man, and Cybernetics, vol. 2, 2000, pp. 1301–1306.
[10] S. Gentry and E. Feron, "Musicality experiments in lead and follow dance," in IEEE Intl Conf. on Systems, Man, and Cybernetics, vol. 1, 2004, pp. 984–988.
[11] K. Reed and M. Peshkin, "Physical collaboration of human-human and human-robot teams," IEEE Transactions on Haptics, vol. 1, no. 2, pp. 108–120, December 2008.
[12] P. Evrard and A. Kheddar, "Homotopy switching model for dyad haptic interaction in physical collaborative tasks," in Joint EuroHaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems (WorldHaptics 2009), Salt Lake City, USA, March 2009.
[13] M. Hersch, F. Guenter, S. Calinon, and A. Billard, "Dynamical system modulation for robot learning via kinesthetic demonstrations," IEEE Trans. on Robotics, vol. 24, no. 6, pp. 1463–1467, 2008.
[14] S. Calinon, F. Guenter, and A. Billard, "On learning, representing and generalizing a task in a humanoid robot," IEEE Trans. on Systems, Man and Cybernetics, Part B, vol. 37, no. 2, pp. 286–298, 2007.
[15] Z. Wang, A. Peer, and M. Buss, "An HMM approach to realistic haptic human-robot interaction," in Joint EuroHaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2009, pp. 374–379.
[16] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257–285, February 1989.
[17] Z. Ghahramani and M. Jordan, "Supervised learning from incomplete data via an EM approach," in Advances in Neural Information Processing Systems, J. D. Cowan, G. Tesauro, and J. Alspector, Eds., vol. 6. Morgan Kaufmann Publishers, Inc., 1994, pp. 120–127.
[18] D. Bullock and S. Grossberg, "Neural dynamics of planned arm movements: Emergent invariants and speed-accuracy properties during trajectory formation," Psychological Review, vol. 95, no. 1, pp. 49–90, 1988.
[19] J. Butcher, Numerical Methods for Ordinary Differential Equations. Chichester, UK: Wiley, 2003.
[20] N. Mansard, O. Khatib, and A. Kheddar, "Integrating unilateral constraints inside the stack of tasks," IEEE Trans. on Robotics, 2009, in press.
