2006


Miniature endoscopic capsule robot using biomimetic micro-patterned adhesives

Karagozler, M. E., Cheung, E., Kwon, J., Sitti, M.

In Biomedical Robotics and Biomechatronics, 2006. BioRob 2006. The First IEEE/RAS-EMBS International Conference on, pages: 105-111, 2006 (inproceedings)

pi

[BibTex]

Approximate nearest neighbor regression in very high dimensions

Vijayakumar, S., DSouza, A., Schaal, S.

In Nearest-Neighbor Methods in Learning and Vision, pages: 103-142, (Editors: Shakhnarovich, G.;Darrell, T.;Indyk, P.), Cambridge, MA: MIT Press, 2006, clmc (inbook)

am

link (url) [BibTex]

Toward micro wall-climbing robots using biomimetic fibrillar adhesives

Greuter, M., Shah, G., Caprari, G., Tâche, F., Siegwart, R., Sitti, M.

In Proceedings of the 3rd International Symposium on Autonomous Minirobots for Research and Edutainment (AMiRE 2005), pages: 39-46, 2006 (inproceedings)

pi

[BibTex]

Geckobot: A gecko inspired climbing robot using elastomer adhesives

Unver, O., Uneri, A., Aydemir, A., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 2329-2335, 2006 (inproceedings)

pi

[BibTex]

Towards hybrid swimming microrobots: bacteria assisted propulsion of polystyrene beads

Behkam, B., Sitti, M.

In Engineering in Medicine and Biology Society, 2006. EMBS’06. 28th Annual International Conference of the IEEE, pages: 2421-2424, 2006 (inproceedings)

pi

Project Page [BibTex]

Soft microcontact printing with force control using microrobotic assembly based templates

Tafazzoli, A., Sitti, M.

In Advanced Motion Control, 2006. 9th IEEE International Workshop on, pages: 500-505, 2006 (inproceedings)

pi

[BibTex]

Modeling of the supporting legs for designing biomimetic water strider robots

Song, Y. S., Suhr, S. H., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 2303-2310, 2006 (inproceedings)

pi

[BibTex]

A novel water running robot inspired by basilisk lizards

Floyd, S., Keegan, T., Palmisano, J., Sitti, M.

In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pages: 5430-5436, 2006 (inproceedings)

pi

[BibTex]

Force-controlled microcontact printing using microassembled particle templates

Tafazzoli, A., Pawashe, C., Sitti, M.

In Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, pages: 263-268, 2006 (inproceedings)

pi

[BibTex]

Waalbot: An agile small-scale wall climbing robot utilizing pressure sensitive adhesives

Murphy, M. P., Tso, W., Tanzini, M., Sitti, M.

In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pages: 3411-3416, 2006 (inproceedings)

pi

[BibTex]


2005


Natural Actor-Critic

Peters, J., Vijayakumar, S., Schaal, S.

In Proceedings of the 16th European Conference on Machine Learning, 3720, pages: 280-291, (Editors: Gama, J.;Camacho, R.;Brazdil, P.;Jorge, A.;Torgo, L.), Springer, ECML, 2005, clmc (inproceedings)

Abstract
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.

am ei

link (url) DOI [BibTex]

Comparative experiments on task space control with redundancy resolution

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

In Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3901-3908, Edmonton, Alberta, Canada, Aug. 2-6, IROS, 2005, clmc (inproceedings)

Abstract
Understanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundancy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, even though, from a theoretical point of view, the algorithm does not share the same beauty as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics.

am ei

link (url) DOI [BibTex]

Modeling and testing of a biomimetic flagellar propulsion method for microscale biomedical swimming robots

Behkam, B., Sitti, M.

In Proceedings of Advanced Intelligent Mechatronics Conference, pages: 37-42, 2005 (inproceedings)

pi

Project Page [BibTex]

Predicting EMG Data from M1 Neurons with Variational Bayesian Least Squares

Ting, J., D’Souza, A., Yamamoto, K., Yoshioka, T., Hoffman, D., Kakei, S., Sergio, L., Kalaska, J., Kawato, M., Strick, P., Schaal, S.

In Advances in Neural Information Processing Systems 18 (NIPS 2005), (Editors: Weiss, Y.;Schölkopf, B.;Platt, J.), Cambridge, MA: MIT Press, Vancouver, BC, Dec. 6-11, 2005, clmc (inproceedings)

Abstract
An increasing number of projects in neuroscience requires the statistical analysis of high dimensional data sets, as, for instance, in predicting behavior from neural firing, or in operating artificial devices from brain recordings in brain-machine interfaces. Linear analysis techniques remain prevalent in such cases, but classical linear regression approaches are often numerically too fragile in high dimensions. In this paper, we address the question of whether EMG data collected from arm movements of monkeys can be faithfully reconstructed with linear approaches from neural activity in primary motor cortex (M1). To achieve robust data analysis, we develop a full Bayesian approach to linear regression that automatically detects and excludes irrelevant features in the data, and regularizes against overfitting. In comparison with ordinary least squares, stepwise regression, partial least squares, and a brute force combinatorial search for the most predictive input features in the data, we demonstrate that the new Bayesian method offers a superior mixture of characteristics in terms of regularization against overfitting, computational efficiency, and ease of use, demonstrating its potential as a drop-in replacement for other linear regression techniques. As neuroscientific results, our analyses demonstrate that EMG data can be well predicted from M1 neurons, further opening the path for possible real-time interfaces between brains and machines.

am

link (url) [BibTex]

Biologically inspired adhesion based surface climbing robots

Menon, C., Sitti, M.

In Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on, pages: 2715-2720, 2005 (inproceedings)

pi

[BibTex]

Claytronics: highly scalable communications, sensing, and actuation networks

Aksak, Burak, Bhat, Preethi Srinivas, Campbell, Jason, DeRosa, Michael, Funiak, Stanislav, Gibbons, Phillip B, Goldstein, Seth Copen, Guestrin, Carlos, Gupta, Ashish, Helfrich, Casey, others

In Proceedings of the 3rd international conference on Embedded networked sensor systems, pages: 299-299, 2005 (inproceedings)

pi

[BibTex]

Rapid synchronization and accurate phase-locking of rhythmic motor primitives

Pongas, D., Billard, A., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 2911-2916, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Rhythmic movement is ubiquitous in human and animal behavior, e.g., as in locomotion, dancing, swimming, chewing, scratching, music playing, etc. A particular feature of rhythmic movement in biology is the rapid synchronization and phase locking with other rhythmic events in the environment, for instance music or visual stimuli as in ball juggling. In traditional oscillator theories to rhythmic movement generation, synchronization with another signal is relatively slow, and it is not easy to achieve accurate phase locking with a particular feature of the driving stimulus. Using a recently developed framework of dynamic motor primitives, we demonstrate a novel algorithm for very rapid synchronization of a rhythmic movement pattern, which can phase lock any feature of the movement to any particular event in the driving stimulus. As an example application, we demonstrate how an anthropomorphic robot can use imitation learning to acquire a complex drumming pattern and keep it synchronized with an external rhythm generator that changes its frequency over time.

am

link (url) [BibTex]

Biologically Inspired Miniature Water Strider Robot

Suhr, S. H., Song, Y. S., Lee, S. J., Sitti, M.

In Robotics: Science and Systems, pages: 319-326, 2005 (inproceedings)

pi

[BibTex]

Polymer micro/nanofiber fabrication using micro/nanopipettes

Nain, A. S., Amon, C., Sitti, M.

In Nanotechnology, 2005. 5th IEEE Conference on, pages: 366-369, 2005 (inproceedings)

pi

[BibTex]

A new methodology for robot control design

Peters, J., Mistry, M., Udwadia, F. E., Schaal, S.

In The 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005), Long Beach, CA, Sept. 24-28, 2005, clmc (inproceedings)

Abstract
Gauss' principle of least constraint and its generalizations have provided useful insights for the development of tracking controllers for mechanical systems (Udwadia, 2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with or without external constraints, holonomic or nonholonomic constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.

am

link (url) [BibTex]

Geckobot and waalbot: Small-scale wall climbing robots

Unver, O., Murphy, M., Sitti, M.

In Infotech@ Aerospace, pages: 6940, 2005 (incollection)

pi

[BibTex]

Fusion of biomedical microcapsule endoscope and microsystem technology

Kim, Tae Song, Kim, Byungkyu, Cho, Dongil Dan, Song, Si Young, Dario, P, Sitti, M

In Solid-State Sensors, Actuators and Microsystems, 2005. Digest of Technical Papers. TRANSDUCERS’05. The 13th International Conference on, 1, pages: 9-14, 2005 (inproceedings)

pi

[BibTex]

Atomic force microscope based two-dimensional assembly of micro/nanoparticles

Tafazzoli, A., Pawashe, C., Sitti, M.

In Assembly and Task Planning: From Nano to Macro Assembly and Manufacturing, 2005.(ISATP 2005). The 6th IEEE International Symposium on, pages: 230-235, 2005 (inproceedings)

pi

[BibTex]

Arm movement experiments with joint space force fields using an exoskeleton robot

Mistry, M., Mohajerian, P., Schaal, S.

In IEEE Ninth International Conference on Rehabilitation Robotics, pages: 408-413, Chicago, Illinois, June 28-July 1, 2005, clmc (inproceedings)

Abstract
A new experimental platform permits us to study a novel variety of issues of human motor control, particularly full 3-D movements involving the major seven degrees-of-freedom (DOF) of the human arm. We incorporate a seven DOF robot exoskeleton, and can minimize weight and inertia through gravity, Coriolis, and inertia compensation, such that subjects' arm movements are largely unaffected by the manipulandum. Torque perturbations can be individually applied to any or all seven joints of the human arm, thus creating novel dynamic environments, or force fields, for subjects to respond and adapt to. Our first study investigates a joint space force field where the shoulder velocity drives a disturbing force in the elbow joint. Results demonstrate that subjects learn to compensate for the force field within about 100 trials, and from the strong presence of aftereffects when removing the field in some randomized catch trials, that an inverse dynamics, or internal model, of the force field is formed by the nervous system. Interestingly, while post-learning hand trajectories return to baseline, joint space trajectories remained changed in response to the field, indicating that besides learning a model of the force field, the nervous system also chose to exploit the space to minimize the effects of the force field on the realization of the endpoint trajectory plan. Further applications for our apparatus include studies in motor system redundancy resolution and inverse kinematics, as well as rehabilitation.

am

link (url) [BibTex]

A unifying framework for the control of robotics systems

Peters, J., Mistry, M., Udwadia, F. E., Cory, R., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2005), pages: 1824-1831, Edmonton, Alberta, Canada, Aug. 2-6, 2005, clmc (inproceedings)

Abstract
Recently, [1] suggested to derive tracking controllers for mechanical systems using a generalization of Gauss' principle of least constraint. This method allows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or under-actuation, as well as open-chain and closed-chain kinematics.

am

link (url) [BibTex]

A new endoscopic microcapsule robot using beetle inspired microfibrillar adhesives

Cheung, E., Karagozler, M. E., Park, S., Kim, B., Sitti, M.

In Advanced Intelligent Mechatronics. Proceedings, 2005 IEEE/ASME International Conference on, pages: 551-557, 2005 (inproceedings)

pi

Project Page [BibTex]


2004


E. coli inspired propulsion for swimming microrobots

Behkam, B., Sitti, M.

In ASME 2004 International Mechanical Engineering Congress and Exposition, pages: 1037-1041, 2004 (inproceedings)

pi

Project Page [BibTex]

Dynamic modes of nanoparticle motion during nanoprobe-based manipulation

Tafazzoli, A., Sitti, M.

In Nanotechnology, 2004. 4th IEEE Conference on, pages: 35-37, 2004 (inproceedings)

pi

[BibTex]

Modeling and design of biomimetic adhesives inspired by gecko foot-hairs

Shah, G. J., Sitti, M.

In Robotics and Biomimetics, 2004. ROBIO 2004. IEEE International Conference on, pages: 873-878, 2004 (inproceedings)

pi

Project Page [BibTex]

Learning Composite Adaptive Control for a Class of Nonlinear Systems

Nakanishi, J., Farrell, J. A., Schaal, S.

In IEEE International Conference on Robotics and Automation, pages: 2647-2652, New Orleans, LA, USA, April 2004, 2004, clmc (inproceedings)

am

link (url) [BibTex]

Augmented reality user interface for nanomanipulation using atomic force microscopes

Vogl, W., Sitti, M., Ehrenstrasser, M., Zäh, M.

In Proc. of Eurohaptics, pages: 413-416, 2004 (inproceedings)

pi

[BibTex]

WaalBots for Space applications

Menon, C., Murphy, M., Angrilli, F., Sitti, M.

In 55th IAC Conference, Vancouver, Canada, 2004 (inproceedings)

pi

[BibTex]

A framework for learning biped locomotion with dynamic movement primitives

Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., Kawato, M.

In IEEE-RAS/RSJ International Conference on Humanoid Robots (Humanoids 2004), IEEE, Los Angeles, CA: Nov.10-12, Santa Monica, CA, 2004, clmc (inproceedings)

Abstract
This article summarizes our framework for learning biped locomotion using dynamical movement primitives based on nonlinear oscillators. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a frequency adaptation algorithm based on phase resetting and entrainment of coupled oscillators. Numerical simulations and experimental implementation on a physical robot demonstrate the effectiveness of the proposed locomotion controller. Furthermore, we demonstrate that phase resetting contributes to robustness against external perturbations and environmental changes by numerical simulations and experiments.

am

link (url) [BibTex]

Learning Motor Primitives with Reinforcement Learning

Peters, J., Schaal, S.

In Proceedings of the 11th Joint Symposium on Neural Computation, http://resolver.caltech.edu/CaltechJSNC:2004.poster020, 2004, clmc (inproceedings)

Abstract
One of the major challenges in action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation," or more precisely, motor primitives. Recently, Ijspeert et al. [1, 2] suggested a novel framework for using nonlinear dynamical systems as motor primitives. While a lot of progress has been made in teaching these motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this poster, we evaluate how different reinforcement learning approaches can be used in order to improve the performance of motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline how these lead to a novel algorithm which is based on natural policy gradients [3]. We compare this algorithm to previous reinforcement learning algorithms in the context of dynamic motor primitive learning, and show that it outperforms these by at least an order of magnitude. We demonstrate the efficiency of the resulting reinforcement learning method for creating complex behaviors for autonomous robotics. The studied behaviors will include both discrete, finite tasks such as baseball swings, as well as complex rhythmic patterns as they occur in biped locomotion.

am

[BibTex]

Dynamic behavior and simulation of nanoparticle sliding during nanoprobe-based positioning

Tafazzoli, A., Sitti, M.

In Proc. ASME International Mechanical Engineering Conference, 19, pages: 32, 2004 (inproceedings)

pi

[BibTex]

Three-dimensional nanoscale manipulation and manufacturing using proximal probes: controlled pulling of polymer micro/nanofibers

Nain, A. S., Amon, C., Sitti, M.

In Mechatronics, 2004. ICM’04. Proceedings of the IEEE International Conference on, pages: 224-230, 2004 (inproceedings)

pi

[BibTex]

Computational approaches to motor learning by imitation

Schaal, S., Ijspeert, A., Billard, A.

In The Neuroscience of Social Interaction, (1431):199-218, (Editors: Frith, C. D.;Wolpert, D.), Oxford University Press, Oxford, 2004, clmc (inbook)

Abstract
Movement imitation requires a complex set of mechanisms that map an observed movement of a teacher onto one's own movement apparatus. Relevant problems include movement recognition, pose estimation, pose tracking, body correspondence, coordinate transformation from external to egocentric space, matching of observed against previously learned movement, resolution of redundant degrees-of-freedom that are unconstrained by the observation, suitable movement representations for imitation, modularization of motor control, etc. All of these topics by themselves are active research problems in computational and neurobiological sciences, such that their combination into a complete imitation system remains a daunting undertaking - indeed, one could argue that we need to understand the complete perception-action loop. As a strategy to untangle the complexity of imitation, this paper will examine imitation purely from a computational point of view, i.e. we will review statistical and mathematical approaches that have been suggested for tackling parts of the imitation problem, and discuss their merits, disadvantages and underlying principles. Given the focus on action recognition of other contributions in this special issue, this paper will primarily emphasize the motor side of imitation, assuming that a perceptual system has already identified important features of a demonstrated movement and created their corresponding spatial information. Based on the formalization of motor control in terms of control policies and their associated performance criteria, useful taxonomies of imitation learning can be generated that clarify different approaches and future research directions.

am

link (url) [BibTex]

Micro-and nano-scale robotics

Sitti, M.

In American Control Conference, 2004. Proceedings of the 2004, 1, pages: 1-8, 2004 (inproceedings)

pi

[BibTex]

Gecko inspired surface climbing robots

Menon, C., Murphy, M., Sitti, M.

In Robotics and Biomimetics, 2004. ROBIO 2004. IEEE International Conference on, pages: 431-436, 2004 (inproceedings)

pi

Project Page [BibTex]


2003


Dynamic movement primitives - A framework for motor control in humans and humanoid robots

Schaal, S.

In The International Symposium on Adaptive Motion of Animals and Machines, Kyoto, Japan, March 4-8, 2003, March 2003, clmc (inproceedings)

Abstract
Sensory-motor integration is one of the key issues in robotics. In this paper, we propose an approach to rhythmic arm movement control that is synchronized with an external signal based on exploiting a simple neural oscillator network. Trajectory generation by the neural oscillator is a biologically inspired method that can allow us to generate a smooth and continuous trajectory. The parameter tuning of the oscillators is used to generate a synchronized movement with wide intervals. We adopted the method for the drumming task as an example task. By using this method, the robot can realize synchronized drumming with wide drumming intervals in real time. The paper also shows the experimental results of drumming by a humanoid robot.

am

link (url) [BibTex]

Bayesian backfitting

D’Souza, A., Vijayakumar, S., Schaal, S.

In Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003), Irvine, CA, May 2003, 2003, clmc (inproceedings)

Abstract
We present an algorithm aimed at addressing both computational and analytical intractability of Bayesian regression models which operate in very high-dimensional, usually underconstrained spaces. Several domains of research frequently provide such datasets, including chemometrics [2], and human movement analysis [1]. The literature in nonparametric statistics provides interesting solutions such as Backfitting [3] and Partial Least Squares [4], which are extremely robust and efficient, yet lack a probabilistic interpretation that could place them in the context of current research in statistical learning algorithms that emphasize the estimation of confidence, posterior distributions, and model complexity. In order to achieve numerical robustness and low computational cost, we first derive a novel Bayesian interpretation of Backfitting (BB) as a computationally efficient regression algorithm. BB's learning complexity scales linearly with the input dimensionality by decoupling inference among individual input dimensions. We embed BB in an efficient, locally variational model selection mechanism that automatically grows the number of backfitting experts in a mixture-of-experts regression model. We demonstrate the effectiveness of the algorithm in performing principled regularization of model complexity when fitting nonlinear manifolds while avoiding the numerical hazards associated with highly underconstrained problems. We also note that this algorithm appears applicable in various areas of neural computation, e.g., in abstract models of computational neuroscience, or implementations of statistical learning on artificial systems.

am

link (url) [BibTex]

Reinforcement learning for humanoid robotics

Peters, J., Vijayakumar, S., Schaal, S.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids2003), Karlsruhe, Germany, Sept.29-30, 2003, clmc (inproceedings)

Abstract
Reinforcement learning offers one of the most general frameworks to take traditional robotics towards true autonomy and versatility. However, applying reinforcement learning to high dimensional movement systems like humanoid robots remains an unsolved problem. In this paper, we discuss different approaches of reinforcement learning in terms of their applicability in humanoid robotics. Methods can be coarsely classified into three different categories, i.e., greedy methods, `vanilla' policy gradient methods, and natural gradient methods. We discuss that greedy methods are not likely to scale into the domain of humanoid robotics as they are problematic when used with function approximation. `Vanilla' policy gradient methods on the other hand have been successfully applied on real-world robots including at least one humanoid robot. We demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. A derivation of the natural policy gradient is provided, proving that the average policy gradient of Kakade (2002) is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges to the nearest local minimum of the cost function with respect to the Fisher information metric under suitable conditions. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and for learning nonlinear dynamic motor primitives for humanoid robot control. It offers a promising route for the development of reinforcement learning for truly high dimensionally continuous state-action systems.

am

link (url) [BibTex]

High aspect ratio polymer micro/nano-structure manufacturing using nanoembossing, nanomolding and directed self-assembly

Sitti, M.

In ASME 2003 International Mechanical Engineering Congress and Exposition, pages: 293-297, 2003 (inproceedings)

pi

[BibTex]

NSF workshop on future directions in nano-scale systems, dynamics and control

Sitti, M.

In Automatic Control Conference (ACC), 2003 (inproceedings)

pi

[BibTex]

3-D nano-fiber manufacturing by controlled pulling of liquid polymers using nano-probes

Nain, A. S., Sitti, M.

In Nanotechnology, 2003. IEEE-NANO 2003. 2003 Third IEEE Conference on, 1, pages: 60-63, 2003 (inproceedings)

pi

[BibTex]

Discovering imitation strategies through categorization of multi-dimensional data

Billard, A., Epars, Y., Schaal, S., Cheng, G.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, Oct. 27-31, 2003, clmc (inproceedings)

Abstract
An essential problem of imitation is that of determining "what to imitate", i.e. to determine which of the many features of the demonstration are relevant to the task and which should be reproduced. The strategy followed by the imitator can be modeled as a hierarchical optimization system, which minimizes the discrepancy between two multidimensional datasets. We consider imitation of a manipulation task. To classify across manipulation strategies, we apply a probabilistic analysis to data in Cartesian and joint spaces. We determine a general metric that optimizes the policy of task reproduction, following strategy determination. The model successfully discovers strategies in six different manipulation tasks and controls task reproduction by a full body humanoid robot. or the complete path followed by the demonstrator. We follow a similar taxonomy and apply it to the learning and reproduction of a manipulation task by a humanoid robot. We take the perspective that the features of the movements to imitate are those that appear most frequently, i.e. the invariants in time. The model builds upon previous work [3], [4] and is composed of a hierarchical time delay neural network that extracts invariant features from a manipulation task performed by a human demonstrator. The system analyzes the Cartesian trajectories of the objects and the joint

am

link (url) [BibTex]

Scaling reinforcement learning paradigms for motor learning

Peters, J., Vijayakumar, S., Schaal, S.

In Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003), Irvine, CA, May 2003, 2003, clmc (inproceedings)

Abstract
Reinforcement learning offers a general framework to explain reward related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high dimensional movement systems and mainly operate in discrete, low dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation, a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of unstabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid motor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade's 'average natural policy gradient' is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum in Riemannian space of the cost function. The algorithm outperforms nonnatural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensionally continuous state-action systems.

am

link (url) [BibTex]



Learning attractor landscapes for learning motor primitives

Ijspeert, A., Nakanishi, J., Schaal, S.

In Advances in Neural Information Processing Systems 15, pages: 1547-1554, (Editors: Becker, S.;Thrun, S.;Obermayer, K.), Cambridge, MA: MIT Press, 2003, clmc (inproceedings)

Abstract
If globally high-dimensional data has locally only low-dimensional distributions, it is advantageous to perform a local dimensionality reduction before further processing the data. In this paper we examine several techniques for local dimensionality reduction in the context of locally weighted linear regression. As possible candidates, we derive local versions of factor analysis regression, principal component regression, principal component regression on joint distributions, and partial least squares regression. After outlining the statistical bases of these methods, we perform Monte Carlo simulations to evaluate their robustness with respect to violations of their statistical assumptions. One surprising outcome is that locally weighted partial least squares regression offers the best average results, thus outperforming even factor analysis, the theoretically most appealing of our candidate techniques.

am

link (url) [BibTex]



Manufacturing of two and three-dimensional micro/nanostructures by integrating optical tweezers with chemical assembly

Castelino, K., Satyanarayana, S., Sitti, M.

In Nanotechnology, 2003. IEEE-NANO 2003. 2003 Third IEEE Conference on, 1, pages: 56-59, 2003 (inproceedings)

pi

[BibTex]



Learning from demonstration and adaptation of biped locomotion with dynamical movement primitives

Nakanishi, J., Morimoto, J., Endo, G., Schaal, S., Kawato, M.

In Workshop on Robot Learning by Demonstration, IEEE International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, NV, Oct. 27-31, 2003, clmc (inproceedings)

Abstract
In this paper, we report on our research on learning biped locomotion from human demonstration. Our ultimate goal is to establish a design principle for a controller that achieves natural, human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through the movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a novel frequency adaptation algorithm based on phase resetting and entrainment of oscillators. Numerical simulations demonstrate the effectiveness of the proposed locomotion controller.

am

link (url) [BibTex]
