Header logo is


2009


Thumb xl toc image
Full phase and amplitude control in computer-generated holography

Fratz, M., Fischer, P., Giel, D. M.

OPTICS LETTERS, 34(23):3659-3661, 2009 (article)

Abstract
We report what we believe to be the first realization of a computer-generated complex-valued hologram recorded in a single film of photoactive polymer. Complex-valued holograms give rise to a diffracted optical field with control over its amplitude and phase. The holograms are generated by a one-step direct laser writing process in which a spatial light modulator (SLM) is imaged onto a polymer film. Temporal modulation of the SLM during exposure controls both the strength of the induced birefringence and the orientation of the fast axis. We demonstrate that complex holograms can be used to impart arbitrary amplitude and phase profiles onto a beam and thereby open new possibilities in the control of optical beams. (C) 2009 Optical Society of America

pf

[BibTex]

2009


[BibTex]


Thumb xl toc image
Digital polarization holograms with defined magnitude and orientation of each pixel’s birefringence

Fratz, M., Giel, D. M., Fischer, P.

OPTICS LETTERS, 34(8):1270-1272, 2009 (article)

Abstract
A new form of digital polarization holography is demonstrated that permits both the amplitude and the phase of a diffracted beam to be independently controlled. This permits two independent intensity images to be stored in the same hologram. To fabricate the holograms, a birefringence with defined retardance and orientation of the fast axis is recorded into a photopolymer film. The holograms are selectively read out by choosing the polarization state of the read beam. Polarization holograms of this kind increase the data density in holographic data storage and allow higher quality diffractive optical elements to be written. (C) 2009 Optical Society of America

pf

[BibTex]


Thumb xl toc images
Controlled Propulsion of Artificial Magnetic Nanostructured Propellers

Ghosh, A., Fischer, P.

NANO LETTERS, 9(6):2243-2245, 2009, Featured highlight ‘Nanotechnology: The helix that delivers’ Nature 459, 13 (2009). (article)

Abstract
For biomedical applications, such as targeted drug delivery and microsurgery, it is essential to develop a system of swimmers that can be propelled wirelessly in fluidic environments with good control. Here, we report the construction and operation of chiral colloidal propellers that can be navigated in water with micrometer-level precision using homogeneous magnetic fields. The propellers are made via nanostructured surfaces and can be produced in large numbers. The nanopropellers can carry chemicals, push loads, and act as local probes in rheological measurements.

Featured highlight ‘Nanotechnology: The helix that delivers’ Nature 459, 13 (2009).

pf

Video - Nanospropellers DOI [BibTex]

Video - Nanospropellers DOI [BibTex]


Thumb xl toc image
Absolute Asymmetric Reduction Based on the Relative Orientation of Achiral Reactants

Kuhn, A., Fischer, P.

ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 48(37):6857-6860, 2009 (article)

pf

DOI [BibTex]

DOI [BibTex]


no image
Path integral-based stochastic optimal control for rigid body dynamics

Theodorou, E. A., Buchli, J., Schaal, S.

In Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL ’09. IEEE Symposium on, pages: 219-225, 2009, clmc (inproceedings)

Abstract
Recent advances on path integral stochastic optimal control [1],[2] provide new insights in the optimal control of nonlinear stochastic systems which are linear in the controls, with state independent and time invariant control transition matrix. Under these assumptions, the Hamilton-Jacobi-Bellman (HJB) equation is formulated and linearized with the use of the logarithmic transformation of the optimal value function. The resulting HJB is a linear second order partial differential equation which is solved by an approximation based on the Feynman-Kac formula [3]. In this work we review the theory of path integral control and derive the linearized HJB equation for systems with state dependent control transition matrix. In addition we derive the path integral formulation for the general class of systems with state dimensionality that is higher than the dimensionality of the controls. Furthermore, by means of a modified inverse dynamics controller, we apply path integral stochastic optimal control over the new control space. Simulations illustrate the theoretical results. Future developments and extensions are discussed.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Learning locomotion over rough terrain using terrain templates

Kalakrishnan, M., Buchli, J., Pastor, P., Schaal, S.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 167-172, 2009, clmc (inproceedings)

Abstract
We address the problem of foothold selection in robotic legged locomotion over very rough terrain. The difficulty of the problem we address here is comparable to that of human rock-climbing, where foot/hand-hold selection is one of the most critical aspects. Previous work in this domain typically involves defining a reward function over footholds as a weighted linear combination of terrain features. However, a significant amount of effort needs to be spent in designing these features in order to model more complex decision functions, and hand-tuning their weights is not a trivial task. We propose the use of terrain templates, which are discretized height maps of the terrain under a foothold on different length scales, as an alternative to manually designed features. We describe an algorithm that can simultaneously learn a small set of templates and a foothold ranking function using these templates, from expert-demonstrated footholds. Using the LittleDog quadruped robot, we experimentally show that the use of terrain templates can produce complex ranking functions with higher performance than standard terrain features, and improved generalization to unseen terrain.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Valero-Cuevas, F., Hoffmann, H., Kurse, M. U., Kutch, J. J., Theodorou, E. A.

IEEE Reviews in Biomedical Engineering – (All authors have equally contributed), (2):110?135, 2009, clmc (article)

Abstract
Computational models of the neuromuscular system hold the potential to allow us to reach a deeper understanding of neuromuscular function and clinical rehabilitation by complementing experimentation. By serving as a means to distill and explore specific hypotheses, computational models emerge from prior experimental data and motivate future experimental work. Here we review computational tools used to understand neuromuscular function including musculoskeletal modeling, machine learning, control theory, and statistical model analysis. We conclude that these tools, when used in combination, have the potential to further our understanding of neuromuscular function by serving as a rigorous means to test scientific hypotheses in ways that complement and leverage experimental data.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Compact models of motor primitive variations for predictible reaching and obstacle avoidance

Stulp, F., Oztop, E., Pastor, P., Beetz, M., Schaal, S.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids 2009), Paris, Dec.7-10, 2009, clmc (inproceedings)

Abstract
over and over again. This regularity allows humans and robots to reuse existing solutions for known recurring tasks. We expect that reusing a set of standard solutions to solve similar tasks will facilitate the design and on-line adaptation of the control systems of robots operating in human environments. In this paper, we derive a set of standard solutions for reaching behavior from human motion data. We also derive stereotypical reaching trajectories for variations of the task, in which obstacles are present. These stereotypical trajectories are then compactly represented with Dynamic Movement Primitives. On the humanoid robot Sarcos CB, this approach leads to reproducible, predictable, and human-like reaching motions.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Human optimization strategies under reward feedback

Hoffmann, H., Theodorou, E., Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Waikoloa, Hawaii, 2009, 2009, clmc (inproceedings)

Abstract
Many hypothesis on human movement generation have been cast into an optimization framework, implying that movements are adapted to optimize a single quantity, like, e.g., jerk, end-point variance, or control cost. However, we still do not understand how humans actually learn when given only a cost or reward feedback at the end of a movement. Such a reinforcement learning setting has been extensively explored theoretically in engineering and computer science, but in human movement control, hardly any experiment studied movement learning under reward feedback. We present experiments probing which computational strategies humans use to optimize a movement under a continuous reward function. We present two experimental paradigms. The first paradigm mimics a ball-hitting task. Subjects (n=12) sat in front of a computer screen and moved a stylus on a tablet towards an unknown target. This target was located on a line that the subjects had to cross. During the movement, visual feedback was suppressed. After the movement, a reward was displayed graphically as a colored bar. As reward, we used a Gaussian function of the distance between the target location and the point of line crossing. We chose such a function since in sensorimotor tasks, the cost or loss function that humans seem to represent is close to an inverted Gaussian function (Koerding and Wolpert 2004). The second paradigm mimics pocket billiards. On the same experimental setup as above, the computer screen displayed a pocket (two bars), a white disk, and a green disk. The goal was to hit with the white disk the green disk (as in a billiard collision), such that the green disk moved into the pocket. Subjects (n=8) manipulated with the stylus the white disk to effectively choose start point and movement direction. Reward feedback was implicitly given as hitting or missing the pocket with the green disk. In both paradigms, subjects increased the average reward over trials. The surprising result was that in these experiments, humans seem to prefer a strategy that uses a reward-weighted average over previous movements instead of gradient ascent. The literature on reinforcement learning is dominated by gradient-ascent methods. However, our computer simulations and theoretical analysis revealed that reward-weighted averaging is the more robust choice given the amount of movement variance observed in humans. Apparently, humans choose an optimization strategy that is suitable for their own movement variance.

am

[BibTex]

[BibTex]


no image
Bayesian Methods for Autonomous Learning Systems (Phd Thesis)

Ting, J.

Department of Computer Science, University of Southern California, Los Angeles, CA, 2009, clmc (phdthesis)

am

PDF [BibTex]

PDF [BibTex]


no image
On-line learning and modulation of periodic movements with nonlinear dynamical systems

Gams, A., Ijspeert, A., Schaal, S., Lenarčič, J.

Autonomous Robots, 27(1):3-23, 2009, clmc (article)

Abstract
Abstract  The paper presents a two-layered system for (1) learning and encoding a periodic signal without any knowledge on its frequency and waveform, and (2) modulating the learned periodic trajectory in response to external events. The system is used to learn periodic tasks on a humanoid HOAP-2 robot. The first layer of the system is a dynamical system responsible for extracting the fundamental frequency of the input signal, based on adaptive frequency oscillators. The second layer is a dynamical system responsible for learning of the waveform based on a built-in learning algorithm. By combining the two dynamical systems into one system we can rapidly teach new trajectories to robots without any knowledge of the frequency of the demonstration signal. The system extracts and learns only one period of the demonstration signal. Furthermore, the trajectories are robust to perturbations and can be modulated to cope with a dynamic environment. The system is computationally inexpensive, works on-line for any periodic signal, requires no additional signal processing to determine the frequency of the input signal and can be applied in parallel to multiple dimensions. Additionally, it can adapt to changes in frequency and shape, e.g. to non-stationary signals, such as hand-generated signals and human demonstrations.

am

link (url) [BibTex]

link (url) [BibTex]


no image
The SL simulation and real-time control software package

Schaal, S.

University of Southern California, Los Angeles, CA, 2009, clmc (techreport)

Abstract
SL was originally developed as a Simulation Laboratory software package to allow creating complex rigid-body dynamics simulations with minimal development times. It was meant to complement a real-time robotics setup such that robot programs could first be debugged in simulation before trying them on the actual robot. For this purpose, the motor control setup of SL was copied from our experience with real-time robot setups with vxWorks (Windriver Systems, Inc.)Ñindeed, more than 90% of the code is identical to the actual robot software, as will be explained later in detail. As a result, SL is divided into three software components: 1) the generic code that is shared by the actual robot and the simulation, 2) the robot specific code, and 3) the simulation specific code. The robot specific code is tailored to the robotic environments that we have experienced over the years, in particular towards VME-based multi-processor real-time operating systems. The simulation specific code has all the components for OpenGL graphics simulations and mimics the robot multi-processor environment in simple C-code. Importantly, SL can be used stand-alone for creating graphics an-imationsÑthe heritage from real-time robotics does not restrict the complexity of possible simulations. This technical report describes SL in detail and can serve as a manual for new users of SL.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Local dimensionality reduction for non-parametric regression

Hoffman, H., Schaal, S., Vijayakumar, S.

Neural Processing Letters, 2009, clmc (article)

Abstract
Locally-weighted regression is a computationally-efficient technique for non-linear regression. However, for high-dimensional data, this technique becomes numerically brittle and computationally too expensive if many local models need to be maintained simultaneously. Thus, local linear dimensionality reduction combined with locally-weighted regression seems to be a promising solution. In this context, we review linear dimensionality-reduction methods, compare their performance on nonparametric locally-linear regression, and discuss their ability to extend to incremental learning. The considered methods belong to the following three groups: (1) reducing dimensionality only on the input data, (2) modeling the joint input-output data distribution, and (3) optimizing the correlation between projection directions and output data. Group 1 contains principal component regression (PCR); group 2 contains principal component analysis (PCA) in joint input and output space, factor analysis, and probabilistic PCA; and group 3 contains reduced rank regression (RRR) and partial least squares (PLS) regression. Among the tested methods, only group 3 managed to achieve robust performance even for a non-optimal number of components (factors or projection directions). In contrast, group 1 and 2 failed for fewer components since these methods rely on the correct estimate of the true intrinsic dimensionality. In group 3, PLS is the only method for which a computationally-efficient incremental implementation exists. Thus, PLS appears to be ideally suited as a building block for a locally-weighted regressor in which projection directions are incrementally added on the fly.

am

link (url) [BibTex]

link (url) [BibTex]


no image
The SL simulation and real-time control software package

Schaal, S.

University of Southern California, Los Angeles, CA, 2009, clmc (techreport)

Abstract
SL was originally developed as a Simulation Laboratory software package to allow creating complex rigid-body dynamics simulations with minimal development times. It was meant to complement a real-time robotics setup such that robot programs could first be debugged in simulation before trying them on the actual robot. For this purpose, the motor control setup of SL was copied from our experience with real-time robot setups with vxWorks (Windriver Systems, Inc.)â??indeed, more than 90% of the code is identical to the actual robot software, as will be explained later in detail. As a result, SL is divided into three software components: 1) the generic code that is shared by the actual robot and the simulation, 2) the robot specific code, and 3) the simulation specific code. The robot specific code is tailored to the robotic environments that we have experienced over the years, in particular towards VME-based multi-processor real-time operating systems. The simulation specific code has all the components for OpenGL graphics simulations and mimics the robot multi-processor environment in simple C-code. Importantly, SL can be used stand-alone for creating graphics an-imationsâ??the heritage from real-time robotics does not restrict the complexity of possible simulations. This technical report describes SL in detail and can serve as a manual for new users of SL.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Learning and generalization of motor skills by learning from demonstration

Pastor, P., Hoffmann, H., Asfour, T., Schaal, S.

In International Conference on Robotics and Automation (ICRA2009), Kobe, Japan, May 12-19, 2009, 2009, clmc (inproceedings)

Abstract
We provide a general approach for learning robotic motor skills from human demonstration. To represent an observed movement, a non-linear differential equation is learned such that it reproduces this movement. Based on this representation, we build a library of movements by labeling each recorded movement according to task and context (e.g., grasping, placing, and releasing). Our differential equation is formulated such that generalization can be achieved simply by adapting a start and a goal parameter in the equation to the desired position values of a movement. For object manipulation, we present how our framework extends to the control of gripper orientation and finger position. The feasibility of our approach is demonstrated in simulation as well as on a real robot. The robot learned a pick-and-place operation and a water-serving task and could generalize these tasks to novel situations.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Compliant quadruped locomotion over rough terrain

Buchli, J., Kalakrishnan, M., Mistry, M., Pastor, P., Schaal, S.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 814-820, 2009, clmc (inproceedings)

Abstract
Many critical elements for statically stable walking for legged robots have been known for a long time, including stability criteria based on support polygons, good foothold selection, recovery strategies to name a few. All these criteria have to be accounted for in the planning as well as the control phase. Most legged robots usually employ high gain position control, which means that it is crucially important that the planned reference trajectories are a good match for the actual terrain, and that tracking is accurate. Such an approach leads to conservative controllers, i.e. relatively low speed, ground speed matching, etc. Not surprisingly such controllers are not very robust - they are not suited for the real world use outside of the laboratory where the knowledge of the world is limited and error prone. Thus, to achieve robust robotic locomotion in the archetypical domain of legged systems, namely complex rough terrain, where the size of the obstacles are in the order of leg length, additional elements are required. A possible solution to improve the robustness of legged locomotion is to maximize the compliance of the controller. While compliance is trivially achieved by reduced feedback gains, for terrain requiring precise foot placement (e.g. climbing rocks, walking over pegs or cracks) compliance cannot be introduced at the cost of inferior tracking. Thus, model-based control and - in contrast to passive dynamic walkers - active balance control is required. To achieve these objectives, in this paper we add two crucial elements to legged locomotion, i.e., floating-base inverse dynamics control and predictive force control, and we show that these elements increase robustness in face of unknown and unanticipated perturbations (e.g. obstacles). Furthermore, we introduce a novel line-based COG trajectory planner, which yields a simpler algorithm than traditional polygon based methods and creates the appropriate input to our control system.We show results from bot- h simulation and real world of a robotic dog walking over non-perceived obstacles and rocky terrain. The results prove the effectivity of the inverse dynamics/force controller. The presented results show that we have all elements needed for robust all-terrain locomotion, which should also generalize to other legged systems, e.g., humanoid robots.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Incorporating Muscle Activation-Contraction dynamics to an optimal control framework for finger movements

Theodorou, Evangelos A., Valero-Cuevas, Francisco J.

Abstracts of Neural Control of Movement Conference (NCM 2009), 2009, clmc (article)

Abstract
Recent experimental and theoretical work [1] investigated the neural control of contact transition between motion and force during tapping with the index finger as a nonlinear optimization problem. Such transitions from motion to well-directed contact force are a fundamental part of dexterous manipulation. There are 3 alternative hypotheses of how this transition could be accomplished by the nervous system as a function of changes in direction and magnitude of the torque vector controlling the finger. These hypotheses are 1) an initial change in direction with a subsequent change in magnitude of the torque vector; 2) an initial change in magnitude with a subsequent directional change of the torque vector; and 3) a simultaneous and proportionally equal change of both direction and magnitude of the torque vector. Experimental work in [2] shows that the nervous system selects the first strategy, and in [1] we suggest that this may in fact be the optimal strategy. In [4] the framework of Iterative Linear Quadratic Optimal Regulator (ILQR) was extended to incorporate motion and force control. However, our prior simulation work assumed direct and instantaneous control of joint torques, which ignores the known delays and filtering properties of skeletal muscle. In this study, we implement an ILQR controller for a more biologically plausible biomechanical model of the index finger than [4], and add activation-contraction dynamics to the system to simulate muscle function. The planar biomechanical model includes the kinematics of the 3 joints while the applied torques are driven by activation?contraction dynamics with biologically plausible time constants [3]. In agreement with our experimental work [2], the task is to, within 500 ms, move the finger from a given resting configuration to target configuration with a desired terminal velocity. ILQR does not only stabilize the finger dynamics according to the objective function, but it also generates smooth joint space trajectories with minimal tuning and without an a-priori initial control policy (which is difficult to find for highly dimensional biomechanical systems). Furthemore, the use of this optimal control framework and the addition of activation-contraction dynamics considers the full nonlinear dynamics of the index finger and produces a sequence of postures which are compatible with experimental motion data [2]. These simulations combined with prior experimental results suggest that optimal control is a strong candidate for the generation of finger movements prior to abrupt motion-to-force transitions. This work is funded in part by grants NIH R01 0505520 and NSF EFRI-0836042 to Dr. Francisco J. Valero- Cuevas 1 Venkadesan M, Valero-Cuevas FJ. 
Effects of neuromuscular lags on controlling contact transitions. 
Philosophical Transactions of the Royal Society A: 2008. 2 Venkadesan M, Valero-Cuevas FJ. 
Neural Control of Motion-to-Force Transitions with the Fingertip. 
J. Neurosci., Feb 2008; 28: 1366 - 1373; 3 Zajac. Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Crit Rev Biomed Eng, 17 4. Weiwei Li., Francisco Valero Cuevas: ?Linear Quadratic Optimal Control of Contact Transition with Fingertip ? ACC 2009

am

PDF [BibTex]

PDF [BibTex]


no image
Inertial parameter estimation of floating-base humanoid systems using partial force sensing

Mistry, M., Schaal, S., Yamane, K.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids 2009), Paris, Dec.7-10, 2009, clmc (inproceedings)

Abstract
Recently, several controllers have been proposed for humanoid robots which rely on full-body dynamic models. The estimation of inertial parameters from data is a critical component for obtaining accurate models for control. However, floating base systems, such as humanoid robots, incur added challenges to this task (e.g. contact forces must be measured, contact states can change, etc.) In this work, we outline a theoretical framework for whole body inertial parameter estimation, including the unactuated floating base. Using a least squares minimization approach, conducted within the nullspace of unmeasured degrees of freedom, we are able to use a partial force sensor set for full-body estimation, e.g. using only joint torque sensors, allowing for estimation when contact force measurement is unavailable or unreliable (e.g. due to slipping, rolling contacts, etc.). We also propose how to determine the theoretical minimum force sensor set for full body estimation, and discuss the practical limitations of doing so.

am

link (url) [BibTex]

link (url) [BibTex]


no image
On-line learning and modulation of periodic movements with nonlinear dynamical systems

Gams, A., Ijspeert, A., Schaal, S., Lenarčič, J.

Autonomous Robots, 27(1):3-23, 2009, clmc (article)

Abstract
Abstract  The paper presents a two-layered system for (1) learning and encoding a periodic signal without any knowledge on its frequency and waveform, and (2) modulating the learned periodic trajectory in response to external events. The system is used to learn periodic tasks on a humanoid HOAP-2 robot. The first layer of the system is a dynamical system responsible for extracting the fundamental frequency of the input signal, based on adaptive frequency oscillators. The second layer is a dynamical system responsible for learning of the waveform based on a built-in learning algorithm. By combining the two dynamical systems into one system we can rapidly teach new trajectories to robots without any knowledge of the frequency of the demonstration signal. The system extracts and learns only one period of the demonstration signal. Furthermore, the trajectories are robust to perturbations and can be modulated to cope with a dynamic environment. The system is computationally inexpensive, works on-line for any periodic signal, requires no additional signal processing to determine the frequency of the input signal and can be applied in parallel to multiple dimensions. Additionally, it can adapt to changes in frequency and shape, e.g. to non-stationary signals, such as hand-generated signals and human demonstrations.

am

link (url) [BibTex]

link (url) [BibTex]

2008


no image
Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in com- plex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for opertional space control as a direct inverse model learning problem. A first important insight for this paper is that a physically cor- rect solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function as- sociated with this optimal control problem allows us to formulate a learn- ing algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corre- sponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The applica- tion to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on real, physical Mitsubishi PA-10 medical robotics arm.

am ei

link (url) DOI [BibTex]

2008


link (url) DOI [BibTex]


Thumb xl toc image
Voltage-Controllable Magnetic Composite Based on Multifunctional Polyethylene Microparticles

Ghosh, A., Sheridon, N. K., Fischer, P.

SMALL, 4(11):1956-1958, 2008 (article)

pf

DOI [BibTex]


no image
Adaptation to a sub-optimal desired trajectory

M. Mistry, E. A. G. L. T. Y. S. S. M. K.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

am

PDF [BibTex]

PDF [BibTex]


no image
Human movement generation based on convergent flow fields: A computational model and a behavioral experiment

Hoffmann, H., Schaal, S.

In Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (inproceedings)

am

link (url) [BibTex]

link (url) [BibTex]


no image
Operational space control: A theoretical and emprical comparison

Nakanishi, J., Cory, R., Mistry, M., Peters, J., Schaal, S.

International Journal of Robotics Research, 27(6):737-757, 2008, clmc (article)

Abstract
Dexterous manipulation with a highly redundant movement system is one of the hallmarks of hu- man motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space con- trol in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics imple- mentations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields

Park, D., Hoffmann, H., Pastor, P., Schaal, S.

In IEEE International Conference on Humanoid Robots, 2008., 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
The dual role of uncertainty in force field learning

Mistry, M., Theodorou, E., Hoffmann, H., Schaal, S.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract
Force field experiments have been a successful paradigm for studying the principles of planning, execution, and learning in human arm movements. Subjects have been shown to cope with the disturbances generated by force fields by learning internal models of the underlying dynamics to predict disturbance effects or by increasing arm impedance (via co-contraction) if a predictive approach becomes infeasible. Several studies have addressed the issue uncertainty in force field learning. Scheidt et al. demonstrated that subjects exposed to a viscous force field of fixed structure but varying strength (randomly changing from trial to trial), learn to adapt to the mean disturbance, regardless of the statistical distribution. Takahashi et al. additionally show a decrease in strength of after-effects after learning in the randomly varying environment. Thus they suggest that the nervous system adopts a dual strategy: learning an internal model of the mean of the random environment, while simultaneously increasing arm impedance to minimize the consequence of errors. In this study, we examine what role variance plays in the learning of uncertain force fields. We use a 7 degree-of-freedom exoskeleton robot as a manipulandum (Sarcos Master Arm, Sarcos, Inc.), and apply a 3D viscous force field of fixed structure and strength randomly selected from trial to trial. Additionally, in separate blocks of trials, we alter the variance of the randomly selected strength multiplier (while keeping a constant mean). In each block, after sufficient learning has occurred, we apply catch trials with no force field and measure the strength of after-effects. As expected in higher variance cases, results show increasingly smaller levels of after-effects as the variance is increased, thus implying subjects choose the robust strategy of increasing arm impedance to cope with higher levels of uncertainty. Interestingly, however, subjects show an increase in after-effect strength with a small amount of variance as compared to the deterministic (zero variance) case. This result implies that a small amount of variability aides in internal model formation, presumably a consequence of the additional amount of exploration conducted in the workspace of the task.

am

[BibTex]

[BibTex]


no image
Dynamic movement primitives for movement generation motivated by convergent force fields in frog

Hoffmann, H., Pastor, P., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
Efficient inverse kinematics algorithms for highdimensional movement systems

Tevatia, G., Schaal, S.

CLMC Technical Report: TR-CLMC-2008-1, 2008, clmc (techreport)

Abstract
Real-time control of the endeffector of a humanoid robot in external coordinates requires computationally efficient solutions of the inverse kinematics problem. In this context, this paper investigates methods of resolved motion rate control (RMRC) that employ optimization criteria to resolve kinematic redundancies. In particular we focus on two established techniques, the pseudo inverse with explicit optimization and the extended Jacobian method. We prove that the extended Jacobian method includes pseudo-inverse methods as a special solution. In terms of computational complexity, however, pseudo-inverse and extended Jacobian differ significantly in favor of pseudo-inverse methods. Employing numerical estimation techniques, we introduce a computationally efficient version of the extended Jacobian with performance comparable to the original version. Our results are illustrated in simulation studies with a multiple degree-offreedom robot, and were evaluated on an actual 30 degree-of-freedom full-body humanoid robot.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Behavioral experiments on reinforcement learning in human motor control

Hoffmann, H., Theodorou, E., Schaal, S.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract
Reinforcement learning (RL) - learning solely based on reward or cost feedback - is widespread in robotics control and has been also suggested as computational model for human motor control. In human motor control, however, hardly any experiment studied reinforcement learning. Here, we study learning based on visual cost feedback in a reaching task and did three experiments: (1) to establish a simple enough experiment for RL, (2) to study spatial localization of RL, and (3) to study the dependence of RL on the cost function. In experiment (1), subjects sit in front of a drawing tablet and look at a screen onto which the drawing pen's position is projected. Beginning from a start point, their task is to move with the pen through a target point presented on screen. Visual feedback about the pen's position is given only before movement onset. At the end of a movement, subjects get visual feedback only about the cost of this trial. We choose as cost the squared distance between target and virtual pen position at the target line. Above a threshold value, the cost was fixed at this value. In the mapping of the pen's position onto the screen, we added a bias (unknown to subject) and Gaussian noise. As result, subjects could learn the bias, and thus, showed reinforcement learning. In experiment (2), we randomly altered the target position between three different locations (three different directions from start point: -45, 0, 45). For each direction, we chose a different bias. As result, subjects learned all three bias values simultaneously. Thus, RL can be spatially localized. In experiment (3), we varied the sensitivity of the cost function by multiplying the squared distance with a constant value C, while keeping the same cut-off threshold. As in experiment (2), we had three target locations. We assigned to each location a different C value (this assignment was randomized between subjects). Since subjects learned the three locations simultaneously, we could directly compare the effect of the different cost functions. As result, we found an optimal C value; if C was too small (insensitive cost), learning was slow; if C was too large (narrow cost valley), the exploration time was longer and learning delayed. Thus, reinforcement learning in human motor control appears to be sen

am

[BibTex]

[BibTex]


no image
Movement generation by learning from demonstration and generalization to new targets

Pastor, P., Hoffmann, H., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), 2008, clmc (inproceedings)

am

PDF [BibTex]

PDF [BibTex]


no image
Combining dynamic movement primitives and potential fields for online obstacle avoidance

Park, D., Hoffmann, H., Schaal, S.

In Adaptive Motion of Animals and Machines (AMAM), Cleveland, Ohio, 2008, 2008, clmc (inproceedings)

am

link (url) [BibTex]

link (url) [BibTex]


no image
A library for locally weighted projection regression

Klanke, S., Vijayakumar, S., Schaal, S.

Journal of Machine Learning Research, 9, pages: 623-626, 2008, clmc (article)

Abstract
In this paper we introduce an improved implementation of locally weighted projection regression (LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data. As the key features, our code supports multi-threading, is available for multiple platforms, and provides wrappers for several programming languages.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Computational model for movement learning under uncertain cost

Theodorou, E., Hoffmann, H., Mistry, M., Schaal, S.

In Abstracts of the Society of Neuroscience Meeting (SFN 2008), Washington, DC 2008, 2008, clmc (inproceedings)

Abstract
Stochastic optimal control is a framework for computing control commands that lead to an optimal behavior under a given cost. Despite the long history of optimal control in engineering, it has been only recently applied to describe human motion. So far, stochastic optimal control has been mainly used in tasks that are already learned, such as reaching to a target. For learning, however, there are only few cases where optimal control has been applied. The main assumptions of stochastic optimal control that restrict its application to tasks after learning are the a priori knowledge of (1) a quadratic cost function (2) a state space model that captures the kinematics and/or dynamics of musculoskeletal system and (3) a measurement equation that models the proprioceptive and/or exteroceptive feedback. Under these assumptions, a sequence of control gains is computed that is optimal with respect to the prespecified cost function. In our work, we relax the assumption of the a priori known cost function and provide a computational framework for modeling tasks that involve learning. Typically, a cost function consists of two parts: one part that models the task constraints, like squared distance to goal at movement endpoint, and one part that integrates over the squared control commands. In learning a task, the first part of this cost function will be adapted. We use an expectation-maximization scheme for learning: the expectation step optimizes the task constraints through gradient descent of a reward function and the maximizing step optimizes the control commands. Our computational model is tested and compared with data given from a behavioral experiment. In this experiment, subjects sit in front of a drawing tablet and look at a screen onto which the drawing-pen's position is projected. Beginning from a start point, their task is to move with the pen through a target point presented on screen. Visual feedback about the pen's position is given only before movement onset. At the end of a movement, subjects get visual feedback only about the cost of this trial. In the mapping of the pen's position onto the screen, we added a bias (unknown to subject) and Gaussian noise. Therefore the cost is a function of this bias. The subjects were asked to reach to the target and minimize this cost over trials. In this behavioral experiment, subjects could learn the bias and thus showed reinforcement learning. With our computational model, we could model the learning process over trials. Particularly, the dependence on parameters of the reward function (Gaussian width) and the modulation of movement variance over time were similar in experiment and model.

am

[BibTex]

[BibTex]


no image
Optimization strategies in human reinforcement learning

Hoffmann, H., Theodorou, E., Schaal, S.

Advances in Computational Motor Control VII, Symposium at the Society for Neuroscience Meeting, Washington DC, 2008, 2008, clmc (article)

am

PDF [BibTex]

PDF [BibTex]


no image
A Bayesian approach to empirical local linearizations for robotics

Ting, J., D’Souza, A., Vijayakumar, S., Schaal, S.

In International Conference on Robotics and Automation (ICRA2008), Pasadena, CA, USA, May 19-23, 2008, 2008, clmc (inproceedings)

Abstract
Local linearizations are ubiquitous in the control of robotic systems. Analytical methods, if available, can be used to obtain the linearization, but in complex robotics systems where the the dynamics and kinematics are often not faithfully obtainable, empirical linearization may be preferable. In this case, it is important to only use data for the local linearization that lies within a ``reasonable'' linear regime of the system, which can be defined from the Hessian at the point of the linearization -- a quantity that is not available without an analytical model. We introduce a Bayesian approach to solve statistically what constitutes a ``reasonable'' local regime. We approach this problem in the context local linear regression. In contrast to previous locally linear methods, we avoid cross-validation or complex statistical hypothesis testing techniques to find the appropriate local regime. Instead, we treat the parameters of the local regime probabilistically and use approximate Bayesian inference for their estimation. This approach results in an analytical set of iterative update equations that are easily implemented on real robotics systems for real-time applications. As in other locally weighted regressions, our algorithm also lends itself to complete nonlinear function approximation for learning empirical internal models. We sketch the derivation of our Bayesian method and provide evaluations on synthetic data and actual robot data where the analytical linearization was known.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Do humans plan continuous trajectories in kinematic coordinates?

Hoffmann, H., Schaal, S.

In Abstracts of the Society of Neuroscience Meeting (SFN 2008), Washington, DC 2008, 2008, clmc (inproceedings)

Abstract
The planning and execution of human arm movements is still unresolved. An ongoing controversy is whether we plan a movement in kinematic coordinates and convert these coordinates with an inverse internal model into motor commands (like muscle activation) or whether we combine a few muscle synergies or equilibrium points to move a hand, e.g., between two targets. The first hypothesis implies that a planner produces a desired end-effector position for all time points; the second relies on the dynamics of the muscular-skeletal system for a given control command to produce a continuous end-effector trajectory. To distinguish between these two possibilities, we use a visuomotor adaptation experiment. Subjects moved a pen on a graphics tablet and observed the pen's mapped position onto a screen (subjects quickly adapted to this mapping). The task was to move a cursor between two points in a given time window. In the adaptation test, we manipulated the velocity profile of the cursor feedback such that the shape of the trajectories remained unchanged (for straight paths). If humans would use a kinematic plan and map at each time the desired end-effector position onto control commands, subjects should adapt to the above manipulation. In a similar experiment, Wolpert et al (1995) showed adaptation to changes in the curvature of trajectories. This result, however, cannot rule out a shift of an equilibrium point or an additional synergy activation between start and end point of a movement. In our experiment, subjects did two sessions, one control without and one with velocity-profile manipulation. To skew the velocity profile of the cursor trajectory, we added to the current velocity, v, the function 0.8*v*cos(pi + pi*x), where x is the projection of the cursor position onto the start-goal line divided by the distance start to goal (x=0 at the start point). As result, subjects did not adapt to this manipulation: for all subjects, the true hand motion was not significantly modified in a direction consistent with adaptation, despite that the visually presented motion differed significantly from the control motion. One may still argue that this difference in motion was insufficient to be processed visually. Thus, as a control experiment, we replayed control and modified motions to the subjects and asked which of the two motions appeared 'more natural'. Subjects chose the unperturbed motion as more natural significantly better than chance. In summary, for a visuomotor transformation task, the hypothesis of a planned continuous end-effector trajectory predicts adaptation to a modified velocity profile. The current experiment found no adaptation under such transformation.

am

[BibTex]

[BibTex]

2002


no image
Forward models in visuomotor control

Mehta, B., Schaal, S.

J Neurophysiol, 88(2):942-53, August 2002, clmc (article)

Abstract
In recent years, an increasing number of research projects investigated whether the central nervous system employs internal models in motor control. While inverse models in the control loop can be identified more readily in both motor behavior and the firing of single neurons, providing direct evidence for the existence of forward models is more complicated. In this paper, we will discuss such an identification of forward models in the context of the visuomotor control of an unstable dynamic system, the balancing of a pole on a finger. Pole balancing imposes stringent constraints on the biological controller, as it needs to cope with the large delays of visual information processing while keeping the pole at an unstable equilibrium. We hypothesize various model-based and non-model-based control schemes of how visuomotor control can be accomplished in this task, including Smith Predictors, predictors with Kalman filters, tapped-delay line control, and delay-uncompensated control. Behavioral experiments with human participants allow exclusion of most of the hypothesized control schemes. In the end, our data support the existence of a forward model in the sensory preprocessing loop of control. As an important part of our research, we will provide a discussion of when and how forward models can be identified and also the possible pitfalls in the search for forward models in control.

am

link (url) [BibTex]

2002


link (url) [BibTex]


Thumb xl toc images
Chirality-specific nonlinear spectroscopies in isotropic media

Fischer, P., Albrecht, A.

BULLETIN OF THE CHEMICAL SOCIETY OF JAPAN, 75(5):1119-1124, 2002, 10th International Conference on Time-Resolved Vibrational Spectroscopy (TRVS 2001), OKAZZAKI, JAPAN, MAY 21-25, 2001 (article)

Abstract
Sum or difference frequency generation (SFG or DFG) in isotropic media is in the electric-dipole approximation only symmetry allowed for optically active systems. The hyperpolarizability giving rise to these three-wave mixing processes features only one isotropic component. It factorizes into two terms, an energy (denominator) factor and a triple product of transition moments. These forbid degenerate SFG, i.e., second harmonic generation, as well as the existence of the linear electrooptic effect (Pockels effect) in isotropic media. This second order response also has no static limit, which leads to particularly strong resonance phenomena that are qualitatively different from those usually seen in the ubiquitous even-wave mixing spectroscopies. In particular, the participation of two (not the usual one) excited states is essential to achieve dramatic resonance enhancement, We report our first efforts to see such resonantly enhanced chirality specific SFG.

pf

DOI [BibTex]

DOI [BibTex]


Thumb xl toc image
The chiral specificity of sum-frequency generation in solutions

Fischer, P., Beckwitt, K., Wise, F., Albrecht, A.

CHEMICAL PHYSICS LETTERS, 352(5-6):463-468, 2002 (article)

Abstract
Sum-frequency generation in isotropic media is in the electric-dipole approximation the only symmetry allowed for chiral systems. We demonstrate that the sum-frequency intensity from an optically active liquid depends quadratically on the difference in concentration of the two enantiomers. The dominant contribution to the signal is found to be due to the chirality specific electric-dipolar three-wave mixing nonlinearity. Selecting the polarization of all fields allows the chiral electric-dipolar contributions to the bulk sum-frequency signal to be discerned from any achiral magnetic-dipolar and electric-quadrupolar contributions. (C) 2002 Published by Elsevier Science B.V.

pf

DOI [BibTex]

DOI [BibTex]


Thumb xl toc image
On optical rectification in isotropic media

Fischer, P., Albrecht, A.

LASER PHYSICS, 12(8):1177-1181, 2002 (article)

Abstract
Coherent nonlinear optical processes at second-order are only electric-dipole allowed in isotropic media that are optically active. Sum-frequency generation in chiral liquids has recently been observed, and difference-frequency and optical rectification have been predicted to exist in isotropic chiral media. Both Rayleigh-Schrodinger perturbation theory and the density matrix approach are used to discuss the quantum-chemical basis of optical rectification in optically active liquids. For pinene we compute the corresponding orientationally averaged hyperpolarizability, and estimate the light-induced dc electric polarization and the consequent voltage across a measuring capacitor it may give rise to near resonance.

pf

[BibTex]

[BibTex]


no image
Learning rhythmic movements by demonstration using nonlinear oscillators

Ijspeert, J. A., Nakanishi, J., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2002), pages: 958-963, Piscataway, NJ: IEEE, Lausanne, Sept.30-Oct.4 2002, 2002, clmc (inproceedings)

Abstract
Locally weighted learning (LWL) is a class of statistical learning techniques that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional beliefs that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested in up to 50 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing of a humanoid robot arm, and inverse-dynamics learning for a seven degree-of-freedom robot.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Learning robot control

Schaal, S.

In The handbook of brain theory and neural networks, 2nd Edition, pages: 983-987, 2, (Editors: Arbib, M. A.), MIT Press, Cambridge, MA, 2002, clmc (inbook)

Abstract
This is a review article on learning control in robots.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Arm and hand movement control

Schaal, S.

In The handbook of brain theory and neural networks, 2nd Edition, pages: 110-113, 2, (Editors: Arbib, M. A.), MIT Press, Cambridge, MA, 2002, clmc (inbook)

Abstract
This is a review article on computational and biological research on arm and hand control.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Scalable techniques from nonparameteric statistics for real-time robot learning

Schaal, S., Atkeson, C. G., Vijayakumar, S.

Applied Intelligence, 17(1):49-60, 2002, clmc (article)

Abstract
Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on up to 90 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing by a humanoid robot arm, and inverse-dynamics learning for a seven and a 30 degree-of-freedom robot. In all these examples, the application of our statistical neural networks techniques allowed either faster or more accurate acquisition of motor control than classical control engineering.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Movement imitation with nonlinear dynamical systems in humanoid robots

Ijspeert, J. A., Nakanishi, J., Schaal, S.

In International Conference on Robotics and Automation (ICRA2002), Washinton, May 11-15 2002, 2002, clmc (inproceedings)

Abstract
Locally weighted learning (LWL) is a class of statistical learning techniques that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL, memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional beliefs that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested in up to 50 dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing of a humanoid robot arm, and inverse-dynamics learning for a seven degree-of-freedom robot.

am

link (url) [BibTex]

link (url) [BibTex]


no image
A locally weighted learning composite adaptive controller with structure adaptation

Nakanishi, J., Farrell, J. A., Schaal, S.

In IEEE International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, Sept.30-Oct.4 2002, 2002, clmc (inproceedings)

Abstract
This paper introduces a provably stable adaptive learning controller which employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the pro-posed learning adaptive control algorithm uses both the tracking error and the estimation error to up-date the parameters. We provide Lyapunov analyses that demonstrate the stability properties of the learning controller. Numerical simulations illustrate rapid convergence of the tracking error and the automatic structure adaptation capability of the function approximator. This paper introduces a provably stable adaptive learning controller which employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the pro-posed learning adaptive control algorithm uses both the tracking error and the estimation error to up-date the parameters. We provide Lyapunov analyses that demonstrate the stability properties of the learning controller. Numerical simulations illustrate rapid convergence of the tracking error and the automatic structure adaptation capability of the function approximator

am

link (url) [BibTex]

link (url) [BibTex]