Büchler, Dieter (2019)
Robot Learning for Muscular Systems.
Technische Universität Darmstadt
doi: 10.25534/tuprints-00017210
Ph.D. Thesis, Primary publication, Publisher's Version
Text: phdthesis_buechler.pdf (11MB)
Copyright Information: CC BY-SA 4.0 International - Creative Commons, Attribution ShareAlike.
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | Robot Learning for Muscular Systems | ||||
Language: | English | ||||
Referees: | Peters, Prof. Dr. Jan ; Asfour, Prof. Dr. Tamim | ||||
Date: | 17 December 2019 | ||||
Place of Publication: | Darmstadt | ||||
Date of oral examination: | 17 December 2019 | ||||
DOI: | 10.25534/tuprints-00017210 | ||||
Abstract: | Today's robots are capable of performing many tasks that tremendously improve human lives. For instance, in industrial applications, robots move heavy parts very quickly and precisely along a predefined path. Robots are also widely used in agriculture or domestic applications like vacuum cleaning and lawn mowing. However, in more general settings, such as dynamic tasks, the gap between human abilities and what current robots deliver is still not bridged. Dynamic tasks, like table tennis with anthropomorphic robot arms, require the execution of fast motions that potentially harm the system. Optimizing for such fast motions and being able to execute them without impairing the robot still pose difficult challenges that, so far, have not been met. Humans perform dynamic tasks relatively easily at high levels of performance. Can we enable comparable perfection on kinematically anthropomorphic robots? This thesis investigates whether learning approaches on more human-like actuated robots bring the community a step closer towards this ambitious goal. Learning has the potential to alleviate control difficulties arising from fast motions and more complex robots. On the other hand, an essential part of learning is exploration, which forms a natural trade-off with robot safety, especially in dynamic tasks. This thesis's general theme is to show that more human-like actuation enables exploring and failing directly on the real system while attempting fast and risky motions. In the first part of this thesis, we develop a robotic arm with four degrees of freedom and eight pneumatic artificial muscles (PAM). Such a system is capable of replicating desired behaviors as seen in human arm motions: 1) high power-to-weight ratios, 2) inherent robustness due to passive compliance, and 3) high-speed catapult-like motions made possible by fast energy release.
Rather than recreating human anatomy, this system is designed to be simpler to control than previously designed pneumatic muscle robots. One of the main insights is that a simple PID controller is sufficient to control this system accurately for slow motions. When exploring fast movements directly on the real system, the antagonistic actuation avoids damage to the system. In this manner, the PID controller's parameters and additional feedforward terms can be tuned automatically using Bayesian optimization without further safety considerations. Having such a system and following our goal of showing the benefits of combining learning and muscular systems, the next part is concerned with learning a dynamics model and using it for control. In particular, the goal here is to learn a model purely from data, as analytical models of PAM-based robots are not sufficiently accurate. Nonlinearities, hysteresis effects, massive actuator delay, and unobservable dependencies like temperature make modeling such actuators especially hard. To address this issue, we learn probabilistic forward dynamics models using Gaussian processes (GPs) and subsequently employ them for control. However, Gaussian process dynamics models cannot be set up for our musculoskeletal robot in the same way as for traditional motor-driven robots because of, among other issues, unclear state composition. In this part of the thesis, we empirically study and discuss how to tune these approaches to complex musculoskeletal robots. For the control part, we introduce Variance Regularized Control (VRC), which tracks a desired trajectory using the learned probabilistic model. VRC incorporates the GP's variance prediction as a regularization term to optimize for actions that minimize the tracking error while staying in the training data's vicinity. In the third part of this thesis, we use the PAM-based robot to return and smash table tennis balls that have been shot by a ball launcher.
Rather than optimizing the desired trajectory and subsequently tracking it to hit the ball, we employ model-free Reinforcement Learning (RL) to learn this task from scratch. By using RL with our system, we can specify the table tennis task directly in the reward function. The RL agent also applies its actions directly to the low-level controls (equivalent to the air pressure space), while robot safety is assured by the antagonistic actuation. In this manner, we allow the RL agent to be applied to the real system in the same way as in simulation. Additionally, we make use of the robustness of PAM-driven robots by letting the training run for 1.5 million time steps (about 14 hours). We introduce a semi sim and real training procedure in order to avoid training with real balls. With this solution, we return 75% of all incoming balls to the opponent's side of the table without using real balls during training. We also learn to smash the ball with an average ball speed of 12 m/s after the hit (5 m/s for the return task) while sacrificing accuracy (return rate of 29%). In summary, we show that learning approaches to the control of muscular systems can lead to increased performance in dynamic tasks. In this thesis, we went through many aspects of robotics: We started by building a PAM-based robot and showed its robustness and inherent safety by tuning control parameters automatically with Bayesian optimization (BO). Also, we modeled the dynamics and used this model for control. In the last chapter, we additionally used our system for a precision-demanding task that had not been achieved before. Altogether, this thesis makes a step towards showing that good performance in dynamic tasks can be achieved because of, and not despite, PAM-driven robots. |
||||
Status: | Publisher's Version | ||||
URN: | urn:nbn:de:tuda-tuprints-172109 | ||||
Classification DDC: | 600 Technology, medicine, applied sciences > 600 Technology; 600 Technology, medicine, applied sciences > 620 Engineering and machine engineering |
||||
Divisions: | 20 Department of Computer Science > Intelligent Autonomous Systems | ||||
Date Deposited: | 11 Dec 2020 09:09 | ||||
Last Modified: | 12 Dec 2020 01:29 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/17210 | ||||
PPN: | 473866021 | ||||