Tanneberg, Daniel (2020)
Understand-Compute-Adapt: Neural Networks for Intelligent Agents.
Technische Universität Darmstadt
doi: 10.25534/tuprints-00017234
Ph.D. Thesis, Primary publication, Publisher's Version
Text: thesis_tanneberg_final_10-12-20_tuprints_small.pdf (11 MB)
Copyright Information: CC BY-NC-SA 4.0 International, Creative Commons Attribution-NonCommercial-ShareAlike
| Item Type: | Ph.D. Thesis |
|---|---|
| Type of entry: | Primary publication |
| Title: | Understand-Compute-Adapt: Neural Networks for Intelligent Agents |
| Language: | English |
| Referees: | Peters, Prof. Dr. Jan ; Rückert, Prof. Dr. Elmar ; Riedmiller, Prof. Dr. Martin |
| Date: | 2020 |
| Place of Publication: | Darmstadt |
| Collation: | xii, 106 pages |
| Date of oral examination: | 3 December 2020 |
| DOI: | 10.25534/tuprints-00017234 |
Abstract:

An artificial intelligent agent needs to be equipped with a multitude of abilities in order to interact with the world around us. These requirements for intelligent behaviour can roughly be separated into two main categories: cognitive abilities and physical skills. Cognitive abilities refer to cognition and problem solving, whereas physical skills correspond to the movements of an intelligent robot in the real world. In this thesis, we investigate three research questions tackling these different abilities. Specifically: How can new knowledge be taught to a robot in a natural way? How can neural networks learn abstract solution strategies that are independent of task complexity, data representation and task domain? How can a robot efficiently adapt its movement during execution with a bio-inspired stochastic neural network? These questions span core requirements for intelligent autonomous agents, which we categorize as Understand-Compute-Adapt (UCA), in the style of the classical Sense-Plan-Act framework in robotics. To answer these questions, we investigate neural-network-based models of these cognitive and physical abilities.

The first question tackles the ability of cognition, which refers to an understanding of the world, and is investigated by learning a set of skills from unlabelled demonstrations of full task executions. To this end, we studied the task of trajectory segmentation and skill-library learning. To provide a natural interface for teaching a robot new tasks, it is desirable to have the user only demonstrate the desired task, without worrying about all the skills that the task requires and without manually annotating the demonstrations. Such an interface not only enables non-experts to teach robots, but also makes teaching cheaper, as demonstrating every individual skill, or segmenting and labelling demonstrations by hand, is time-consuming and expensive.
The approach proposed here learns to segment trajectories and learns the required skill library simultaneously from unlabelled demonstrations. In addition to this segmentation and skill discovery, the approach also learns the relations between individual skills, i.e., it models how likely a certain skill is to follow another. This additional knowledge, or understanding, can be used, for example, in human-robot interaction scenarios to predict human behaviour, and therefore enables more intelligent adaptive behaviour of the robot. The approach was successfully evaluated on multiple trajectory datasets of varying complexity.

The second required cognitive ability, problem solving, refers to the second question and the Compute step. In particular, we investigated the challenge of learning algorithmic solutions, i.e., learning abstract strategies that can easily be transferred to unfamiliar problem instantiations. This transfer of abstract knowledge and solution strategies to novel domains is another crucial feature of intelligent behaviour. We therefore investigated the learning of algorithmic solutions characterized by three requirements that highlight the abstract nature of the solution: scaling to arbitrary task configurations and complexities, independence of the data representation, and independence of the task domain. For this purpose we developed a novel framework, the Neural Harvard Computer, which is based on memory-augmented neural networks and whose modular design is inspired by the von Neumann and Harvard architectures of modern computers. This framework enables the learning of abstract algorithmic solutions through its modular design and the separation of information flow into data and control signals. The algorithmic solution is learned in a reinforcement learning setting and operates solely on the control signal flow, which makes it independent of the data representation and task domain.
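The control/data separation can be illustrated with a minimal toy sketch (not the thesis implementation; the example is purely illustrative): an abstract strategy expressed only as control operations, here reversing a sequence via a stack, works unchanged for any data representation, because the data items are merely routed, never interpreted.

```python
def reverse_via_stack(sequence):
    """Abstract control program: PUSH each item, then POP each item.
    The items themselves are treated as opaque data signals, so the
    same control flow works regardless of the data representation."""
    stack = []
    for item in sequence:                    # control signal: PUSH
        stack.append(item)
    return [stack.pop() for _ in sequence]   # control signal: POP

# The identical control program handles numbers, strings, or any domain:
reverse_via_stack([1, 2, 3])          # -> [3, 2, 1]
reverse_via_stack(["a", "b", "c"])    # -> ["c", "b", "a"]
```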
We evaluated the framework's generalization and abstraction capabilities by learning 11 different algorithms. The approach reliably learned algorithmic solutions with perfect generalization and abstraction, allowing it to solve problems with complexities far beyond those seen during training and to transfer straightforwardly to novel task representations and domains.

Ultimately, an intelligent robot has to interact with the real world, giving rise to the third entry, Adapt, and the question of efficient online adaptation. To cope with the complex, dynamic and often unstructured real world, and to deal with other agents and humans, the agent has to be able to adapt its models and movements while interacting. This online adaptation belongs to the physical skills required for intelligent behaviour. Moreover, it has to be efficient in terms of the number of physical interactions and task-independent, as not every situation can be foreseen when constructing the agent or the method. In this thesis, we studied online adaptation within a bio-inspired spiking neural network that generates movements by simulating its inherent dynamics. The underlying stochastically spiking neurons mimic the behaviour of hippocampal place cells, and their decoded activity represents the planned movement. Task-independent adaptation is achieved by using intrinsic motivation signals inspired by cognitive dissonance to guide the learning. These signals capture the discrepancy between the agent's expectation of the world (the current model) and its observations of the world, and the online adaptation is triggered and steered by this mismatch. Sample efficiency is achieved with a mental replay strategy that intensifies experienced situations, implemented using the inherent stochasticity of the framework.
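The adaptation scheme can be sketched in drastically simplified form (a linear forward model standing in for the spiking network; the function names, threshold and learning rate are illustrative assumptions, not the thesis method): a prediction-observation mismatch triggers the update, and mental replay repeats the update on the same experience to improve sample efficiency.

```python
import numpy as np

def adapt_online(w, x, y_observed, lr=0.1, threshold=0.05, replays=5):
    """Sketch of mismatch-gated online adaptation with mental replay.
    w is a forward model, here simply linear: y_pred = w @ x."""
    mismatch = np.linalg.norm(w @ x - y_observed)  # "cognitive dissonance"
    if mismatch > threshold:       # adaptation is triggered by the mismatch
        for _ in range(replays):   # mental replay of one real interaction
            grad = np.outer(w @ x - y_observed, x)
            w = w - lr * grad      # gradient step on the prediction error
    return w
```

With `replays > 1`, a single physical interaction yields several model updates, which is the sample-efficiency mechanism in simplified form; when the prediction already matches the observation, no adaptation is performed.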
We evaluated this framework for online model adaptation and movement generation on an anthropomorphic KUKA LWR arm, where the robot has to adapt to unknown obstacles while performing a waypoint-following task. The online adaptation happens within seconds and from few physical interactions, while the robot keeps interacting with the environment.

In summary, this thesis investigates three key aspects of intelligent behaviour with respect to cognitive and physical abilities. In more detail, we investigated how neural-network-based models can be used, from learning to understand, through learning to compute, to learning to adapt, to tackle the three research questions raised. Each topic places its own requirements on the neural network model and the learning mechanism used. This modularity and diversity of subroutines is a crucial aspect of creating artificial intelligence.
| Status: | Publisher's Version |
|---|---|
| URN: | urn:nbn:de:tuda-tuprints-172343 |
| Classification DDC: | 000 Generalities, computers, information > 004 Computer science |
| Divisions: | 20 Department of Computer Science > Intelligent Autonomous Systems |
| Date Deposited: | 23 Dec 2020 08:02 |
| Last Modified: | 31 May 2023 13:48 |
| URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/17234 |
| PPN: | 474417565 |