Hülsmann, Robert Alexander (2022)
Formal Falsification Criteria as a Basis for Behavior Planning based on Reinforcement Learning Algorithms.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00019116
Master Thesis, Primary publication, Publisher's Version
Text: MaTh_Huelsmann_CC-BY.pdf (3MB). Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.
Item Type: | Master Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | Formal Falsification Criteria as a Basis for Behavior Planning based on Reinforcement Learning Algorithms | ||||
Language: | English | ||||
Date: | 2022 | ||||
Place of Publication: | Darmstadt | ||||
Collation: | X, 76 pages | ||||
DOI: | 10.26083/tuprints-00019116 | ||||
Abstract: | To support compliance with behavioral rules by Autonomous Vehicles (AVs), especially in urban traffic, the Behavior-Semantic Scenery Description (BSSD) can be used to describe the limits of the legal behavioral space for each route segment of a road map. To test the applicability of BSSD to an online behavior planner, the task of this thesis was to convert selected route segments into the BSSD format, derive behavioral boundaries as formal falsification criteria from the BSSD specification, and subsequently use them in a Reinforcement Learning (RL) behavior planner. For this purpose, the criteria were first extracted from the specification, and their logical forms were identified and formalized. A training environment for the behavior planner, including a simulation of other Traffic Participants (TPs) and a visualization, was created, and the falsification criteria and the behavior planner were implemented to the extent possible within the scope of this thesis. Finally, the behavior planner was trained and evaluated. The formalization and implementation of the falsification criteria highlighted strengths and weaknesses in the machine interpretability of BSSD. Two of the six extracted criteria could not be fully formalized and implemented; for the remaining four, this was possible. The evaluation of the learned behavior model showed that, in the training environment, a vehicle controlled by the simple behavior planner reacts to changes in the maximum permitted speed and adjusts its speed promptly. Lane changes are also avoided, since they are prohibited or not possible in most places in the selected road section. For compliance with the other falsification criteria, improvements must be made to the behavior planner and to the size of the penalties for violating the behavioral limits and of the rewards for error-free progress. Some suggestions for this are given in the thesis. | ||||
Status: | Publisher's Version | ||||
URN: | urn:nbn:de:tuda-tuprints-191161 | ||||
Classification DDC: | 000 Generalities, computers, information > 004 Computer science; 600 Technology, medicine, applied sciences > 620 Engineering and machine engineering | ||||
Divisions: | 16 Department of Mechanical Engineering > Institute of Automotive Engineering (FZD) | ||||
Date Deposited: | 07 Jun 2022 12:03 | ||||
Last Modified: | 07 Jun 2022 12:04 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/19116 | ||||
PPN: | 496555030 | ||||