Schramowski, Patrick (2023). Self-Supervised Learning of Machine Ethics. Ph.D. Thesis, Technische Universität Darmstadt. doi: 10.26083/tuprints-00023090

Full text: dissertation_schramowski.pdf (24 MB), licensed under CC BY-SA 4.0 International (Creative Commons Attribution-ShareAlike).
| Field | Value |
|---|---|
| Item Type | Ph.D. Thesis |
| Type of entry | Primary publication |
| Title | Self-Supervised Learning of Machine Ethics |
| Language | English |
| Referees | Kersting, Prof. Dr. Kristian; Fraser, Prof. Dr. Alexander M. |
| Date | 2023 |
| Place of Publication | Darmstadt |
| Collation | xxi, 208 pages |
| Date of oral examination | 20 March 2023 |
| DOI | 10.26083/tuprints-00023090 |
| Abstract | In recent years, Artificial Intelligence (AI), especially deep learning, has proven to be a technology driver in industry. However, while these systems advance existing technologies, create novel ones, automate processes, and assist humans in essential areas such as drug discovery, they also raise concerns, like other groundbreaking technologies before them. These concerns include, for instance, models producing stereotypical and derogatory content as well as gender and racial biases. Since AI technologies will permeate more of our lives in the coming years, these concerns need to be addressed. This thesis examines recent data-driven approaches, which often exhibit degenerate and biased behavior as a result of self-supervised training on large-scale, noisy web data containing potentially inappropriate content. While this is well established, we investigate and demonstrate the promise of the knowledge and capabilities that deep models acquire precisely through exposure to this very same potentially inappropriate data. Importantly, we present the first approaches for learning ethics from data. Our findings suggest that an AI system that learns an improved representation of data, and thereby becomes better at understanding and producing it, will in the process also acquire more accurate societal knowledge, in this case, the historical and cultural associations needed to make human-like "right" and "wrong" choices. Based on these findings, we then ask the arguably "circular" question of whether a machine can help us mitigate the very concerns it raises. We demonstrate the importance of such models' ability to distinguish between "right" and "wrong" and show how utilizing this ability can mitigate risks surrounding large-scale models themselves. We also highlight the role of human-machine interaction in exploring and reinforcing AI systems' properties, both their flaws and their merits, and present how human feedback on explanations can align deep learning based models with our precepts. We present these algorithms and the corresponding findings, providing important insights toward the goal of putting human values into AI systems, a goal that, in summary, may not be insurmountable in the long run. (An illustrative sketch of the core idea follows the table.) |
| Status | Publisher's Version |
| URN | urn:nbn:de:tuda-tuprints-230900 |
| Classification DDC | 000 Generalities, computers, information > 004 Computer science |
| Divisions | 20 Department of Computer Science > Artificial Intelligence and Machine Learning |
| Date Deposited | 24 May 2023 12:11 |
| Last Modified | 25 May 2023 06:52 |
| URI | https://tuprints.ulb.tu-darmstadt.de/id/eprint/23090 |
| PPN | 507940911 |
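The abstract's central idea, that self-supervised models trained on web text absorb societal "right"/"wrong" associations that can be probed, can be made concrete with a small experiment. The following is a minimal sketch only, not the thesis's actual method (which the record does not detail): it assumes a generic pre-trained sentence encoder (`all-MiniLM-L6-v2` via the `sentence-transformers` library) and hypothetical "should"/"should not" prompt templates; the `moral_score` helper and all prompts are illustrative choices, not anything from the thesis.

```python
# Illustrative sketch: probe a pre-trained sentence encoder for "right"/"wrong"
# associations by comparing a candidate action against affirming and negating
# phrasings in embedding space. Model and prompt templates are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

def moral_score(action: str) -> float:
    """Return a score in [-1, 1]; positive leans 'do', negative leans 'don't'."""
    prompts = [f"You should {action}.", f"You should not {action}."]
    do_vec, dont_vec = model.encode(prompts, normalize_embeddings=True)
    query_vec = model.encode(f"Is it okay to {action}?", normalize_embeddings=True)
    # With normalized embeddings, the dot product is cosine similarity:
    # closeness to the affirming phrasing minus closeness to the negating one.
    return float(np.dot(query_vec, do_vec) - np.dot(query_vec, dont_vec))

for action in ["help my neighbor", "steal money", "greet my friends"]:
    print(f"{action}: {moral_score(action):+.3f}")
```

Any signal such a probe recovers comes solely from associations the encoder picked up during self-supervised pre-training on web text, which is the point the abstract makes: the same noisy data that induces biased behavior also encodes the cultural associations one would need to separate "right" from "wrong".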