Kuznetsov, Ilia (2022)
The Role of Linguistics in Probing Task Design.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00019851
Ph.D. Thesis, Primary publication, Publisher's Version
Text
Kuznetsov2021_PhD_LingP.pdf Copyright Information: CC BY-NC-ND 4.0 International - Creative Commons, Attribution NonCommercial, NoDerivs. Download (5MB) |
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | The Role of Linguistics in Probing Task Design | ||||
Language: | English | ||||
Referees: | Gurevych, Prof. Dr. Iryna ; Palmer, Prof. Dr. Martha ; Nivre, Prof. Dr. Joakim | ||||
Date: | 2022 | ||||
Place of Publication: | Darmstadt | ||||
Collation: | ix, 149 Seiten | ||||
Date of oral examination: | 20 October 2021 | ||||
DOI: | 10.26083/tuprints-00019851 | ||||
Abstract: | Over the past decades natural language processing has evolved from a niche research area into a fast-paced and multi-faceted discipline that attracts thousands of contributions from academia and industry and feeds into real-world applications. Despite the recent successes, natural language processing models still struggle to generalize across domains, suffer from biases and lack transparency. Aiming to get a better understanding of how and why modern NLP systems make their predictions for complex end tasks, a line of research in probing attempts to interpret the behavior of NLP models using basic probing tasks. Linguistic corpora are a natural source of such tasks, and linguistic phenomena like part of speech, syntax and role semantics are often used in probing studies. The goal of probing is to find out what information can be easily extracted from a pre-trained NLP model or representation. To ensure that the information is extracted from the NLP model and not learned during the probing study itself, probing models are kept as simple and transparent as possible, exposing and augmenting conceptual inconsistencies between NLP models and linguistic resources. In this thesis we investigate how linguistic conceptualization can affect probing models, setups and results. In Chapter 2 we investigate the gap between the targets of classical type-level word embedding models like word2vec, and the items of lexical resources and similarity benchmarks. We show that the lack of conceptual alignment between word embedding vocabularies and lexical resources penalizes the word embedding models in both benchmark-based and our novel resource-based evaluation scenario. We demonstrate that simple preprocessing techniques like lemmatization and POS tagging can partially mitigate the issue, leading to a better match between word embeddings and lexicons. Linguistics often has more than one way of describing a certain phenomenon. In Chapter 3 we conduct an extensive study of the effects of lingustic formalism on probing modern pre-trained contextualized encoders like BERT. We use role semantics as an excellent example of a data-rich multi-framework phenomenon. We show that the choice of linguistic formalism can affect the results of probing studies, and deliver additional insights on the impact of dataset size, domain, and task architecture on probing. Apart from mere labeling choices, linguistic theories might differ in the very way of conceptualizing the task. Whereas mainstream NLP has treated semantic roles as a categorical phenomenon, an alternative, prominence-based view opens new opportunities for probing. In Chapter 4 we investigate prominence-based probing models for role semantics, incl. semantic proto-roles and our novel regression-based role probe. Our results indicate that pre-trained language models like BERT might encode argument prominence. Finally, we propose an operationalization of thematic role hierarchy - a widely used linguistic tool to describe syntactic behavior of verbs, and show that thematic role hierarchies can be extracted from text corpora and transfer cross-lingually. The results of our work demonstrate the importance of linguistic conceptualization for probing studies, and highlight the dangers and the opportunities associated with using linguistics as a meta-langauge for NLP model interpretation. |
||||
Alternative Abstract: |
|
||||
Status: | Publisher's Version | ||||
URN: | urn:nbn:de:tuda-tuprints-198510 | ||||
Classification DDC: | 000 Generalities, computers, information > 004 Computer science 400 Language > 400 Language, linguistics |
||||
Divisions: | 20 Department of Computer Science > Ubiquitous Knowledge Processing | ||||
Date Deposited: | 26 Jan 2022 12:31 | ||||
Last Modified: | 26 Jan 2022 12:31 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/19851 | ||||
PPN: | 490532322 | ||||
Export: |
View Item |