Miller, Tristan (2016)
Adjusting Sense Representations for Word Sense Disambiguation and Automatic Pun Interpretation.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication, Publisher's Version
Text: Miller_thesis_20160411_print.pdf (1MB). Copyright Information: CC BY-SA 3.0 Unported - Creative Commons, Attribution, ShareAlike.
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | Adjusting Sense Representations for Word Sense Disambiguation and Automatic Pun Interpretation | ||||
Language: | English | ||||
Referees: | Gurevych, Prof. Dr. Iryna ; Mihalcea, Prof. Dr. Rada ; Balke, Prof. Dr. Wolf-Tilo | ||||
Date: | 13 April 2016 | ||||
Place of Publication: | Darmstadt | ||||
Collation: | xv, 200 pages, ill. | ||||
Date of oral examination: | 22 March 2016 | ||||
Abstract: | Word sense disambiguation (WSD), the task of determining which meaning a word carries in a particular context, is a core research problem in computational linguistics. Though it has long been recognized that supervised (machine learning–based) approaches to WSD can yield impressive results, they require an amount of manually annotated training data that is often too expensive or impractical to obtain. This is a particular problem for under-resourced languages and domains, and is also a hurdle in well-resourced languages when processing the sort of lexical-semantic anomalies employed for deliberate effect in humour and wordplay. In contrast to supervised systems are knowledge-based techniques, which rely only on pre-existing lexical-semantic resources (LSRs). These techniques are of more general applicability but tend to suffer from lower performance due to the informational gap between the target word's context and the sense descriptions provided by the LSR. This dissertation is concerned with extending the efficacy and applicability of knowledge-based word sense disambiguation. First, we investigate two approaches for bridging the information gap and thereby improving the performance of knowledge-based WSD. In the first approach we supplement the word's context and the LSR's sense descriptions with entries from a distributional thesaurus. The second approach enriches an LSR's sense information by aligning it to other, complementary LSRs. Our next main contribution is to adapt techniques from word sense disambiguation to a novel task: the interpretation of puns. Traditional NLP applications, including WSD, usually treat the source text as carrying a single meaning, and therefore cannot cope with the intentionally ambiguous constructions found in humour and wordplay. We describe how algorithms and evaluation methodologies from traditional word sense disambiguation can be adapted for the "disambiguation" of puns, or rather for the identification of their double meanings. Finally, we cover the design and construction of technological and linguistic resources aimed at supporting the research and application of word sense disambiguation. Development and comparison of WSD systems has long been hampered by a lack of standardized data formats, language resources, software components, and workflows. To address this issue, we designed and implemented a modular, extensible framework for WSD. It implements, encapsulates, and aggregates reusable, interoperable components using UIMA, an industry-standard information processing architecture. We have also produced two large sense-annotated data sets for under-resourced languages or domains: one of these targets German-language text, and the other English-language puns. | ||||
Uncontrolled Keywords: | word sense disambiguation, puns, word sense alignment, distributional similarity, natural language processing | ||||
Status: | Publisher's Version | ||||
URN: | urn:nbn:de:tuda-tuprints-54002 | ||||
Classification DDC: | 000 Generalities, computers, information > 004 Computer science; 400 Language > 400 Language, linguistics | ||||
Divisions: | 20 Department of Computer Science > Ubiquitous Knowledge Processing | ||||
Date Deposited: | 13 Apr 2016 08:44 | ||||
Last Modified: | 08 Aug 2024 06:22 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/5400 | ||||
PPN: | 386814201 | ||||