Volkanovska, Elena ; Tan, Sherry ; Duan, Changxu ; Bartsch, Sabine ; Stille, Wolfgang (2025)
The InsightsNet Climate Change Corpus (ICCC) : Compiling a Multimodal Corpus of Discourses in a Multi-Disciplinary Domain.
In: Datenbank-Spektrum : Zeitschrift für Datenbanktechnologien und Information Retrieval, 2023, 23 (3)
doi: 10.26083/tuprints-00028356
Article, Secondary publication, Publisher's Version
Text: s13222-023-00454-1.pdf (1MB). Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.
Item Type: | Article |
---|---|
Type of entry: | Secondary publication |
Title: | The InsightsNet Climate Change Corpus (ICCC) : Compiling a Multimodal Corpus of Discourses in a Multi-Disciplinary Domain |
Language: | English |
Date: | 16 January 2025 |
Place of Publication: | Darmstadt |
Year of primary publication: | November 2023 |
Place of primary publication: | Berlin ; Heidelberg |
Publisher: | Springer |
Journal or Publication Title: | Datenbank-Spektrum : Zeitschrift für Datenbanktechnologien und Information Retrieval |
Volume of the journal: | 23 |
Issue Number: | 3 |
DOI: | 10.26083/tuprints-00028356 |
Origin: | Secondary publication DeepGreen |
Abstract: | The discourse on climate change has become a centerpiece of public debate, thereby creating a pressing need to analyze the multitude of messages created by the participants in this communication process. In addition to text, information on this topic is conveyed multimodally, through images, videos, tables and other data objects that are embedded within documents and accompany the text. This paper presents the process of building a multimodal pilot corpus for the InsightsNet Climate Change Corpus (ICCC) and using natural language processing (NLP) tools to enrich corpus (meta)data, thus creating a dataset that lends itself to the exploration of the interplay between the various modalities that constitute the discourse on climate change. We demonstrate how the pilot corpus can be queried for relevant information in two types of databases, and how the proposed data model promotes a more comprehensive sentiment analysis approach. |
Uncontrolled Keywords: | Corpus, Climate change, Computational linguistics, Annotation, Metadata |
Status: | Publisher's Version |
URN: | urn:nbn:de:tuda-tuprints-283563 |
Classification DDC: | 000 Generalities, computers, information > 004 Computer science; 400 Language > 400 Language, linguistics; 400 Language > 420 English |
Divisions: | 02 Department of History and Social Science > Institut für Sprach- und Literaturwissenschaft > Corpus- und Computerlinguistik, Englische Philologie; Zentrale Einrichtungen > hessian.AI - The Hessian Center for Artificial Intelligence |
Date Deposited: | 16 Jan 2025 13:44 |
Last Modified: | 16 Jan 2025 13:44 |
SWORD Depositor: | Deep Green |
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/28356 |