
An Algorithm for the Detection of Hidden Propaganda in Mixed-Code Text over the Internet

Tundis, Andrea ; Mukherjee, Gaurav ; Mühlhäuser, Max (2021)
An Algorithm for the Detection of Hidden Propaganda in Mixed-Code Text over the Internet.
In: Applied Sciences, 2021, 11 (5)
doi: 10.26083/tuprints-00019316
Article, Secondary publication, Publisher's Version

Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.

Item Type: Article
Type of entry: Secondary publication
Title: An Algorithm for the Detection of Hidden Propaganda in Mixed-Code Text over the Internet
Language: English
Date: 24 August 2021
Place of Publication: Darmstadt
Year of primary publication: 2021
Publisher: MDPI
Journal or Publication Title: Applied Sciences
Volume of the journal: 11
Issue Number: 5
Collation: 22 pages
DOI: 10.26083/tuprints-00019316
Origin: Secondary publication via sponsored Golden Open Access

Internet-based communication systems have become an increasingly common tool for spreading misinformation and propaganda. Although mechanisms exist to track unwarranted information and messages, users have devised various ways to evade their scrutiny and detection. One example is mixed-code language, that is, text written in an unconventional form by combining different languages, symbols, scripts and shapes. By substituting special characters for alphabet letters, it takes on a custom and ever-changing appearance that makes specific content harder to detect. Such combinations of symbols, which try to resemble the shape of the intended letter, keep the text intuitively readable to humans while rendering it nonsensical to machines. In this context, the paper explores the possibility of identifying propaganda in such mixed-code texts over the Internet using a machine-learning-based approach. In particular, an algorithm combined with deep learning models for character identification is proposed in order to detect and analyse whether an element contains propaganda-related content. The overall approach is presented, the results gathered from its experimentation are discussed, and the achieved performance is compared with related work.
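
To make the idea of mixed-code text concrete, the following minimal Python sketch maps look-alike symbols back to plain letters before a keyword check. It is only an illustration: the paper proposes deep learning models for character identification, whereas this sketch swaps in a fixed lookup table. The table `LOOKALIKES` and the helper names `normalize_mixed_code` and `contains_keyword` are hypothetical and not taken from the paper.

```python
import unicodedata

# Hypothetical substitution table: symbols that visually resemble Latin letters.
# These pairs are illustrative assumptions, not taken from the paper.
LOOKALIKES = {
    "@": "a",
    "4": "a",
    "3": "e",
    "1": "l",
    "!": "i",
    "0": "o",
    "$": "s",
    "5": "s",
    "7": "t",
}


def normalize_mixed_code(text: str) -> str:
    """Map look-alike symbols back to plain lowercase Latin letters."""
    # NFKD decomposition splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that e.g. "ö" becomes "o".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace every symbol found in the table; leave all other characters as-is.
    return "".join(LOOKALIKES.get(ch, ch) for ch in stripped.lower())


def contains_keyword(text: str, keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the normalized text."""
    normalized = normalize_mixed_code(text)
    return any(keyword in normalized for keyword in keywords)


if __name__ == "__main__":
    print(normalize_mixed_code("pr0p@g4nd4"))
    print(contains_keyword("Pröp4g@ndá", ["propaganda"]))
```

A fixed table like this is easy to evade and also rewrites legitimate digits and punctuation (a year such as "2021" would be altered), which illustrates why the ever-changing appearance of mixed-code text motivates a learned, shape-based character identifier rather than a static mapping.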

Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-193167
Classification DDC: 000 Generalities, computers, information > 004 Computer science
Divisions: 20 Department of Computer Science > Telecooperation
Date Deposited: 24 Aug 2021 07:30
Last Modified: 14 Nov 2023 19:03
URI: https://tuprints.ulb.tu-darmstadt.de/id/eprint/19316