
Evaluation of whole graph embedding techniques for a clustering task in the manufacturing domain

Iskandar, Yusef (2022)
Evaluation of whole graph embedding techniques for a clustering task in the manufacturing domain.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00022340
Master Thesis, Primary publication, Publisher's Version

tuprint.pdf
Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.

Item Type: Master Thesis
Type of entry: Primary publication
Title: Evaluation of whole graph embedding techniques for a clustering task in the manufacturing domain
Language: English
Referees: Metternich, Prof. Dr. Joachim ; Bretones Cassoli, M. Sc. Beatriz
Date: 2022
Place of Publication: Darmstadt
Collation: 83, V pages
DOI: 10.26083/tuprints-00022340
Abstract:

Production systems in manufacturing consume and generate data. Representing the relationships between subsystems and their associated data is complex, but it is well suited to Knowledge Graphs (KGs), which make it possible to visualize the relationships between subsystems and to store their measurement data. In this work, KGs serve as a feature engineering technique for a clustering task: each KG is mapped into Euclidean space by a so-called graph embedding, and the resulting vectors are used as input to a clustering algorithm. The Python library Karate Club provides 10 different techniques for embedding whole graphs, i.e., techniques that generate a single vector for each graph. These techniques have been successfully tested on benchmark datasets that include social media platforms and chemical or biochemical structures. This work presents the potential of graph embeddings for clustering tasks in the manufacturing domain by modifying Karate Club’s techniques and evaluating them on a manufacturing dataset. First, an introduction to graph theory is given and the state of the art in whole graph embedding techniques is explained. Second, the Bosch production line dataset is examined with an Exploratory Data Analysis (EDA), and a graph data model for directed and undirected graphs is defined based on the results. Third, a data processing pipeline is developed to generate graph embeddings from the raw data. Finally, the graph embeddings are used as input to a clustering algorithm, and the performance of the techniques is compared quantitatively.
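
The embed-then-cluster idea in the abstract can be illustrated with a minimal, standard-library-only sketch. It does not use Karate Club and does not reproduce any of its techniques or the thesis pipeline: the degree-histogram embedding is a deliberately simple stand-in for a whole graph embedding, and the toy graphs, the function names (`degree_histogram_embedding`, `kmeans`), and the choice of k-means are all assumptions made for illustration.

```python
from collections import Counter


def degree_histogram_embedding(edges, num_nodes, max_degree):
    """Map one undirected graph to one fixed-length vector.

    Stand-in whole graph embedding (an assumption, not a Karate Club
    technique): the normalized histogram of node degrees.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    hist = [0.0] * (max_degree + 1)
    for node in range(num_nodes):
        # Degrees above max_degree are clipped into the last bin.
        hist[min(degree[node], max_degree)] += 1.0
    return [count / num_nodes for count in hist]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy graphs as (edge list, node count): two paths, two stars.
graphs = [
    ([(0, 1), (1, 2), (2, 3), (3, 4)], 5),  # path with 5 nodes
    ([(0, 1), (0, 2), (0, 3), (0, 4)], 5),  # star with 5 nodes
    ([(0, 1), (1, 2), (2, 3)], 4),          # path with 4 nodes
    ([(0, 1), (0, 2), (0, 3)], 4),          # star with 4 nodes
]

# One vector per graph, then cluster the vectors.
embeddings = [degree_histogram_embedding(e, n, max_degree=4) for e, n in graphs]
labels = kmeans(embeddings, k=2)
print(labels)
```

On these toy graphs the two paths end up in one cluster and the two stars in the other. The property the sketch shares with whole graph embeddings as described above is that every graph, regardless of its size, is reduced to a single vector of the same length, so a standard clustering algorithm can consume the result directly.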

Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-223405
Classification DDC: 000 Generalities, computers, information > 004 Computer science
500 Science and mathematics > 510 Mathematics
600 Technology, medicine, applied sciences > 620 Engineering and machine engineering
Divisions: 16 Department of Mechanical Engineering > Institute of Production Technology and Machine Tools (PTW)
16 Department of Mechanical Engineering > Institute of Production Technology and Machine Tools (PTW) > Management of Industrial Production
Date Deposited: 20 Dec 2022 12:52
Last Modified: 23 Dec 2022 07:06
URI: https://tuprints.ulb.tu-darmstadt.de/id/eprint/22340
PPN: 503107670