Efficient Algorithms for Symmetry Detection

The use of symmetry dramatically impacts the efficiency of algorithms in various application areas. However, no matter how symmetries are used, there needs to be a way to obtain them efficiently. In practice, symmetries of combinatorial structures are usually computed by modeling said structure as a graph. The automorphisms of this graph then precisely model the symmetries of the combinatorial structure. A so-called practical graph isomorphism solver is then used to compute the automorphism group of the constructed graph. Practical graph isomorphism solvers have been developed for more than half a century. While all state-of-the-art solvers are based on the same principles, namely the so-called individualization-refinement paradigm, there are vast differences among them. The differences in the algorithms, in turn, lead to vast performance differences in practice. While practical graph isomorphism has been considered "solved" multiple times over the past few decades, no single algorithm can efficiently solve all the graphs stemming from the various application areas. Indeed, there are even some particular applications for which state-of-the-art implementations struggle to solve all the graphs efficiently. Ideally, one would like to have a singular algorithm that is fast on all graphs.

In this thesis, we describe the design of a new practical graph isomorphism solver called dejavu. Instead of trying to develop a better implementation ad hoc, we perform a theoretical analysis of essential algorithmic ingredients used in state-of-the-art algorithms. A central result is a theoretical model for the backtracking behavior of all state-of-the-art algorithms. Within this model, we prove that a Monte Carlo strategy is optimal in the worst-case up to logarithmic factors. A Monte Carlo strategy can use randomization and is allowed to err with bounded probability. In particular, we prove that randomized strategies outperform deterministic strategies. In theory, the Monte Carlo backtracking strategy outperforms all strategies currently used in practice in the worst-case. Further theoretical results include a characterization of the structure of the aforementioned backtracking trees and an analysis of design choices in the so-called color refinement algorithm, a crucial subroutine used by the solvers.

In turn, the design of dejavu is based on these theoretical results. In particular, the backtracking strategy of dejavu follows the near-optimal Monte Carlo strategy. The solver further contains various novel practical components. These components are carefully designed to complement the Monte Carlo strategy. Notably, one of the components is a preprocessor designed to shrink large and sparse inputs, which can also be used with all the other state-of-the-art solvers. Benchmarks on a vast library of graphs reveal that dejavu is faster than any other state-of-the-art solver on most tested graph classes.

Sprache

Englisch

Alternativtitel

Effiziente Algorithmen zur Symmetrieerkennung

Alternatives Abstract

Die Ausnutzung von Symmetrien hat einen großen Einfluss auf die Effizienz praktischer Algorithmen in vielen verschiedenen Bereichen. Egal wo und wie Symmetrien verwendet werden, wird eine effiziente Möglichkeit benötigt, um Symmetrien zunächst zu berechnen. Dafür werden kombinatorische Strukturen in der Praxis üblicherweise als Graphen modelliert. Die Automorphismengruppe dieses Modellgraphen entspricht dann den Symmetrien der kombinatorischen Struktur. Ein Graphisomorphie-Algorithmus wird anschließend verwendet, um die Automorphismengruppe des Modellgraphen zu berechnen. Praktische Graphisomorphie-Algorithmen werden seit über 50 Jahren entwickelt. Obwohl alle modernen praktische Algorithmen auf demselben Prinzip basieren, nämlich dem Individualization-Refinement (IR) Paradigma, gibt es dennoch große Unterschiede zwischen den Implementierungen. Dies wiederum führt zu großen Leistungsunterschieden zwischen den Algorithmen. Obwohl die praktische Graphisomorphie schon oft als "gelöst" bezeichnet wurde, ist in der Tat kein einziger praktischer Algorithmus in der Lage, alle Graphen aus den verschiedenen Anwendungsbereichen effizient zu lösen. Es gibt sogar einzelne Anwendungen, in denen es keiner der Algorithmen schafft, alle Graphen effizient zu lösen. Idealerweise gäbe es einen einzigen Algorithmus, der auf allen Graphen schnell ist.

Die vorliegende Arbeit beschäftigt sich mit dem Entwurf eines neuen Algorithmus, dejavu, für die Erkennung von Symmetrien auf Graphen. Anstatt ad-hoc nach einer besseren Implementierung zu suchen, führen wir eine theoretische Analyse der wesentlichen algorithmischen Bestandteile des IR Paradigmas durch. Ein zentrales Ergebnis ist ein theoretisches Modell für die Backtracking-Strategien moderner praktischer Algorithmen. Wir beweisen, dass ein bestimmter Monte-Carlo Algorithmus innerhalb des Modells optimal bis auf logarithmische Faktoren ist. Ein Monte-Carlo Algorithmus kann Zufall in Berechnungen miteinbeziehen, und darf mit einer begrenzten Wahrscheinlichkeit Fehler machen. Insbesondere beweisen wir, dass in unserem Modell probabilistische Strategien beweisbar besser sind als deterministische. Die Monte-Carlo Strategie ist asymptotisch schneller als alle anderen Strategien, die in der Praxis verwendet werden. Weitere theoretische Resultate beinhalten eine konstruktive Charakterisierung der Backtracking-Bäume, sowie eine Analyse von Design-Entscheidungen im sogenannten Color Refinement Algorithmus.

Das Design von dejavu basiert auf unseren neuen theoretischen Erkenntnissen. Insbesondere verwendet dejavu eine Monte-Carlo Backtracking-Strategie. Darüber hinaus besteht der Solver aus vielen weiteren praktischen Komponenten, die dafür entwickelt wurden, die Monte-Carlo Strategie zu unterstützen. Eine dieser Komponenten ist ein Preprocessor, der auch mit allen anderen modernen Implementierungen verwendet werden kann. Eine experimentelle Analyse auf einer weitreichenden Kollektion von Graphen demonstriert anschließend, dass dejavu auf der großen Mehrheit der Graphklassen signifikant schneller ist als alle anderen Implementierungen.

Fachbereich/-gebiet

04 Fachbereich Mathematik > Didaktik

DDC

500 Naturwissenschaften und Mathematik > 510 Mathematik

Institution

Technische Universität Darmstadt

Ort

Darmstadt

Datum der mündlichen Prüfung

17.09.2024

Gutachter:innen

Schweitzer, Pascal

Piperno, Adolfo

McKay, Brendan

Handelt es sich um eine kumulative Dissertation?

Nein

Name der Gradverleihenden Institution

Technische Universität Darmstadt

Ort der Gradverleihenden Institution

Darmstadt

PPN

523280211