Graphenrekonstruktion anhand abhängiger Zeitreihen in biologischen Netzwerken

Boba, Patrick (2019)
Graphenrekonstruktion anhand abhängiger Zeitreihen in biologischen Netzwerken.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication

boba_diss.pdf - Accepted Version
Copyright Information: CC BY-SA 4.0 International - Creative Commons, Attribution ShareAlike.

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

Graphenrekonstruktion anhand abhängiger Zeitreihen in biologischen Netzwerken

Language:

German

Referees:

Hamacher, Prof. Dr. Kay ; Thiel, Prof. Dr. Gerhard ; Stein, Prof. Dr. Viktor ; Dreizler, Prof. Dr. Andreas

Date:

29 September 2019

Place of Publication:

Darmstadt

Date of oral examination:

5 September 2019

Abstract:

Biology is concerned with the structure and organization of living organisms. In both respects, phenomena that can be interpreted as networks appear at various levels of abstraction. A macroscopic example is predator-prey relationships (e.g. the size of a fox population as a function of its prey, such as rabbits and chickens). It is easy to see that the population sizes depend on one another and reflect a mutual dynamic. At the molecular level there are likewise interactions that can be described by a dynamic network, for instance in cellular processes. One example is the catalysis of a chemical reaction by an enzyme: the concentrations of the enzyme and of the substances involved influence the rate at which the metabolic process proceeds. This thesis deals with this (macro)molecular level.

The importance of a functioning network becomes clear when one considers a disturbed system, for example when introduced species throw an ecosystem out of balance. A current example is the American calico crayfish (Orconectes immunis), which is spreading rapidly in Europe because it has no natural enemies there. At the same time, its consumption of resources threatens animals such as dragonflies, amphibians and native crayfish. At the cellular level, a disruption of the network of DNA repair and cell cycle control can lead to the development of cancer. DNA repair is a complex system of various proteins and DNA, and the failure of one component can have devastating consequences for the repair process. Understanding the dynamics of such systems is therefore essential for analysing and predicting their state. In the two examples mentioned, it can help to better predict the development of cancer or to protect endangered animal and plant species.

Using networks that represent the interaction of proteins, DNA and RNA, the aim of this thesis is to detect the measurable information flow between the elements involved and to use it to reconstruct the structure of the network. To this end, the time series of the individual nodes are related to one another by means of various statistical and information-theoretic measures. In selecting the measures I draw both on classical statistical measures (e.g. correlation coefficients) and on information-theoretic methods (based on Shannon entropy), which have become more popular in biology in recent years. The methods are compared on several example systems, which I have divided into three categories. Common to all examples is simulation over time, in order to represent a dynamic, changing system. By measuring the relationship between the individual nodes over time, the topology of the underlying network is then to be inferred. The first category comprises a simple system of differential equations that couples two feedback loops. The parameterization of the network produces a stable oscillation of the two loops around their respective mean values. In the next category, two types of random graphs are generated. The first is created by an algorithm I designed, which generates a given number of nodes that are connected by a given number of incoming and outgoing edges. The second type is a so-called scale-free network. This network topology can be found in many systems, including biological networks as well as digital social networks. In the last category I apply the methods to various examples from the BioModels Database. This database lends itself to the purpose because of its extensive datasets and contains many biochemical networks, e.g. protein-protein interaction, protein-RNA interaction, etc.

Finally, I discuss the results presented and give an outlook on how these approaches can be pursued and extended. In addition, in the course of this work I developed various software tools, or supervised student projects developing them, that were important for carrying out the analyses shown here. These are discussed in a separate section.

Alternative Abstract (English):

Biology deals with the structure and organization of living things. In both aspects, phenomena that can be interpreted as networks can be found at different levels of abstraction. A macroscopic example is predator-prey relationships (e.g. the size of a fox population depending on its prey, such as rabbits, chickens, etc.). It is easy to see that the sizes of these populations depend on each other and reflect a mutual dynamic. At the molecular level, there are also examples of interactions that can be described via a dynamic network, such as cellular processes. An example is the catalysis of a chemical reaction by an enzyme. The concentrations of the enzyme and of the substances involved influence the speed at which the metabolic process takes place. This thesis deals with this (macro)molecular level.

How important a functioning network is becomes apparent when analyzing a disturbed system, as is the case when foreign species are introduced into an ecosystem. A recent example is the American crayfish Orconectes immunis, which is currently spreading fast throughout Europe owing to its lack of natural predators. At the same time, it threatens local animal species such as dragonflies, amphibians and native crustaceans by competing for resources. At the cellular level, a disorder of the DNA repair network and of cell cycle control can lead to the development of cancer. DNA repair is a complex system that involves various proteins and DNA. A failure of one of the components of this system can have devastating consequences for the repair process. It becomes clear how important it is to understand the dynamics of these systems: predictions of their state and dynamics can help to protect endangered species in the former case and to fight cancer in the latter.

The goal of this work is to reconstruct network topologies by measuring the information flow between the elements involved in the network, such as proteins, DNA or RNA. For this purpose, the time series of each node (e.g. a protein) is related to those of the other nodes using various statistical and information-theoretical measures. The methods reviewed comprise both classical statistical measures (e.g. correlation coefficients) and measures that have become popular in biology more recently, such as information-theoretical methods based on Shannon entropy. The comparison of these methods is based on several example models. Common to all these systems is simulation over time, to depict a dynamically changing system. Based on the measured relationship between the individual nodes over time, the idea is to infer, in reverse, the topology of the underlying network. The simplest of these examples is a system of differential equations that couples two feedback loops. The parameterization of the network causes a stable oscillation of the two loops around their respective mean values. The next category contains two different types of random graphs. The first type is created by an algorithm I designed, which randomly generates a number of nodes with a given in- and out-degree without leaving individual nodes isolated. The second type is a so-called scale-free network. This network topology can be found in many types of systems, such as biological networks or social networks. As a final step, I apply the above-mentioned methods to various examples from the BioModels Database. It contains a large number of biochemical datasets, e.g. networks of protein-protein interaction, protein-RNA interaction, etc.

Finally, I discuss the results presented and give an outlook on the opportunities to pursue and further develop these approaches. Additionally, I describe various software tools that I either developed in the course of this work or whose development in student projects I supervised. These tools were crucial to carrying out the analyses shown in this thesis and are discussed in a separate section.
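
The pairwise comparison of node time series described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the estimators or thresholds used, so the histogram-based mutual information estimate, the function names `mutual_information` and `reconstruct_edges`, the `threshold` parameter and the toy data are all assumptions made for illustration. Because mutual information is symmetric, this sketch can only recover an undirected skeleton of a network.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of the Shannon mutual information I(X;Y) in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nz = pxy > 0                            # skip empty cells (0 * log 0 := 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def reconstruct_edges(series, threshold, bins=8):
    """Return a symmetric boolean adjacency matrix.

    series: array of shape (n_nodes, n_timepoints).
    An undirected edge i-j is reported when the estimated mutual
    information of the two time series reaches `threshold` (hypothetical
    cut-off, to be chosen per data set).
    """
    n = series.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(series[i], series[j], bins) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj


# Toy data (hypothetical): nodes 0 and 1 carry the same signal,
# node 2 is statistically independent of both.
a = np.repeat(np.arange(4.0), 4)
b = np.tile(np.arange(4.0), 4)
adj = reconstruct_edges(np.array([a, a, b]), threshold=1.0, bins=4)
print(adj.astype(int))
```

In this toy case only the edge between nodes 0 and 1 is reported. A correlation coefficient could be substituted for `mutual_information` to mirror the classical statistical measures mentioned in the abstract; recovering edge direction would require an asymmetric or time-lagged measure.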

URN:

urn:nbn:de:tuda-tuprints-91284

Classification DDC:

500 Science and mathematics > 570 Life sciences, biology

Divisions:

10 Department of Biology > Computational Biology and Simulation

Date Deposited:

28 Oct 2019 14:00

Last Modified:

09 Jul 2020 02:46

URI: