Nikolay, Fabio (2019)
Graph Learning Methods for Genetic Interaction Networks.
Technische Universität Darmstadt
doi: 10.25534/tuprints-00009634
Ph.D. Thesis, Primary publication
|
Text
2019-11-28_Nikolay_Fabio.pdf Copyright Information: CC BY-NC-ND 4.0 International - Creative Commons, Attribution NonCommercial, NoDerivs. Download (1MB) | Preview |
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | Graph Learning Methods for Genetic Interaction Networks | ||||
Language: | English | ||||
Referees: | Pesavento, Prof. Dr. Marius ; Bugallo, Prof. Dr. Monica | ||||
Date: | 2019 | ||||
Place of Publication: | Darmstadt | ||||
Date of oral examination: | 25 November 2019 | ||||
DOI: | 10.25534/tuprints-00009634 | ||||
Abstract: | In recent years, the field of biomedical, genetics and genomics research is undergoing a major change. Especially genomcis research is turning into a data-driven science, due to the rapid evolution of automatized data sensing technologies as well as data analysis methods. With the advent of new wet-lab technologies as the synthetic genetic array technique, the conduction of large-scale knockout experiments of thousands of genes has made a vast amount of data become available. With the knockout data at hand, a major challenge is the analysis of the available data which is commonly referred to as data-mining or knowledge-engineering. One important aspect of genomics research is the investigation of the influence of genetic interactions on the cell metabolism of organisms. For this purpose, the identification of genetic interactions with respect to a specific cell function is of high significance. Interactions among a collection of genes are well described by genetic interaction networks by means of directed graphs. In light of the importance of understanding the influence of genetic interactions for the cell metabolism, the problem of learning genetic interaction networks, which reflect the mutual genetic dependencies among a set of genes, has recently attracted much attention. In this dissertation, the focus lies on developing graph learning algorithms dedicated to the special structure of genetic interaction networks. The main contribution of this work is the formulation of the graph learning problem as integer linear programs that estimate the genetic interaction network underlying the knockout data. In particular, the two proposed integer linear program based formulations are of different accents, since the network topology is estimated in different representation domains. The two methods have the advantage over conventional graph learning methods like Graphical Lasso, that the estimates of both proposed integer linear program based algorithms are guaranteed by design to have the desired network structure, which is imposed by the specific biological interaction model that is under study in this thesis. Due to their intrinsic combinatorial nature, the proposed integer linear program based algorithms for graph learning cannot be applied to estimate large-scale genetic interaction networks. In order to compensate for this inability, a broader graph learning framework is presented which uses the proposed integer linear program based algorithms in an iterative fashion. Furthermore, ``of-the-shelf'' machine learning algorithms are customized to the graph learning problem. The proposed integer linear program based methods are evaluated in terms of synthetic data as well as real data and compared to state-of-the-art methods. It is observed that the proposed integer linear program based algorithms are superior to the considered state-of-the-art methods for both the synthetic data as well as the real data. Finally, the proposed integer linear program based algorithms are compared to selected ``of-the-shelf'' machine learning methods in terms of graph learning performance for synthetic data. Although the considered ``of-the-shelf'' machine learning methods cannot guarantee that their estimates are of the desired genetic interaction network structure, it is worth to mention that they yield a good tradeoff between estimation quality and the ability to estimate large-scale genetic interaction networks. |
||||
Alternative Abstract: |
|
||||
URN: | urn:nbn:de:tuda-tuprints-96343 | ||||
Classification DDC: | 600 Technology, medicine, applied sciences > 620 Engineering and machine engineering | ||||
Divisions: | 18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Communication Systems | ||||
Date Deposited: | 13 Dec 2019 13:29 | ||||
Last Modified: | 13 Dec 2019 13:29 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/9634 | ||||
PPN: | 457473827 | ||||
Export: |
View Item |