Nandy, Preetam ; Unger, Michael ; Zechner, Christoph ; Dey, Kushal K. ; Koeppl, Heinz (2024)
Learning diagnostic signatures from microarray data using L1-regularized logistic regression.
In: Systems Biomedicine, 2013, 1 (4)
doi: 10.26083/tuprints-00027017
Article, Secondary publication, Publisher's Version
Text: Learning diagnostic signatures from microarray data using L1-regularized logistic regression.pdf
Copyright Information: CC BY-NC 3.0 Unported - Creative Commons, Attribution, NonCommercial.
Item Type: | Article |
---|---|
Type of entry: | Secondary publication |
Title: | Learning diagnostic signatures from microarray data using L1-regularized logistic regression |
Language: | English |
Date: | 22 April 2024 |
Place of Publication: | Darmstadt |
Year of primary publication: | 2013 |
Place of primary publication: | Austin, Tx. |
Publisher: | Taylor & Francis |
Journal or Publication Title: | Systems Biomedicine |
Volume of the journal: | 1 |
Issue Number: | 4 |
DOI: | 10.26083/tuprints-00027017 |
Origin: | Secondary publication service |
Abstract: | Making reliable diagnoses and predictions based on high-throughput transcriptional data has attracted immense attention in the past few years. While experimental gene profiling techniques, such as microarray platforms, are advancing rapidly, there is an increasing demand for computational methods that can handle such data efficiently. In this work we propose a computational workflow for extracting diagnostic gene signatures from high-throughput transcriptional profiling data. In particular, our research was performed within the scope of the first IMPROVER challenge. The goal of that challenge was to extract and verify diagnostic signatures based on microarray gene expression data in four different disease areas: psoriasis, multiple sclerosis, chronic obstructive pulmonary disease and lung cancer. Each disease area is handled using the same three-stage algorithm. First, the data are normalized using a robust multi-array average (RMA) normalization procedure to account for variability among different samples and data sets. Second, because of the vast dimensionality of the profiling data, we pre-select features using the Wilcoxon rank-sum statistic. Third, the remaining features are used to train an L1-regularized logistic regression model, which acts as our primary classifier. Using the four different data sets, we analyze the proposed method and demonstrate its use in extracting diagnostic signatures from microarray gene expression data. |
Uncontrolled Keywords: | classification, gene expression, L1-regularization, LASSO, logistic regression, microarray data, RMA normalization, Wilcoxon rank sum test |
Status: | Publisher's Version |
URN: | urn:nbn:de:tuda-tuprints-270174 |
Classification DDC: | 600 Technology, medicine, applied sciences > 610 Medicine and health; 600 Technology, medicine, applied sciences > 621.3 Electrical engineering, electronics |
Date Deposited: | 22 Apr 2024 09:49 |
Last Modified: | 12 Aug 2024 08:07 |
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/27017 |
PPN: | 52056412X |
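The second and third stages described in the abstract (rank-sum feature pre-selection followed by an L1-regularized logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the expression values are already RMA-normalized, uses synthetic data in place of microarray measurements, and fits the model with a simple proximal-gradient loop chosen here only for brevity. All function names and the values of `k`, `lam`, `step` and `iters` are hypothetical. The sketch is kept dependency-free, and in practice established statistics and machine-learning libraries would normally replace these hand-written helpers.

```python
import math
import random


def rank_sum_z(pos, neg):
    """Wilcoxon rank-sum z-score of `pos` vs `neg` (normal approximation)."""
    pooled = sorted([(v, 1) for v in pos] + [(v, 0) for v in neg])
    n1, n2 = len(pos), len(neg)
    w, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # tied values share the average rank
        w += avg_rank * sum(label for _, label in pooled[i:j])
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    return (w - mean) / sd


def preselect(X, y, k):
    """Return (sorted) indices of the k features with the largest |z|."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lbl in zip(X, y) if lbl == 1]
        neg = [row[j] for row, lbl in zip(X, y) if lbl == 0]
        scores.append((abs(rank_sum_z(pos, neg)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|; shrinks v toward zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def fit_l1_logistic(X, y, lam=0.05, step=0.1, iters=2000):
    """Minimize mean logistic loss + lam * ||w||_1 (intercept unpenalized)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        resid = [pr - lbl for pr, lbl in zip(predict_proba(X, w, b), y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n
                  for j in range(p)]
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_data(n=40, p=30, seed=0):
    """Synthetic stand-in for normalized expression data (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        lbl = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += 3.0 * lbl  # feature 0 is higher in class 1
        row[1] -= 3.0 * lbl  # feature 1 is lower in class 1
        X.append(row)
        y.append(lbl)
    return X, y


X, y = make_data()
selected = preselect(X, y, k=5)                      # stage 2
X_sel = [[row[j] for j in selected] for row in X]
w, b = fit_l1_logistic(X_sel, y)                     # stage 3
signature = [j for j, wj in zip(selected, w) if wj != 0.0]
probs = predict_proba(X_sel, w, b)
accuracy = sum((pr >= 0.5) == (lbl == 1) for pr, lbl in zip(probs, y)) / len(y)
print("pre-selected features:", selected)
print("signature (nonzero weights):", signature)
print("training accuracy:", accuracy)
```

The L1 penalty drives the weights of uninformative features to exactly zero, so the features with nonzero weights form the candidate signature. The accuracy printed here is measured on the training data only and says nothing about generalization; a held-out set or cross-validation would be needed for that.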