Braun, Alina (2021)
In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00019052
Ph.D. Thesis, Primary publication, Publisher's Version
Text: Dissertation_BraunAlina_genehmigt.pdf (2MB)
Copyright Information: CC BY-NC-ND 4.0 International - Creative Commons, Attribution NonCommercial, NoDerivs.
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates | ||||
Language: | English | ||||
Referees: | Kohler, Prof. Dr. Michael ; Betz, Prof. Dr. Volker | ||||
Date: | 2021 | ||||
Place of Publication: | Darmstadt | ||||
Collation: | x, 219 pages | ||||
Date of oral examination: | 25 June 2021 | ||||
DOI: | 10.26083/tuprints-00019052 | ||||
Abstract: | In theory, recent results in nonparametric regression show that neural network estimates are able to achieve good rates of convergence provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks, since the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically. In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence. Firstly, for univariate regression functions with p in [1/2,1] we construct a neural network estimate with one hidden layer where the weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data. The interval is large enough to guarantee that the estimate is close to a piecewise constant approximation. Secondly, for multivariate regression functions with p in (0,1] we construct a neural network estimate with one hidden layer where the weights are learned via gradient descent. The initial weights are chosen from specific intervals that depend on the data and the projection directions. This choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen randomly. Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new approximation result for a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a linear equation system. The projection directions are repeatedly chosen randomly. Since we are able to show a rate of convergence that is independent of the dimension of the data, our second and third estimates are able to circumvent the curse of dimensionality. | ||||
Status: | Publisher's Version | ||||
URN: | urn:nbn:de:tuda-tuprints-190528 | ||||
Classification DDC: | 500 Science and mathematics > 510 Mathematics | ||||
Divisions: | 04 Department of Mathematics > Stochastik | ||||
Date Deposited: | 11 Aug 2021 08:47 | ||||
Last Modified: | 11 Aug 2021 08:47 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/19052 | ||||
PPN: | 484189417 | ||||