Thostrup, Lasse Beck (2024)
Towards Network-Accelerated Databases.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00026567
Ph.D. Thesis, Primary publication, Publisher's Version
Full text: PhD_Diss_Lasse_Thostrup.pdf (6MB). Copyright Information: In Copyright.
Item Type: | Ph.D. Thesis |
---|---|
Type of entry: | Primary publication |
Title: | Towards Network-Accelerated Databases |
Language: | English |
Referees: | Binnig, Prof. Dr. Carsten ; Wang, Prof. Tianzheng |
Date: | 3 July 2024 |
Place of Publication: | Darmstadt |
Collation: | xxi, 257 pages |
Date of oral examination: | 10 November 2023 |
DOI: | 10.26083/tuprints-00026567 |
Abstract: | In recent years, data processing systems have changed substantially, most notably by moving towards the disaggregation of resources. This shift separates compute and storage resources into distinct servers, which improves resource utilization because each can be scaled independently based on demand. This development is crucial for cloud-native Database Management Systems (DBMSs), which mainly build on such disaggregated architectures. This thesis examines two significant hardware trends in disaggregated architectures for DBMSs: modern networks and heterogeneous computing. Modern network technologies such as Remote Direct Memory Access (RDMA) are critical for efficient, high-throughput, low-latency data transfer, but make it challenging for DBMSs to achieve optimal performance, because RDMA exposes a low-level interface with a multitude of performance-critical aspects to consider. To address this challenge, this thesis introduces a high-level programming interface, the Data Flow Interface, that specifically targets the needs of data-intensive processing systems. In addition, this thesis highlights the emerging trend toward programmable network devices that offer data processing capabilities in the network. This trend is especially interesting for distributed DBMSs: they have to transfer large amounts of data over the network because of the disaggregated architecture, and typical distributed data processing operations such as joins also have to shuffle data between compute nodes. The thesis evaluates in-network processing devices with typical DBMS operations to investigate their benefits and potential shortcomings. Another trend in the data center is the increasing heterogeneity of computing units, such as GPUs and FPGAs, driven by their fast processing capabilities. Incorporating these heterogeneous devices into disaggregated architectures with fast networks has many merits, because specialized compute units can be exposed as network-attached disaggregated accelerator pools and thus provide flexible and scalable high-performance data processing. This integration of heterogeneous compute units and fast RDMA-capable networks is, however, non-trivial, since networks like RDMA are typically not directly supported on devices other than CPUs and are therefore hard to integrate efficiently. The thesis addresses the challenge of achieving efficient communication between different types of compute devices by proposing a network-driven communication scheme that leverages a programmable switch to carry out the network communication on behalf of the compute devices. |
Status: | Publisher's Version |
URN: | urn:nbn:de:tuda-tuprints-265675 |
Classification DDC: | 000 Generalities, computers, information > 004 Computer science |
Divisions: | 20 Department of Computer Science > Data and AI Systems |
Date Deposited: | 03 Jul 2024 12:23 |
Last Modified: | 04 Jul 2024 08:52 |
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/26567 |
PPN: | 519533526 |