Design and Configuration of Distributed Job Processing Systems

Risse, Thomas (2006)
Design and Configuration of Distributed Job Processing Systems.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication

Preview

Dissertation - PDF
thomas_risse_diss.pdf
Copyright Information: In Copyright.
Download (2MB) | Preview

Preview

Lebenslauf - PDF
thomas_risse_cv.pdf
Copyright Information: In Copyright.
Download (24kB) | Preview

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

Design and Configuration of Distributed Job Processing Systems

Language:

English

Referees:

Aberer, Prof. Dr. Karl ; Buchmann, Prof. Dr. Alejandro P.

Advisors:

Neuhold, Prof. Dr. Erich J.

Date:

7 March 2006

Place of Publication:

Darmstadt

Date of oral examination:

19 January 2006

Abstract:

A key criterion in the design, procurement and use of computer systems is performance. Performance typically means the throughput and response time of a system. The effects of poorly performing systems range from dissatisfied users to high penalties for companies due to missed processing deadlines. As a result of continuously increasing hardware performance, companies often solve performance problems by replacing existing hardware with faster machines. One consequence can be that they achieve a performance increase, but the overall performance increase is less than expected. The reason for this is that the combination of hardware and software does not match. For system designers it would be helpful to have a systematic method which supports them in the design of new systems and in the extension of existing systems. The need for a systematic configuration method is motivated by a typical B2B application from the financial industry. Banks have to deal with several payment messages standards like EDIFACT or S.W.I.F.T. which have to be converted into the banks' internal representation for further processing. Such converters have to handle message size ranging from some 100 bytes to about 60 MB and have to fulfil certain performance requirements. To achieve the performance goals, identification of the hardware and software configuration is an important step in the implementation of a distributed message converter system. This thesis presents a systematic approach for the cost performance analysis of distributed job processing systems based on given requirements on throughput and system response time. Our method allows us to search for suitable configurations while minimizing the use of expensive methods for performance evaluation to the largest degree. The method is organized into a hardware and a software configuration step. For each of these configuration steps algorithms were developed. For the hardware configuration step we first approximate single host performance by a coarse model that requires few, inexpensive to obtain, key parameters. Based on it we perform the hardware selection and determine the workload distribution for the selected host configuration. The workload distribution and the hardware configuration are used to build a Layered Queueing Network model (LQN) of the complete system. It is used to determine a software configuration that actually achieves the performance that has been predicted given the hardware configuration. Since evaluations of the complete model are rather expensive, we use a greedy heuristic, which tries to minimize the number of model evaluations required. We have used our method to configure a large distributed system in order to demonstrate the scalability of the method. For a smaller system configuration we compared the predicted results with real system measurements. The verification on the real system shows that the method could be applied successfully to configure a distributed system to reach maximum performance. As we are using queueing networks for system performance modeling, our system configuration method is based on average system performance values. Hence runtime deviations are not covered during the system design phase and have to be handled during runtime by a scheduler to distribute incoming jobs in an optimal way among the hosts. In our case of the EDI message converter it turned out that the standard online scheduling method doesn't fulfil all requirements. Hence we adapted the Bin Stretching scheduling approach to fulfil the functional requirements of deadline processing and priority processing as well as the system performance requirements of low system response times and high system throughput. The algorithm behavior has been analyzed by simulation in different scenarios corresponding to different message distributions. The simulation results shows that the modified Bin-Stretching strategy generally gives better results than the well known list scheduling in the FCFS variety. We were also able to verify on our real message converter system the general good behavior of our algorithm.

Alternative Abstract:

Alternative Abstract

Language

Ein Maßstab für Entwicklung, Beschaffung und Nutzung von Computer Systemen ist die Systemleistung. Die Leistung eines Computersystems wird typischerweise durch den Datendurchsatz und die Antwortzeit des Systems beschrieben. Die Folgen eines Systems mit schlechter Performance reichen in der Praxis von unzufriedenen Benutzern bis zu hohen Geldstrafen auf Grund von nicht erreichten Verarbeitungszielen. Wegen der ständig zunehmenden Leistungsfähigkeit und sinkenden Preise für Computersysteme werden Leistungsprobleme häufig durch den Kauf von schnelleren Rechnern gelöst. Im Ergebnis wird eine Leistungssteigerung erzielt, allerdings fällt diese häufig geringer aus als erwartet. Die Ursache dafür liegt in einer unpassenden Kombination von Hardwareauswahl und Softwarekonfiguration. Deshalb ist es für Entwickler und Systemdesigner hilfreich, wenn sie eine systematische Methode hätten, die sie in der Auswahl der Hardware und Konfiguration der Software unterstützt. Wir motivieren den Bedarf nach einer systematischen Konfigurationsmethode mit einer typischen B2B Anwendung aus der Finanzindustrie. Banken müssen bei elektronischen Zahlungen verschiedene Nachrichtenformate wie z.B. EDIFACT oder S.W.I.F.T. unterstützen. Für die weitere Verarbeitung ist eine Konvertierung der Nachrichten in das interne Verarbeitungsformat der Bank notwendig. Die Konverter müssen Nachrichtengrößen zwischen 100 Bytes und 60 MBytes verarbeiten können und gleichzeitig gegebene Leistungsanforderungen erfüllen. Die Auswahl der Hardware und Konfiguration der Software ist deshalb ein wichtiger Schritt zum Erreichen der Verarbeitungsziele. Diese Arbeit stellt einen neuen systematischen Ansatz für die Leistungsanalyse von verteilten Systemen vor, der auf vorgegebenen Anforderungen wie Durchsatz und Antwortzeiten basiert. Unsere Methode erlaubt die Suche nach passenden Systemkonfigurationen, bei gleichzeitiger Minimierung von aufwändigen Leistungsevaluationsschritten. Unsere Methode teilt sich auf in die Schritte Hardwarekonfiguration und Softwarekonfiguration. Im Hardwarekonfigurationsschritt leiten wir zunächst ein analytisch zu lösendes Queueing Network Model für die Abschätzung der Leistung eines einzelnen Rechners her, das auf wenigen Schlüsselparametern basiert. Mit Hilfe dieses Models erfolgt in unserem Algorithmus die Hardwareauswahl und gleichzeitig eine Verteilung der Arbeitslasten auf die Rechner. Basierend auf diesen Ergebnissen erfolgt die Konfiguration der Software mit Hilfe eines Layered Queueing Networks (LQN) Models des Systems. Die zielorientierte Suchheuristik verwendet dieses Systemmodel zur Bestimmung der Systemleistung. Zur Demonstration der Skalierbarkeit unserer Methode haben wir sie zur Konfiguration eines komplexen Systems als auch kleineren, realen Systems verwendet. Die anschließenden Messungen verifizierten unsere Vorhersagen und zeigten, dass sich mit unserer Konfigurationsmethode verteilte Systeme erfolgreich konfigurieren lassen. Da wir für unsere Methode Queueing Networks verwenden, basiert unser Model nur auf Durchschnittswerten der Leistungsparameter. Laufzeitabweichungen oder Variationen in den Nachrichtengrößen können nicht berücksichtigt werden. Die Berücksichtigung dieser Variationen erfolgt zur Laufzeit durch einen Prozessplanungsalgorithmus (Scheduling), der eingehende Jobs optimal an die Rechner verteilt. Im Falle unserer Beispielanwendung war es notwendig, einen eigenen Algorithmus zu entwickeln, da die Standardverfahren nicht den Anforderungen entsprachen. In unserem Ansatz haben wir den Bin Stretching Algorithmus angepaßt, sodas dieser die Anforderungen zur Einhaltung von Verarbeitungszeiten und die Verarbeitung von priorisierten Jobs als auch die Leistungsziele wie niedrige Antwortzeiten und hohen Durchsatz erfüllt. Das Verhalten unseres modifizierten Bin Stretching Algorithmus haben wir durch Simulation in verschiedenen Szenarien mit unterschiedlichen Nachrichtenverteilungen und Rechnerkombinationen analysiert. Die Simulationen als auch die Überprüfung im realen System zeigen, dass unser Ansatz im Allgemeinen gleichwertige oder bessere Ergebnisse als vergleichbare Algorithmen zeigt.

German

Uncontrolled Keywords:

Automatische Hardwareauswahl, Automatische Softwarekonfiguration

Alternative keywords: