A Systematic Approach to Benchmark and Improve Automated Static Detection of Java-API Misuses

Amann, Sven (2018)
A Systematic Approach to Benchmark and Improve Automated Static Detection of Java-API Misuses.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication

thesis-amann-2018-06-27.pdf - Accepted Version
Copyright Information: CC BY-SA 4.0 International - Creative Commons, Attribution ShareAlike.

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

A Systematic Approach to Benchmark and Improve Automated Static Detection of Java-API Misuses

Language:

English

Referees:

Mezini, Prof. Dr. Mira ; Zeller, Prof. Dr. Andreas

Date:

27 June 2018

Place of Publication:

Darmstadt

Date of oral examination:

7 May 2018

Abstract:

Today's software industry relies heavily on the reuse of existing software libraries. Such libraries provide the building blocks for modern software products. Reusing them allows developers to focus on innovation, while standing on the shoulders of giants. To use libraries effectively, developers need to know the Application Programming Interfaces (APIs) through which they communicate with the libraries. This includes both the APIs' semantics and the (implicit) usage constraints that come with them. In the face of the rapidly growing and evolving supply of software libraries, this has become a challenging task. As a result, incorrect usages of APIs, or API misuses, are a prevalent cause of software bugs, crashes, and vulnerabilities.

In reaction to this problem, researchers have proposed a multitude of developer-assistance tools. One particular class of such tools automates the detection of API misuses in software code. We call these tools API-misuse detectors. Existing misuse detectors address different aspects of API misuse. However, no attempt has been made to systematically define the problem space of API misuse and to assess the prevalence of API misuses compared to other types of bugs. This makes it impossible to judge the relevance of research on API-misuse detection. Moreover, previous empirical evaluations of misuse detectors commonly measure the detectors' precision. However, since the studies use different datasets, it is unclear to what extent the results are comparable. It is also unclear where the detectors make trade-offs between their precision and their recall.

In this thesis, we first present a systematic analysis of the problem of API misuse. We find that API misuse causes 9.1% of all software bugs in real-world projects, including many critical issues, such as program crashes, data loss, and security vulnerabilities. Then, we survey the literature to consolidate over a decade of research on API-misuse detection and build MUBench, a public automated benchmark for API-misuse detectors. This enables us to conduct the first-ever qualitative and quantitative comparison of existing misuse detectors. We find that these detectors have the potential to discover many API misuses, but suffer from extremely low precision and recall in practice. Finally, we systematically design MUDetect, a new API-misuse detector that addresses many of the problems of existing detectors. Using MUBench, we demonstrate that MUDetect clearly outperforms existing detectors with respect to both precision and recall. Our results provide strong evidence that, following our systematic approach, we can develop API-misuse detectors that are fit for practical application.

Alternative Abstract (German):

Die Wiederverwendung bestehender Softwarebibliotheken ist ein Grundpfeiler der heutigen Softwareindustrie. Solche Bibliotheken stellen Bausteine für moderne Softwareprodukte zur Verfügung. Ihre Verwendung erlaubt es Softwareentwicklern, sich auf innovative Aspekte der Software zu fokussieren, anstatt andauernd das Rad neu zu erfinden. Um Softwarebibliotheken effektiv einzusetzen, müssen Entwickler die Semantik und die (teilweise impliziten) Nutzungsbedingungen der Programmierschnittstellen (APIs) kennen, über die sie mit den Bibliotheken interagieren. Angesichts der hohen Geschwindigkeit, mit der neue Softwarebibliotheken entstehen und bestehende Bibliotheken weiterentwickelt werden, ist dies zu einer immensen Herausforderung geworden. Aus diesem Grund sind inkorrekte Verwendungen von APIs, sogenannte API Misuses, heute weit verbreitet und verursachen Probleme wie Programmabstürze und Sicherheitslücken.

In Reaktion auf derartige Probleme haben Forscher eine Vielzahl von Assistenzwerkzeugen für Softwareentwickler vorgeschlagen. Eine Kategorie solcher Werkzeuge automatisiert die Identifikation von API Misuses in Software-Quelltext. Diese Werkzeuge werden API Misuse-Detektoren genannt. Bestehende Misuse-Detektoren adressieren unterschiedliche Aspekte von API Misuse. Es wurde jedoch bisher kein Versuch unternommen, das Problem von API Misuse systematisch zu definieren und die relative Häufigkeit von API Misuses in der Menge aller Softwarefehler zu erfassen. Daher ist es unmöglich, die Relevanz von Forschungsarbeit bzgl. API Misuse abzuschätzen. Bisherige empirische Untersuchungen von API Misuse-Detektoren haben deren Precision gemessen. Da diesen Untersuchungen jedoch stets unterschiedliche Datensätze zugrunde lagen, ist es unklar, inwieweit die entsprechenden Ergebnisse vergleichbar sind. Ebenso ist unklar, inwieweit die verschiedenen Detektoren niedrigeren Recall zugunsten von höherer Precision in Kauf nehmen.

In der vorliegenden Arbeit präsentieren wir zunächst die Ergebnisse einer systematischen Analyse von API Misuse in Softwareprojekten. Wir belegen, dass API Misuse für 9.1% aller Softwarefehler verantwortlich ist. Viele dieser Fehler haben kritische Auswirkungen, wie Programmabstürze, Datenverlust oder Sicherheitslücken. Anschließend präsentieren wir eine Literaturübersicht über mehr als 10 Jahre Forschungsarbeit an API Misuse-Detektoren und entwickeln MUBench, einen öffentlichen, automatisierten Benchmark für API Misuse-Detektoren. Diese Schritte ermöglichen uns den ersten qualitativen und quantitativen Vergleich von API Misuse-Detektoren. Wir zeigen, dass die Detektoren potentiell viele API Misuses identifizieren können, in der praktischen Anwendung aber sowohl im Hinblick auf Precision als auch im Hinblick auf Recall sehr schlecht abschneiden. Zuletzt stellen wir unseren neuen API Misuse-Detektor MUDetect vor, der viele Probleme von anderen Detektoren gezielt vermeidet. Mithilfe von MUBench zeigen wir, dass MUDetect im Vergleich zu anderen Detektoren sowohl eine deutlich höhere Precision als auch einen deutlich höheren Recall erreicht. Unsere Ergebnisse weisen darauf hin, dass wir mit unserem systematischen Ansatz Detektoren entwickeln können, die für den praktischen Einsatz in der Softwareentwicklung geeignet sind.

URN:

urn:nbn:de:tuda-tuprints-74222

Classification DDC:

000 Generalities, computers, information > 004 Computer science

Divisions:

20 Department of Computer Science
20 Department of Computer Science > Software Technology

Date Deposited:

09 Jul 2018 09:52

Last Modified:

09 Jul 2020 02:06

URI: