Incrementalization of Analyses for Next Generation IDEs

Kloppenburg, Sven (2009)
Incrementalization of Analyses for Next Generation IDEs.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication

Copyright Information: CC BY-NC-ND 2.5 Generic - Creative Commons, Attribution, NonCommercial, NoDerivs.

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

Incrementalization of Analyses for Next Generation IDEs

Language:

English

Referees:

Mezini, Prof. Dr. Mira ; Schürr, Prof. Dr. Andy

Date:

15 November 2009

Place of Publication:

Darmstadt

Date of oral examination:

16 January 2009

Abstract:

To support developers in their day-to-day work, Integrated Development Environments (IDEs) incorporate more and more ways to help developers focus on the inherent complexities of developing increasingly larger software systems. The complexity of developing large software systems can be categorized into inherent complexity, which stems from the complexity of the problem domain, and accidental complexity, which stems from the shortcomings of the tools and methods used to tackle the problem. For example, to spare developers from having to know exactly which methods a certain class provides, IDEs offer autocompletion. To alert developers to errors and potential errors in their use of the programming language, IDEs connect the lists of warnings and errors with their source locations. To ease navigation in bigger projects, IDEs present structural views of the program, such as the type hierarchy. Development environments thus enable developers to be more productive and help them find bugs earlier in the development cycle by using codified expert knowledge.
In these environments, static analyses are used to extract information from the program under development. Static analyses detect properties of programs without running them. In the past, static analyses were mostly integrated into compilers with the goal of checking for errors and producing faster or smaller code. Integrating static analyses into the IDE opens up new areas of application. Among these are domain-specific analyses, optional type systems, and checks for structural properties. Domain-specific analyses check properties specific to the program under development, for example that the use of a framework conforms to the specifications. Optional type systems are type systems that do not influence the runtime semantics. This allows multiple type systems (e.g. confined types and the built-in Java type system) to coexist and to be checked by static analyses.
If these analyses are available to developers, a wider range of software defects can be detected. By integrating the analyses into the IDE, faster and better feedback can be delivered. This enables developers to incorporate the analyses into their daily workflow, as it preserves the immediacy of the feedback. To gain the full advantage of IDE integration, the analyses need to be integrated into the incremental build process of the IDE, and the rule bases should be modularly modifiable to fit the program under inspection. One example of an open, modular approach to achieve this is Magellan. Magellan is an open static analysis platform, integrated into the build process, that tackles the problems of integrating static analyses with the IDE and in particular with the incremental build process. To benefit from this integration, analyses running on such platforms need to work in an incremental fashion.
In this thesis, approaches for incrementalizing static analyses for integrated open static analysis platforms are analyzed. Incrementalizing a static analysis means that the analysis uses the result from a previous build and the changes made to the program as additional input to reconcile the result from the previous build with the current state of the program. The reconciled result is equal to that of an analysis of the full build. The approaches can be categorized into manual incrementalization and automatic incrementalization. Manual incrementalization uses a general-purpose language, such as Java, to implement a static analysis that achieves the incrementalization using a special-purpose algorithm. Automatic incrementalization means that the analysis is written with the full build in mind, and the underlying language or framework has a built-in mechanism to reconcile the results for the changed program. Currently, incremental analyses are developed in an ad hoc fashion, choosing the approach the developer is most familiar with.
If the approach taken is not the best for the problem at hand, then either the development will take longer or the analysis will run slower than necessary. To investigate the properties of analyses that influence the recommended approach to incrementalization, three static analyses have been selected. The analyses were implemented twice: once using the manual approach and once using the automatic approach. The three selected analyses represent analyses that check for data flow properties, control flow properties, and structural properties of the inspected program. The analysis that checks for data flow properties searches for violations of an optional type system for confined types. The analysis that checks for control flow properties incrementally computes the call graph using rapid type analysis (RTA). Finally, the static analysis that checks for structural properties searches for violations of structural dependencies between concerns in the program.
The results indicate that analyses incorporating query engines to be used by the user of the analysis need to use automatic incrementalization, at least for this purpose. Analyses that can be configured only in narrow, predictable ways lend themselves to manual incrementalization. In that case, the domain knowledge allows for domain-specific optimizations that cannot easily be integrated into the frameworks for automatic incrementalization.
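The definition of incrementalization given in the abstract (reuse the previous result plus the set of changes, and arrive at the same result as a full build) can be illustrated with a minimal sketch of the manual approach. The sketch is not taken from the thesis or from Magellan: the program model, the "ui."/"db." dependency rule, and all function names are hypothetical, and the analysis is deliberately local to a single source unit, which is what makes the reconciliation step this simple. Analyses with non-local dependencies, such as call graph construction, need considerably more bookkeeping.

```python
from typing import Dict, Set, Tuple

# Hypothetical program model: each source unit maps to the units it references.
Program = Dict[str, Set[str]]
Violation = Tuple[str, str]
Result = Dict[str, Set[Violation]]


def violations_of(unit: str, refs: Set[str]) -> Set[Violation]:
    """Hypothetical structural rule: 'ui.' units must not reference 'db.' units."""
    if not unit.startswith("ui."):
        return set()
    return {(unit, target) for target in refs if target.startswith("db.")}


def full_analysis(program: Program) -> Result:
    """Analyze every unit from scratch; results are keyed by the unit analyzed."""
    return {unit: violations_of(unit, refs) for unit, refs in program.items()}


def incremental_analysis(previous: Result, program: Program, changed: Set[str]) -> Result:
    """Reconcile the previous result by re-analyzing only the changed units."""
    result = dict(previous)
    for unit in changed:
        result.pop(unit, None)   # drop stale results; this also covers deleted units
        if unit in program:      # re-analyze units that were added or modified
            result[unit] = violations_of(unit, program[unit])
    return result


# Usage: one unit is modified between two builds.
old_build = {"ui.Form": {"core.Model"}, "core.Model": {"db.Table"}}
new_build = {"ui.Form": {"core.Model", "db.Table"}, "core.Model": {"db.Table"}}
previous = full_analysis(old_build)
updated = incremental_analysis(previous, new_build, {"ui.Form"})
assert updated == full_analysis(new_build)
```

The final assertion states the correctness criterion from the abstract: the reconciled result must equal the result of analyzing the full build.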

Alternative Abstract:

Alternative Abstract

Language

To support developers in their daily work, integrated development environments (IDEs) incorporate more and more aids that allow developers to concentrate on the inherent complexity of developing increasingly larger software systems. The complexity of developing these systems is divided into inherent complexity, which stems from the complexity of the problem, and accidental complexity, which stems from the inadequacy of the tools and methods used and can therefore be eliminated by better tools. For instance, IDEs offer autocompletion so that developers do not have to remember the exact spelling of methods. To alert developers to (potential) errors in their use of the programming language, error messages in the IDE are linked to the source code. To ease navigation in projects, IDEs offer structural views of the program, such as the type hierarchy. IDEs enable developers to be more productive and to find errors earlier by using codified expert knowledge. In IDEs, static analyses are used to extract information from the program under development. Static analyses detect properties of programs without executing them. In the past, static analyses were mostly integrated into compilers in order to find errors and to produce smaller or faster code. When static analyses are integrated into IDEs, new areas of application open up for them. Among these are domain-specific analyses, optional type systems, and the checking of structural properties. Optional type systems are type systems that do not change the runtime semantics. This makes it possible to combine multiple type systems (for example confined types and the Java type system) and to have them checked by static analyses.
If these analyses are available to developers, a wider range of software defects can be detected. By integrating the analyses into the development environment, the developer can be given faster and better feedback. This allows developers to integrate the analyses into their everyday workflow, since the immediacy of the feedback is preserved. To gain the full advantage of IDE integration, the analyses must on the one hand be embedded into the incremental build process, and on the other hand the analyses that are executed must be adaptable to the program under inspection. One example of an open, modular approach to achieve this is Magellan. Magellan is an open static analysis platform that is integrated into the incremental build process and that makes it possible to integrate static analyses into IDEs and in particular into the incremental build process. This thesis examines approaches to incrementalizing static analyses for integrated, open static analysis platforms. Incrementalizing a static analysis means that the analysis uses the results of a previous build and the changes to the program to bring the analysis result in line with the current state of the program. The analysis result is then equivalent to a complete analysis of the program in its current state. The approaches can be divided into manual and automatic incrementalization. Manual incrementalization uses a general-purpose programming language, such as Java, to implement a static analysis that realizes the incrementalization in a specialized algorithm.
With automatic incrementalization, the analysis is written as it would be for the complete analysis, because the underlying language or framework provides a mechanism to adapt the analysis results to the program changes. At present, incremental analyses are developed ad hoc, using the approach most familiar to the developer. But if the approach is not the most suitable one for the problem, the development time or the runtime of the analysis will be longer than necessary. To investigate the properties of analyses that influence the choice of approach, three analyses were selected. Each of these analyses was implemented once with the manual and once with the automatic approach. The selected analyses represent analyses that examine data flow and control flow, as well as analyses that check structural properties. The analysis that checks data flow properties searches for violations of the optional type system confined types. The analysis that examines control flow builds and maintains an intraprocedural call graph using rapid type analysis (RTA). The analysis that checks structural properties searches for violations of structural dependencies between concerns in the program. The results indicate that analyses that include query engines should use automatic incrementalization at least for that part. Analyses that can be configured only in simple, predictable ways are better suited to manual incrementalization. In that case, knowledge about the problem domain can enable optimizations that cannot readily be integrated into environments for automatic incrementalization.

German

Uncontrolled Keywords:

incremental static analysis, prolog, java, IDE, DSL, confined types, RTA, call graph, software architecture

Alternative keywords: