Detection and Exploitation of Information Flow Leaks

Do, Quoc Huy (2017)
Detection and Exploitation of Information Flow Leaks.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication

Preview

Text
PhDthesis_Huy.pdf - Accepted Version
Copyright Information: CC BY-NC-ND 4.0 International - Creative Commons, Attribution NonCommercial, NoDerivs.
Download (960kB) | Preview

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

Detection and Exploitation of Information Flow Leaks

Language:

English

Referees:

Hähnle, Prof. Dr. Reiner ; Sands, Prof. Dr. David

Date:

15 March 2017

Place of Publication:

Darmstadt

Date of oral examination:

27 April 2017

Abstract:

This thesis contributes to the field of language-based information flow analysis with a focus on detection and exploitation of information flow leaks in programs. To achieve this goal, this thesis presents a number of precise semi-automatic approaches that allow one to detect, exploit and judge the severity of information flow leaks in programs.

The first part of the thesis develops an approach to detect and demonstrate information flow leaks in a program. This approach analyses a given program statically using symbolic execution and self-composition with the aim to generate so-called insecurity formulas whose satisfying models (obtained by SMT solvers) give rise to pairs of initial states that demonstrate insecure information flows. Based on these models, small unit test cases, so-called leak demonstrators, are created that check for the detected information flow leaks and fail if these exist. The developed approach is able to deal with unbounded loops and recursive method invocation by using program specifications like loop invariants or method contracts. This allows the approach to be fully precise (if needed) but also to abstract and allow for false positives in exchange for a higher degree of automation and simpler specifications. The approach supports several information flow security policies, namely, noninterference, delimited information release, and information erasure.

The second part of the thesis builds upon the previous approach that allows the user to judge the severity of an information flow leak by exploiting the detected leaks in order to infer the secret information. This is achieved by utilizing a hybrid analysis which conducts an adaptive attack by performing a series of experiments. An experiment constitutes a concrete program run which serves to accumulate the knowledge about the secret. Each experiment is carried out with optimal low inputs deduced from the prior distribution and the knowledge of secret so that the potential leakage is maximized. We propose a novel approach to quantify information leakages as explicit functions of low inputs using symbolic execution and parametric model counting. Depending on the chosen security metric, general nonlinear optimization tools or Max-SMT solvers are used to find optimal low inputs, i.e., inputs that cause the program to leak a maximum of information.

For the purpose of evaluation, both approaches have been fully implemented in the tool KEG, which is based on the state-of-the-art program verification system KeY. KEG supports a rich subset of sequential Java programs and generates executable JUnit tests as leak demonstrators. For the secret inference, KEG produces executable Java programs and runs them to perform the adaptive attack.

The thesis discusses the planning, execution, and results of the evaluation. The evaluation has been performed on a collection of micro-benchmarks as well as two case studies, which are taken from the literature.

The evaluation using the micro-benchmarks shows that KEG detects successfully all information flow leaks and is able to generate correct demonstrators in case the supplied specifications are correct and strong enough. With respect to secret inference, it shows that the approach presented in this thesis (which computes optimal low inputs) helps an attacker to learn the secret much more efficiently compared to approaches using arbitrary low inputs.

KEG has also been evaluated in two case studies. The first case study is performed on an e-voting software which has been extracted in a simplified form from a real-world e-voting system. This case study focuses on the leak detection and demonstrator generation approach. The e-voting case study shows that KEG is able to deal with relatively complicated programs that include unbounded loops, objects, and arrays. Moreover, the case study demonstrates that KEG can be integrated with a specification generation tool to obtain both precision and full automation. The second case study is conducted on a PIN integrity checking program, adapted from a real-world ATM PIN verifying system. This case study mainly demonstrates the secret inference feature of KEG. It shows that KEG can help an attacker to learn the secret more efficiently given a good enough assumption about the prior distribution of secret.

Alternative Abstract:

Alternative Abstract

Language

Diese Dissertation befasst sich mit dem Bereich der sprachbasierten Informationsflussanalyse und erweitert den aktuellen Kenntnisstand insbesondere in den Gebieten der Erkennung und Ausnutzung unsicheren Informationsflusses in Programmen.

Zum Erreichen dieses Ziels werden im Rahmen dieser Arbeit Verfahren entwickelt, die es ermöglichen, unsicheren Informationsfluss in Programmen zu erkennen und dessen Folgen fundiert einzuschätzen. Die entwickelten Verfahren zeichnen sich durch einen hohen Automatisierungsgrad sowie eine hohe Präzision aus.

Im ersten Teil der Dissertation wird ein Verfahren entwickelt, das es erlaubt unsicheren Informationsfluss zu entdecken und aufzuzeigen. Das entwickelte Verfahren kombiniert symbolische Ausführung mit Selbstkomposition (self-composition) von Programmen mit dem Ziel logische Formeln, sog. Unsicherheitsformeln, abzuleiten. Die Modelle dieser Formeln beschreiben initiale Zustandspaare, mit deren Hilfe unsicherer Informationsfluss demonstriert werden kann. Dazu werden Unit-Tests (Leak Demonstrators) erzeugt, die das Programm jeweils in einem der initialen Zustände ausführen und fehlschlagen, wenn ein unsicherer Informationsfluss entdeckt wird. Das entwickelte Verfahren unterstützt unbegrenzte Schleifenausführungen und rekursive Methodenaufrufe mit Hilfe von Programmspezifikationen wie Schleifeninvarianten und Methodenverträgen. Diese Vorgehensweise erlaubt auf der einen Seite eine hohe Präzision (d.h., keine False Positives) und auf der anderen Seite aber auch Abstraktion (und damit eine Zunahme von False Positives) bei höherem Automatisierungsgrad. Der Ansatz unterstützt die Informationsflusspolitiken Noninterference, Delimited Information Release und Information Erasure.

Der zweite Teil der Dissertation baut auf dem oben beschrieben Verfahren auf. Es erlaubt jedoch nicht nur die Demonstration des Vorhandenseins von unsicherem Informationsfluss, sondern ermöglicht es dessen Auswirkungen einzuschätzen. Dazu bettet es das bisherige Verfahren in eine hybride Analyse ein, die einen adaptiven Angriff auf das Programm unter Ausnutzung der entdeckten Unsicherheitsstellen realisiert. Das entwickelte Verfahren führt dazu eine Reihe von Experimenten durch und reichert systematisch das Wissen über das im Programm enthaltene Geheimnis an. Ein Experiment besteht aus einer konkreten Programmausführung mit optimal gewählten Eingabewerten basierend auf einer angenommenen A-priori-Verteilung der Werte des Geheimnisses und des durch die vorherigen Experiment angesammelten Wissens über das aktuell vorliegende Geheimnis. Optimal bedeutet, dass die Programmausführung einen maximalen Wissenszuwachs bedingt. Das entwickelte Verfahren grenzt sich von bisher existierenden Ansätzen dadurch ab, dass es den Informationsfluss als explizite Funktion in Abhängigkeit von öffentlichen Eingabewerten (low inputs), basierend auf Ergebnissen aus der symbolischen Programmausführung und parametrischer Modellzählung präsentiert. In Abhängigkeit von der gewählten Sicherheitsmetrik werden allgemeine nicht-lineare Optimierer und Max-SMT-Problemlöser verwendet, um optimale Eingabewerte zur Geheimnisextraktion zu bestimmen.

Die beiden Ansätze wurden implementiert und sind als Programmanalysewerkzeug KEG, basierend auf dem deduktiven Verifikationssystem KeY, verfügbar. KEG ermöglicht (zu einem hohen Grad) die Analyse sequentieller Java-Programme und erzeugt JUnit Testfälle, um unsicheren Informationsfluss zu demonstrieren. Für die Geheimnisextraktion erzeugt KEG zum Durchspielen eines Angriffs kleine Java Programme und führt diese aus.

Im Rahmen dieser Arbeit wurden die Verfahren mit Hilfe von Microbenchmarks und zweier Fallstudien evaluiert. Die Arbeit beschreibt die Planung sowie Durchführung der Evaluationen und diskutiert deren Ergebnisse.

Die Durchführung der Microbenchmarks zeigte, dass KEG alle unsicheren Informationsflüsse erkennen konnte und entsprechende Demonstratoren erzeugte, sofern die notwendigen Spezifikation korrekt und ausreichend vollständig waren. Es konnte auch nachgewiesen werden, dass KEG bei der Geheimnisextraktion wesentlich effizienter ist als ein Angreifer, der mit zufällig gewählten Experimenten arbeitet.

KEG bewies in den beiden Fallstudien, dass die Verfahren auch zur Analyse realistischerer Programme eingesetzt werden können. Die erste Fallstudie betrachtet eine vereinfachte Version einer realen e-Voting Software mit unbegrenzten Schleifenausführungensowie, Objekt- und Arraydatentypen. Neben der allgemeinen Anwendbarkeit von KEG konnte insbesondere demonstriert werden, dass sich KEG gut mit einem Spezifikationserzeugungswerkzeug kombinieren lässt, um eine vollautomatische Analyse des Programms zu erreichen.

Die zweite Fallstudie betrachtet einem der Literatur entnommenen Algorithmus zum Überprüfen der Integrität von PINs. KEG bewies hier, dass es unter Annahme einer ausreichend guten A-priori-Verteilung der Geheimniswerte einem Angreifer erlaubt, die geheimen PINs effizienter zu extrahieren.

German

URN:

urn:nbn:de:tuda-tuprints-62587

Classification DDC:

000 Generalities, computers, information > 004 Computer science

Divisions:

20 Department of Computer Science > Software Engineering

Date Deposited:

26 May 2017 11:17

Last Modified:

09 Jul 2020 01:39

URI: