TU Darmstadt

ULB

TUprints

Analyzing and Applying Cryptographic Mechanisms to Protect Privacy in Applications

Treiber, Amos (2022)
Analyzing and Applying Cryptographic Mechanisms to Protect Privacy in Applications.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00022922
Ph.D. Thesis, Primary publication, Publisher's Version

Text
dissertation_treiber.pdf
Copyright Information: In Copyright.
Download (4MB)

Item Type:

Ph.D. Thesis

Type of entry:

Primary publication

Title:

Analyzing and Applying Cryptographic Mechanisms to Protect Privacy in Applications

Language:

English

Referees:

Schneider, Prof. Dr. Thomas ; Kamara, Prof. Dr. Seny

Date:

2022

Place of Publication:

Darmstadt

Collation:

147 Seiten in verschiedenen Zählungen

Date of oral examination:

17 November 2022

DOI:

10.26083/tuprints-00022922

Abstract:

Privacy-Enhancing Technologies (PETs) emerged as a technology-based response to the increased collection and storage of data as well as the associated threats to individuals' privacy in modern applications. They rely on a variety of cryptographic mechanisms that allow to perform some computation without directly obtaining knowledge of plaintext information. However, many challenges have so far prevented effective real-world usage in many existing applications. For one, some mechanisms leak some information or have been proposed outside of security models established within the cryptographic community, leaving open how effective they are at protecting privacy in various applications. Additionally, a major challenge causing PETs to remain largely academic is their practicality-in both efficiency and usability. Cryptographic mechanisms introduce a lot of overhead, which is mostly prohibitive, and due to a lack of high-level tools are very hard to integrate for outsiders.

In this thesis, we move towards making PETs more effective and practical in protecting privacy in numerous applications. We take a two-sided approach of first analyzing the effective security (cryptanalysis) of candidate mechanisms and then building constructions and tools (cryptographic engineering) for practical use in specified emerging applications in the domain of machine learning crucial to modern use cases. In the process, we incorporate an interdisciplinary perspective for analyzing mechanisms and by collaboratively building privacy-preserving architectures with requirements from the application domains' experts.

Cryptanalysis. While mechanisms like Homomorphic Encryption (HE) or Secure Multi-Party Computation (SMPC) provably leak no additional information, Encrypted Search Algorithms (ESAs) and Randomization-only Two-Party Computation (RoTPC) possess additional properties that require cryptanalysis to determine effective privacy protection.

ESAs allow for search on encrypted data, an important functionality in many applications. Most efficient ESAs possess some form of well-defined information leakage, which is cryptanalyzed via a breadth of so-called leakage attacks proposed in the literature. However, it is difficult to assess their practical effectiveness given that previous evaluations were closed-source, used restricted data, and made assumptions about (among others) the query distribution because real-world query data is very hard to find. For these reasons, we re-implement known leakage attacks in an open-source framework and perform a systematic empirical re-evaluation of them using a variety of new data sources that, for the first time, contain real-world query data. We obtain many more complete and novel results where attacks work much better or much worse than what was expected based on previous evaluations.

RoTPC mechanisms require cryptanalysis as they do not rely on established techniques and security models, instead obfuscating messages using only randomizations. A prominent protocol is a privacy-preserving scalar product protocol by Lu et al. (IEEE TPDS'13). We show that this protocol is formally insecure and that this translates to practical insecurity by presenting attacks that even allow to test for certain inputs, making the case for more scrutiny of RoTPC protocols used as PETs.

This part of the thesis is based on the following two publications: [KKM+22] S. KAMARA, A. KATI, T. MOATAZ, T. SCHNEIDER, A. TREIBER, M. YONLI. “SoK: Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data”. In: 7th IEEE European Symposium on Security and Privacy (EuroS&P’22). Full version: https://ia.cr/2021/1035. Code: https://encrypto.de/code/LEAKER. IEEE, 2022, pp. 90–108. Appendix A. [ST20] T. SCHNEIDER , A. TREIBER. “A Comment on Privacy-Preserving Scalar Product Protocols as proposed in “SPOC””. In: IEEE Transactions on Parallel and Distributed Systems (TPDS) 31.3 (2020). Full version: https://arxiv.org/abs/1906.04862. Code: https://encrypto.de/code/SPOCattack, pp. 543–546. CORE Rank A*. Appendix B.

Cryptographic Engineering. Given the above results about cryptanalysis, we investigate using the leakage-free and provably-secure cryptographic mechanisms of HE and SMPC to protect privacy in machine learning applications. As much of the cryptographic community has focused on PETs for neural network applications, we focus on two other important applications and models: Speaker recognition and sum product networks. We particularly show the efficiency of our solutions in possible real-world scenarios and provide tools usable for non-domain experts.

In speaker recognition, a user's voice data is matched with reference data stored at the service provider. Using HE and SMPC, we build the first privacy-preserving speaker recognition system that includes the state-of-the-art technique of cohort score normalization using cohort pruning via SMPC. Then, we build a privacy-preserving speaker recognition system relying solely on SMPC, which we show outperforms previous solutions based on HE by a factor of up to 4000x. We show that both our solutions comply with specific standards for biometric information protection and, thus, are effective and practical PETs for speaker recognition.

Sum Product Networks (SPNs) are noteworthy probabilistic graphical models that-like neural networks-also need efficient methods for privacy-preserving inference as a PET. We present CryptoSPN, which uses SMPC for privacy-preserving inference of SPNs that (due to a combination of machine learning and cryptographic techniques and contrary to most works on neural networks) even hides the network structure. Our implementation is integrated into the prominent SPN framework SPFlow and evaluates medium-sized SPNs within seconds.

This part of the thesis is based on the following three publications: [NPT+19] A. NAUTSCH, J. PATINO, A. TREIBER, T. STAFYLAKIS, P. MIZERA, M. TODISCO, T. SCHNEIDER, N. EVANS. Privacy-Preserving Speaker Recognition with Cohort Score Normalisation”. In: 20th Conference of the International Speech Communication Association (INTERSPEECH’19). Online: https://arxiv.org/abs/1907.03454. International Speech Communication Association (ISCA), 2019, pp. 2868–2872. CORE Rank A. Appendix C. [TNK+19] A. TREIBER, A. NAUTSCH , J. KOLBERG , T. SCHNEIDER , C. BUSCH. “Privacy-Preserving PLDA Speaker Verification using Outsourced Secure Computation”. In: Speech Communication 114 (2019). Online: https://encrypto.de/papers/TNKSB19.pdf. Code: https://encrypto.de/code/PrivateASV, pp. 60–71. CORE Rank B. Appendix D. [TMW+20] A. TREIBER , A. MOLINA , C. WEINERT , T. SCHNEIDER , K. KERSTING. “CryptoSPN: Privacy-preserving Sum-Product Network Inference”. In: 24th European Conference on Artificial Intelligence (ECAI’20). Full version: https://arxiv.org/abs/2002.00801. Code: https://encrypto.de/code/CryptoSPN. IOS Press, 2020, pp. 1946–1953. CORE Rank A. Appendix E.

Overall, this thesis contributes to a broader security analysis of cryptographic mechanisms and new systems and tools to effectively protect privacy in various sought-after applications.

Alternative Abstract:

Alternative Abstract

Language

Privatsphäre-Erweiternde Technologien (PETs) entstanden als technologiebasierte Antwort auf die erhöhte Erhebung und Speicherung von Daten sowie der assoziierten Bedrohungen der Privatsphäre von Individuen in modernen Anwendungen. Sie beruhen auf verschiedenen kryptografischen Mechanismen, die sichere Berechnungen erlauben, ohne dass Wissen über Klartextinformationen erlangt wird. Allerdings haben viele Herausforderungen bisher eine effektive Nutzung in der realen Welt in vielen existierenden Anwendungen verhindert. Einerseits geben manche Mechanismen einige Informationen preis oder wurden außerhalb der in der kryptografischen Gemeinschaft etablierten Sicherheitsmodelle vorgeschlagen, was offen lässt, wie effektiv sie darin sind, in verschiedenen Anwendungen die Privatsphäre zu schützen. Zusätzlich ist eine hauptsächliche Herausforderung, die PETs größtenteils akademisch bleiben lässt, ihre Praktikalität-sowohl für Effizienz als auch Nutzbarkeit. Kryptografische Mechanismen verursachen einen großen Overhead, der meistens untragbar ist, und sind für Außenstehende wegen eines Fehlens abstrakter Tools sehr schwer zu integrieren.

In dieser Dissertation richten wir uns danach, PETs effizienter und praktischer darin zu gestalten, in verschiedenen Anwendungen die Privatsphäre zu schützen. Wir nutzen eine zweiseitige Herangehensweise, die zuerst die effektive Sicherheit geeigneter Mechanismen analysiert (Kryptoanalyse) und dann Konstruktionen und Tools für praktischen Nutzen in spezifizierten, aufkommenden Anwendungen in der für moderne Anwendungen entscheidende Domäne des maschinellen Lernens baut (kryptografische Entwicklung). Währenddessen bauen wir eine interdisziplinäre Perspektive ein, sowohl für die Analyse von Mechanismen als auch für das gemeinsame Bauen Privatsphäre-schützender Architekturen mit den Anforderungen von den Experten der Anwendungsdomäne.

Kryptoanalyse. Während Mechanismen wie homomorphe Verschlüsselung (im Englischen Homomorphic Encryption; HE) oder sichere Mehrparteienberechnung (im Englischen Secure Multi-Party Computation; SMPC) beweisbar keine weiteren Informationen preisgeben, besitzen verschlüsselte Suchalgorithmen (im Englischen Encrypted Search Algorithms; ESAs) und nur Randomisierung-basierte Zweiparteienberechnung (im Englischen Randomization-only Two-Party Computation; RoTPC) weitere Eigenschaften, die Kryptoanalyse benötigen, um effektiven Privatsphärenschutz festzustellen.

ESAs erlauben die Suche auf verschlüsselten Daten, was eine wichtige Funktionalität in vielen Anwendungen ist. Die meisten effizienten ESAs besitzen eine Art von wohldefinierter Preisgabe von Informationen (im Englischen Leakage), die durch eine Breite an in der Literatur vorgeschlagenen, sogenannten Leakage-Angriffen kryptoanalysiert wird. Es ist allerdings schwer, deren praktische Effektivität festzustellen, da vorige Evaluationen keinen Quellcode bereitgestellt haben, eingeschränkte Daten benutzt haben und viele Annahmen getroffen haben, u.a. über die Verteilung der Suchanfragen, da echte Suchanfragedaten sehr schwer zu finden sind. Aus diesen Gründen reimplementieren wir bekannte Leakage-Angriffe in einer offen verfügbaren Softwareplattform und führen eine systematische, empirische Reevaluation dieser Angriffe auf verschiedenen neuen Datenquellen aus, die zum ersten Mal echte Suchanfragen enthalten. Wir bekommen viel komplettere und neue Resultate, wobei Angriffe viel besser oder viel schlechter als basierend auf den vorigen Evaluationen erwartet funktionieren.

RoTPC-Mechanismen brauchen Kryptoanalyse, weil sie nicht auf etablierten Techniken und Sicherheitsmodellen basieren und stattdessen Nachrichten nur mit Randomisierung verschleiern. Ein prominentes Protokoll ist ein Privatsphäre-schützendes Skalarproduktprotokoll von Lu et al. (IEEE TPDS'13). Wir zeigen, dass dieses Protokoll formal unsicher ist und dass sich das zu einer praktischen Unsicherheit übersetzt, indem wir Angriffe präsentieren, die es sogar erlauben, auf bestimmte Eingaben zu testen, was Argumente für die genaueste Überprüfung von RoTPC-Protokollen als PETs liefert.

Dieser Teil der Dissertation basiert auf den folgenden zwei Publikationen: [KKM+22] S. KAMARA, A. KATI, T. MOATAZ, T. SCHNEIDER, A. TREIBER, M. YONLI. “SoK: Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data”. In: 7th IEEE European Symposium on Security and Privacy (EuroS&P’22). Full version: https://ia.cr/2021/1035. Code: https://encrypto.de/code/LEAKER . IEEE, 2022, S. 90–108. Appendix A. [ST20] T. SCHNEIDER , A. TREIBER. “A Comment on Privacy-Preserving Scalar Product Protocols as proposed in “SPOC””. In: IEEE Transactions on Parallel and Distributed Systems (TPDS) 31.3 (2020). Full version: https://arxiv.org/abs/1906.04862. Code: https://encrypto.de/code/SPOCattack, S. 543–546. CORE Rank A*. Appendix B.

Kryptografische Entwicklung. Unter den obenstehenden Kryptoanalyse-Resultaten untersuchen wir die Nutzung von Leakage-freien und beweisbar sicheren kryptografischen Mechanismen HE und SMPC, um die Privatsphäre in Anwendungen des maschinellen Lernens zu schützen. Da ein Großteil der kryptografischen Gemeinschaft sich auf PETs für Anwendungen mit neuronalen Netzwerken fokussiert hat, fokussieren wir uns auf zwei weitere Anwendungen und Modelle: Sprecherauthentifizierung und Summe-Produkt-Netzwerke. Wir zeigen besonders die Effizienz unserer Lösungen in möglichen, echten Szenarien und stellen Tools für außenstehende Experten zur Verfügung.

Bei der Sprecherauthentifizierung werden Sprachdaten eines Nutzenden mit Referenzdaten verglichen, die bei einem Dienstleistungsanbieter gespeichert sind. Wir benutzen HE und SMPC, um das erste Privatsphäre-schützende Sprecherauthentifizierungssystem zu bauen, das die neusten Techniken mit Normalisierung von Scores mittels Kohorten durch eine Kohortenkürzung mit SMPC beinhaltet. Dann bauen wir ein Privatsphäre-schützende Sprecherauthentifizierungssystem, das nur auf SMPC basiert, und zeigen, dass es vorige HE-basierte Lösungen um einen Faktor bis zu 4000x übertrifft. Wir zeigen, dass unsere beide Lösungen spezifische Standards zum Schutz biometrischer Daten erfüllen und daher effektive und praktische PETs für Sprecherauthentifizierung sind.

Summe-Produkt-Netzwerke (SPNs) sind nennenswerte probabilistische, grafische Modelle die-wie neuronale Netzwerke-auch effiziente Methoden für Privatsphäre-schützende Inferenz als eine PET benötigen. Wir präsentieren CryptoSPN, welches SMPC für Privatsphäre-schützende Inferenz von SPNs nutzt, die (aufgrund einer Kombination von Techniken des maschinellen Lernens und der Kryptografie) sogar die Netzwerkstruktur versteckt. Unsere Implementierung ist in die prominente SPN-Softwareplattform SPFlow integriert und evaluiert mittelgroße SPNs innerhalb von Sekunden.

Dieser Teil der Dissertation basiert auf den folgenden drei Publikationen: [NPT+19] A. NAUTSCH, J. PATINO, A. TREIBER, T. STAFYLAKIS, P. MIZERA, M. TODISCO, T. SCHNEIDER, N. EVANS. Privacy-Preserving Speaker Recognition with Cohort Score Normalisation”. In: 20th Conference of the International Speech Communication Association (INTERSPEECH’19). Online: https://arxiv.org/abs/1907.03454. International Speech Communication Association (ISCA), 2019, S. 2868–2872. CORE Rank A. Appendix C. [TNK+19] A. TREIBER, A. NAUTSCH , J. KOLBERG , T. SCHNEIDER , C. BUSCH. “Privacy-Preserving PLDA Speaker Verification using Outsourced Secure Computation”. In: Speech Communication 114 (2019). Online: https://encrypto.de/papers/TNKSB19.pdf. Code: https://encrypto.de/code/PrivateASV, S. 60–71. CORE Rank B. Appendix D. [TMW+20] A. TREIBER , A. MOLINA , C. WEINERT , T. SCHNEIDER , K. KERSTING. “CryptoSPN: Privacy-preserving Sum-Product Network Inference”. In: 24th European Conference on Artificial Intelligence (ECAI’20). Full version: https://arxiv.org/abs/2002.00801. Code: https://encrypto.de/code/CryptoSPN. IOS Press, 2020, S. 1946–1953. CORE Rank A. Appendix E.

Insgesamt trägt diese Thesis zu einer breiteren Sicherheitsanalyse kryptografischer Mechanismen und zu neuen Systemen und Tools zum effektiven Schutz der Privatsphäre in verschiedenen gefragten Anwendungen bei.

German

Status:

Publisher's Version

URN:

urn:nbn:de:tuda-tuprints-229225

Classification DDC:

000 Generalities, computers, information > 004 Computer science

Divisions:

20 Department of Computer Science > Cryptography and Privacy Engineering (ENCRYPTO)

Date Deposited:

28 Nov 2022 13:46

Last Modified: