Inter-Model Consistency Checking and Restoration with Triple Graph Grammars DEM FACHBERE ICH ELEKTROTECHNIK UND INFORMAT IONSTECHNIK DER TECHNISCHEN UNIVERS ITÄT DARMSTADT ZUR ERLANGUNG DES AKADEMISCHEN GRADES E INES DOKTOR- INGENIEURS (DR . - ING . ) GENEHMIGTE D I S SERTAT ION VON ERHAN LEBLEB IC I GEBOREN AM 3 1 . MA I 1 9 8 8 IN I STANBUL , TÜRKE I ERSTGUTACHTER : PROF. DR . ANDY SCHÜRR ZWE ITGUTACHTER : PROF. DR . BERNHARD WESTFECHTEL TAG DER E INRE ICHUNG : 3 0 . 0 1 . 2 0 1 8 TAG DER D I SPUTAT ION : 0 4 . 0 5 . 2 0 1 8 DARMSTADT 20 1 8 The work of Erhan Leblebici was supported by the Graduate School of Excellence Computational Engineering (CE) at the Technische Universität Darmstadt. The GraTraM project presented in this thesis was funded by the German Federal Ministry of Education and Research, funding code 01|S12054, in the context of the Software Campus (www.softwarecampus.de). Leblebici, Erhan Inter-Model Consistency Checking and Restoration with Triple Graph Grammars Darmstadt, Technische Universität Darmstadt Year of publication at TUprints: 2018 Disputation date: May 4, 2018 Published under CC BY-SA 4.0 International https://creativecommons.org/licenses/ www.softwarecampus.de https://creativecommons.org/licenses/ DECLARAT ION OF AUTHORSH IP I warrant that the thesis presented here is my original work and that I have not received outside assistance. All references and other sources that I used have been appropriately acknowledged in the work. I further declare that the work has not been submitted anywhere else for the purpose of academic examination, either in its original or similar form. I hereby grant the Real-Time Systems Lab the right to publish, reproduce and distribute my work. Darmstadt, 30.01.2018 Erhan Leblebici Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better. Edsger Wybe Dijkstra ACKNOWLEDGMENTS I wish to thank Prof. Dr. Andy Schürr for his supervisionwhich has alwaysmademe feel lucky in both personal and technical ways, Prof. Dr. Bernhard Westfechtel for kindly accepting to serve the PhD thesis com- mittee and for his valuable comments on an earlier version of this document, Dr. Anthony Anjorin,Dr. Gergely Varró, and Lars Fritsche formany fruitful hours of co-desinging, co-developing, and co-authoring, all those who shared my PhD highs and lows at the university canteen in Darmstadt or during my time-constrained but intense visits in Istanbul, finally yet foremost, my parents Mehmet and Nursen Leblebici and my brother İhsan Metin Leblebici for their continuous and unconditional support, no mat- ter the distance and no matter the circumstance. You all made this work possible. Darmstadt, 2018 ABSTRACT Software development is a complex task. The success of a software project highly relies on the involvement of domain experts in the development process. In re- cent years, therefore, the field of software engineering has been striving to elevate the level of abstraction towards domain-specific concepts (instead of the computa- tion-oriented nature of classical programming languages). Model-Driven Engineer- ing (MDE), a novel software development methodology, lies at the heart of these efforts. In MDE, a model represents an abstraction of a system with one specific goal inmind.Hence, theMDE strategy does not only dealwith abstractions but also advocates the co-existence of relatedmodels capturing different aspects of the same system.While this supports separation of concerns, consistencymanagement between relatedmodels becomes a crucial challenge asmodels are changed throughout their life cycle. Two basic building blocks of consistency management are (i) consistency checking to indicate whether or to what extent two related models are consistent and (ii) consistency restoration to suitably handle discrepancies between models. To address consistency management tasks in a formally-founded manner, bidi- rectional transformations (BX) have been established as a research area. Among the diverse BX approaches, Triple Graph Grammars (TGGs) represent a prominent tech- nique with various implementations and industrial applications. In this setting, models are formalized as graphs and consistency is described as a grammar con- structing two consistent models together with a correspondence model. Consis- tency management tools are then automatically derived from this description. Current TGG approaches (and in fact also BX approaches in general) focus on consistency scenarioswhere only onemodel ismaintained by human intelligence at the same time and the other one is automatically updated by consistency restoration. Consistency management between two concurrently developed models, however, is not sufficiently supported as practical solutions for consistency checking are essentially missing. Strategies for consistency restoration, furthermore, range from heuristics to auxiliary model analyses which constitute the most complex and least understood part of TGGs. Despite sharing the same basic goal and the same formal foundation, it is difficult to exchange ideas amongst the different TGG approaches. In this thesis, therefore, we first establish consistency checking as a novel use case of TGGs. We identify search space problems in consistency checking and over- come them by combining TGGs with linear optimization techniques. Second, we propose a novel consistency restoration procedure that exploits incremental pattern matching techniques to address the intermediate steps of consistency restoration in a simplified manner. Furthermore, we present a TGG tool that implements our formal results and experimentally evaluate its scalability in real-world consistency scenarios. Finally, we report on an industrial project from the mechanical engi- neering domain where we applied this tool for maintaining consistency between computer-aided design and mechatronic simulation models. ZUSAMMENFASSUNG Softwareentwicklung ist eine komplexe Aufgabe. Der Erfolg eines Softwarepro- jektes ist in hohem Maße auf die Beteiligung der Domänenexperten im Entwick- lungsprozess angewiesen. Das Gebiet Softwaretechnik strebt daher höhere Ab- straktionsstufen mit Fokus auf domänenspezifische Konzepte an (anstatt der auf Rechnen orientierten Natur der klassischen Programmiersprachen). Model-Driven Engineering (MDE), eineneueSoftwareentwicklungsmethodik, liegt imMittelpunkt dieser Bestrebungen. In MDE stellt ein Modell eine Abstraktion eines Systems mit einem spezifischen Ziel dar. Folglich beschäftigt sich MDE nicht nur mit Abstrak- tionen, sondern befürwortet auch die Koexistenz verwandterModelle, die dasselbe System beschreiben. Während dies die Trennung der Zuständigkeiten unterstützt, wird Konsistenzverwaltung zwischen verwandten Modellen eine wichtige Heraus- forderung, da Modelle sich im Laufe ihres Lebenszyklus ändern. Zwei Grund- bausteine der Konsistenzverwaltung sind (i) Konsistenzprüfung, um zu bestimmen, ob oder inwiefern zwei verwandteModelle konsistent sindund (ii)Konsistenzwieder- herstellung, um Diskrepanzen zwischen den Modellen zu beseitigen. UmKonsistenzverwaltungmit formalenMitteln anzugehenwurden bidirektionale Transformationen (BX) als ein Forschungsfeld etabliert. Unter den BX-Ansätzen stellen Tripel-Graph-Grammatiken (TGGen) eine prominente Technik mit verschiede- nen Implementierungen und industriellen Anwendungen dar. Dabei werden Mo- delle als Graphen undKonsistenzbeziehungen als eineGrammatik beschrieben, die zwei konsistenteModellemit einemKorrespondenzmodell aufbaut.Werkzeuge zur Konsistenzverwaltung werden aus dieser Beschreibung generiert. Existierende TGG-Ansätze (sowie BX-Ansätze generell) setzen weitgehend vor- aus, dass gleichzeitig nur das eine Modell durch menschliche Intelligenz gepflegt und das andere durch Konsistenzwiederherstellung automatisch aktualisiert wird. Das Problem der Konsistenzerhaltung zwischen zwei konkurrierend gepflegten Modellen bleibt dagegen offen, da praktische Lösungen zur Konsistenzprüfung fehlen. Des Weiteren reichen die Strategien zur Konsistenzwiederherstellung von Heuristiken bis hin zu zusätzlichen Modellanalysen, die den unübersichtlichsten Teil der TGGen bilden. Das erschwert auch den Ideenaustausch zwischen unter- schiedlichen TGG-Ansätzen, obwohl diese dieselben Ziele und Grundlagen haben. In dieser Thesis wird erstens Konsistenzprüfung als ein neuer Anwendungsfall der TGGen eingeführt. Suchraumprobleme in Konsistenzprüfung werden identi- fiziert und durch lineare Optimierungstechniken bewältigt. Zweitens wird eine neue Prozedur zur Konsistenzwiederherstellung vorgestellt, die von Techniken zur inkre- mentellen Graphmustersuche Gebrauch macht, um Konsistenzwiederherstellung zu vereinfachen. Des Weiteren wird ein TGG-Werkzeug vorgestellt, das die formalen Ergebnisse umsetzt und dessen Skalierbarkeit mit realen Beispielen evaluiert wird. Zum Abschluss wird über ein Industrie-Projekt aus der Maschinenbau-Domäne berichtet, indemdiesesWerkzeugzurKonsistenzerhaltungzwischenModellenvon rechnergestütztem Konstruieren und mechatronischer Simulation benutzt wurde. CONTENTS 1 Overview and Motivation 1 1.1 Co-existing Models and Bidirectional Transformations (BX) . . . . . . . . 2 1.2 Challenges of BX in an MDE context . . . . . . . . . . . . . . . . . . . . 5 1.3 Stakeholders and Requirements . . . . . . . . . . . . . . . . . . . . . . . 10 1.4 Contributions and the Structure of the Thesis . . . . . . . . . . . . . . . 11 2 Fundamentals and Running Example 15 2.1 Graphs and Triple Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2 Triple Graph Grammars (TGGs) . . . . . . . . . . . . . . . . . . . . . . . 25 2.3 An Extended Consistency Specification for the Running Example . . . . . 30 2.4 Summary, Open Issues, and Existing Extensions to TGGs . . . . . . . . . 35 3 Consistency Checking with TGGs 39 3.1 Examples of Consistency Checking . . . . . . . . . . . . . . . . . . . . . 40 3.2 Consistency Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.3 Wrong Choices of Consistency Rule Applications . . . . . . . . . . . . . 51 3.4 Integer Linear Programming (ILP) Techniques . . . . . . . . . . . . . . . 54 3.5 Consistency Checking with TGGs and ILP . . . . . . . . . . . . . . . . . 56 3.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.7 Summary and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 80 4 Consistency Restoration with TGGs 83 4.1 Examples of Consistency Restoration . . . . . . . . . . . . . . . . . . . . 85 4.2 Forward Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.3 Wrong Choices of Forward Rule Applications . . . . . . . . . . . . . . . 98 4.4 Application Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.5 Marking-Complete Forward Rules . . . . . . . . . . . . . . . . . . . . . 102 4.6 A Consistency Restoration Procedure . . . . . . . . . . . . . . . . . . . . 106 4.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 4.8 Summary and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 123 5 Tool Support, Experimental Evaluation, and Practical Application 125 5.1 The Meta-Tool eMoflon . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 5.2 Experimental Evaluation of Consistency Checking . . . . . . . . . . . . . 128 5.3 Experimental Evaluation of Consistency Restoration . . . . . . . . . . . . 136 5.4 The GraTraM Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 5.5 Summary and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 148 6 Conclusion 151 B IBL IOGRAPHY 155 ACRONYMS API Application Programming Interface BX Bidirectional Transformations CAD Computer-Aided Design EMF Eclipse Modeling Framework HTML Hypertext Markup Language ILP Integer Linear Programming MDA Model-Driven Architecture MDE Model-Driven Engineering MOF Meta-Object Facility NAC Negative Application Condition OCL Object Constraint Language OMG Object Management Group PIM Platform-Independent Model PSM Platform-Specific Model QVT-R Query/View/Transformation - Relations TGG Triple Graph Grammar UML Unified Modeling Language XMI XML Metadata Interchange XML Extensible Markup Language xii 1 OVERV IEW AND MOT IVAT ION Since the appearance of first computers in the 1930s, the rapid increase in computing power of hardware has made it imaginable to tackle more and more complex tasks with computers. The difficulty of writing useful software meeting the complex requirements, however, has always been the limiting factor in exploiting computers. In the 1960s, even the term software crisis [111] was coined at the intergovernmental level referring to software products that are of low quality, inefficient, not meeting the requirements, or undelivered, and referring to software development processes that run over time and over budget. To this end, the then practically unknown field of software engineering has been established to apply systematic engineering methods to the development of software. Today, software engineering has a strong impact on our daily life as computers are employed in almost every context, be it a workstation, an industrial process, a medical treatment, a means of transportation, or a consumer good. The essence of software engineering is the division between hardware and soft- ware by introducing appropriate abstractions to deal with the increasing complexity. High-level programming languages provide a means for this abstraction where hu- man-readable source code is compiled to or interpreted as executablemachine code. Although such languages have made a significant contribution to the technological progress in the last decades, they still have a computation-oriented focus. Hence, the achieved level of abstraction is still not sufficient to incorporate non-developer domain experts (in general users of a software product) into the development pro- cess. Indeed, recent chaos reports [29] on information technology projects identify user involvement as the key factor for success and lack of user input as currently the most challenging problem. There are educational measures to bridge the gap between domain experts and software engineers including, e.g., integrating a specific field as secondary sub- ject into computer science courses (or vice versa), and even inherently combined courses such as business informatics or bioinformatics.While such interdisciplinary education can provide qualified workforce for the interface between software and its application domain, there is still a need for technological change (with regard to tools, languages, and formalisms) to facilitate software development with do- main experts. With this in mind, the field of software engineering has been striving to elevate the level of abstraction towards domain-specific concepts rather than computational details in recent years. The software development methodology Model-Driven Engineering (MDE) lies at the heart of these efforts. In an MDE context, a model is considered as an abstraction of a system built for a specific goal in mind and provides information only related to this goal [20, 23, 90]. 2 1 Overview and Motivation The most prominent and standardized set of guidelines for how to structure and organize models in MDE has been established by the Object Management Group (OMG)under the nameofModel-DrivenArchitecture (MDA) [110]. TheMDAstrategy comes along with a set of standards enjoying broad acceptance such asMeta-Object Facility (MOF) [107] to definedomain-specificmodeling languages,UnifiedModeling Language (UML) [141] to create object-oriented design models, and XML Metadata Interchange (XMI) [149] to persist and exchange models in a common format. 1.1 Co-existing Models and Bidirectional Transformations (BX) Defining a model as intended for a specific goal, the MDA strategy strongly ad- vocates separation of concerns via co-existing models for a system instead of cap- turing the whole system as one global model. Depending on the abstraction level of models, MDA distinguishes between the two basic concepts of Platform-Indepen- dent Models (PIM) and Platform-Specific Models (PSM). While the term platform here might be associated with different abstraction levels such as hardware architecture, operating system, or programming language, PIMs and PSMs of a system remain in all cases relative to each other in the following sense: PIMs abstract from tech- nological details of the platform and represent business logic of the system while PSMs stipulate how to realize the system in the respective platform. Hence, PIMs are rather expected to integrate domain experts with less software engineering knowledge into the development process while PSMs are executable artifacts (or should directly lead to executable artifacts). Therefore, the MDE vision is not only concerned with introducing higher abstraction levels but also suggests to regard everything as a model consequently, e.g., even source code written in a programming language that exists longer than the MDE vision itself is a model. Figure 1.1 shows an excerpt of different PIMs andPSMs that can be involved in the development of a software system. The models cover overall three different goals, namely documentation, implementation, and testing of the system. At the highest abstraction level, i.e., at the level of PIMs, there exists a documentation model that abstracts from the format of the documentation. Similarly, an object-oriented design in UML as well as a test suite model abstract from the programming language used for implementation and testing, respectively. Considering the level of code as platform in our concrete case, the models at the PSM level represent details concerning the choice of the programming language for each individual goal. At the documentation side, for example, the Hypertext Markup Language (HTML) is chosen to be the language of documentation artifacts, and the HTML model (basically a PSM) organizes the documentation data in the form of HTML elements (called HTML tags). Analogously, assuming Java as the programming language of the project, the Java model in the middle captures the object-oriented design with Java language features and possibly limitations in mind (the same applies to the Java model for testing at bottom right). Co-existence of differentmodels representing the same system facilitates amodu- lar development process but requires appropriate consistencymanagement asmodels are due to change during their life cycles. In its most broad interpretation with re- gard to the goals of this thesis, consistency of two models refers to a state where no 1.1 Co-existing Models and Bidirectional Transformations (BX) 3 Documentation  Model Object‐oriented Design in UML Test Suite Model HTML Model Java Model (Impl.) Java Model (Tests) PIMs PSMs Documentation Implementation Testing Figure 1.1: Exemplary PIMs and PSMs in a software engineering project contradiction exists between the models. In other words, consistent models agree on the information they are meant to represent in common. Consistency becomes a relevant challenge especially in large software projects when different domains are involved with their own models, and models are developed concurrently by their owners. In general, consistencymust be addressed vertically between different abstraction levels, e.g., between a PIM and a PSM from the same domain, as well as horizontally, e.g., between two PIMs or between two PSMs from different domains. The bidirectional arrows in Figure 1.1 represent the choice ofmodel pairs between whicha tool-supported consistencymanagement isdesired for the concrete scenario (the black-filled arrow will be further investigated in detail as running example throughout the thesis). The vertical arrows indicate that two models of different abstraction levels (but sharing the same goal) must conform to each other in all cases while horizontal arrows represent a consistency management strategy that considers implementation as the central point of consistency, i.e., documentation as well as test models must be kept consistent to implementation models. It should be mentioned that some of the horizontal arrows in Fig. 1.1 can be realized indirectly in practice, e.g., horizontal consistency at the PSM level can be induced via horizontal consistency at the PIM level (or vice versa) together with vertical consistency. The choice of arrows that must be realized as a distinct component or induced by other arrows is case-specific and depends usually on the internal structures of models or the selected development process. The MDA specification already exhibits awareness of consistency challenges in MDE from the very beginning and proposes to applymodel transformations, i.e., pro- ducing an output model from a given input model in an automated way, to keep related models consistent. Model transformations thus play a central role in MDA and systematic development of model transformations has been actively investi- gated with various approaches differing in their supported features, capabilities, and scopes (cf., e.g., [13, 34, 132] for classifications and comparisons of different approaches). If models are intended for specific goals and maintained by different stakeholders, moreover, model transformations are not one-way streets but must be bidirectional, i.e., executable in either direction. In line with this observation, a cross-disciplinary research community named bidirectional transformations (BX) [35] has been established whose research interests include (but are not limited to) con- 4 1 Overview and Motivation sistency tasks in MDE. Providing appropriate BX formalisms, languages, and tools has been the centre of BX-related research in recent years. The MDA specification indeed introduces the BX language Query View Transfor- mation - Relations (QVT-R) [120]. Unlike the aforementionedMDA standards such as MOF, UML, and XMI, however, QVT-R has not gained a broad acceptance and im- plementations are scarce [134]. Arguably, describing and maintaining consistency between two models is one of the most complex tasks in the MDE vision and the level of formalization and preciseness in the QVT-R specification does not meet the requirements of this complexity. Recently, ambiguities in the standard and the resulting conformance issues of the respective tool support for QVT-R have been stressed in [146]. The observations from the investigated case studies reveal that BX development with QVT-R turns out to be a trial-and-error process when cop- ing with these problems. Consequently, developers might end up with complex solutions while BX research generally strives to allow for compact and readable solutions as compared to, e.g., a manual implementation. Nevertheless, a lot of formal work to address BX in an MDE context has been done in the field of graph grammars. In this setting, the content of a model is formalized as a graph, and computations on a model (e.g., analyses, manipu- lations, or consistency management) are captured as graph grammar rules. Espe- cially Triple Graph Grammars (TGGs) [128], a particular dialect of graph gram- mars dedicated to BX, have been successfully used in several industrial MDE projects [5, 21, 56, 67, 126], and present a prominent alternative to QVT-R with various implementations [46, 71, 83, 86, 96]. The ancestors of TGGs are pair gram- mars [119] whose central idea is twofold: (i) provide a pair of grammars that build up together consistent pairs of source and target models, and (ii) automatically de- rive consistency tools from this specification. Refining this idea, TGGs introduce a third grammar that builds up a third graph representing the correspondence links between consistent pairs. In the universe of BX languages, TGGs belong, similar to QVT-R, to declarative approaches where the BX specification only describes what consistency means (for the particular BX scenario) but not how it is maintained. Both approaches, however, particularly differ in the way of describing consistency. A QVT-R specification, on the one hand, consists of relations in the form of logical predicates and consistency of twomodels means the satisfaction of these predicates. A TGG, on the other hand, is a graph grammar constructing consistent models and consistency is defined as membership to the language induced by the grammar. Both approaches, there- fore, are commonly distinguished as relation-based and grammar-basedwithin the BX landscape. A further representative group of BX languages is given by those based on bidirectional programming as proposed in, e.g., [69, 87, 153]. In this setting, consistency maintenance is programmed in one direction and the corresponding program in the reverse direction is automatically induced. Bidirectional program- ming languages offer more fine-grained control over consistency maintenance as compared to declarative approaches but provide less abstraction for the necessary computations thereof. Bidirectional programming, furthermore, generally tends to be more restrictive with regard to the supported class of consistency scenarios (e.g., primarily supporting one-to-one mappings betweenmodels such that the program 1.2 Challenges of BX in an MDE context 5 can be reversed unambiguously). TGGs, on the contrary, allow for specifying and maintaining consistency in a flexible manner as we shall demonstrate with our examples throughout the thesis. 1.2 Challenges of BX in an MDE context The question of how to deal with consistency between related models is strongly coupled with how the models are created andmaintained. Current BX approaches, in particular TGG approaches, focus on the case where always one of the models is maintained by human intelligence at the same time and deriving the other one is only routine work (and the goal of BX is to automate this routine work). Such a scenario is depicted in Figure 1.2 where Model A and Model B refer to two related models (e.g., two models at the ends of any bidirectional arrow in Figure 1.1). The rectangles with gearwheels in the middle represent individual runs with a BX tool (the dashed gray arrows indicate the input and the output for each run). The v.X suffixes at each model indicate its version (e.g., v.1 for the first version). Incrementing the version of a model is either the result of a transformation or is done by human intelligence. The latter case is depicted via black arrowswith a stick person. Finally, white filling of models indicates that they are created by human intelligence while gray-filled ones are automatically derived via BX. Model A v.1 Model B v.1 Model B v.2 Model A v.2 Model A v.3 Model B v.3 Figure 1.2: Consistency management as addressed by current BX approaches in MDE In Figure 1.2, the first version of Model A is created by human intelligence as a first step and is transformed to the first version ofModel B. After changingModel B to its secondversion (again byhuman intelligence), the secondversion ofModelA is derived via a transformation in the reverse direction. Analogously, further changes are done on one of the models, and the other one is derived. In most of the BX approaches, after changing a model with human intelligence, the derivation of the other one is done as an incremental update, i.e., the previous version of the derived model is taken into consideration and only changes are propagated incrementally instead of an entire transformation from scratch. BX tools generally keep auxiliary data for the consistency history (e.g., traces between models) to calculate what is to be done for updating a model according to the changes on the other side. 6 1 Overview and Motivation In a scenario as depicted in Figure 1.2, there is an assumption that does not necessarily hold in many real-world consistency tasks: Only one of the models is available at the beginning and the first version of the other one is to be automatically derived. Further transformations by propagating changes in one of the models to the other rely on this initial transformation as the starting point. This requires working in a BX-supported environment and applying BX from the very beginning starting with one model. Applying BX, however, can be a decision taken in an intermediate phase of the development process where both models are existent and have already reached an advanced state. Differentmodels indeed belong to different stakeholders and concurrent develop- ments speed up the development process. Figure 1.3, therefore, depicts the vision of a more general consistency management support eliminating these assumptions and defining the goals of this thesis. This time, the starting point of consistency management allows twoalready existingmodelswhich are concurrently developed. In the initial start, consistency management takes both models as input (indicated as v.1) and derives their updated versions (v.2) yielding a consistent state. Further runs of change propagation from one model to the other (as previously shown in Figure 1.2) are again possible upon the results of this initial start (e.g., changing Model A to its v.3 and deriving v.3 of Model B). Model A v.1 Model B v.1 Model B v.2 Model A v.2 Model A v.3 Model B v.3 Model B v.4 Model A v.4 Figure 1.3: Consistency management with two existing models as the starting point In the following, we explore in more detail (on the level of model elements) what intermediate steps are to be taken into account to realize a BX vision as depicted in Figure 1.3 (these intermediate steps shall define the individual subtasks we address throughout the thesis). For this purpose, an excerpt of the consistency between Java and UML models (the black-filled bidirectional arrow in Figure 1.1) serves as our running example. In particular,we focus on Java andUMLclasseswith their methods. Bothmodels represent a system from an object-oriented viewwhere classes define a template for runtime objects. In fact, there exist UML tools [140] providing their own mechanisms for consistency with Java code but they operate, in line with the BX conception in Figure 1.2 starting with one model. Consistency 1.2 Challenges of BX in an MDE context 7 management for concurrently developed models, however, are not addressed by these custom-tailored BX solutions either. In Figure 1.4, an exemplary pair of Java (left) and UML (right) models is depicted both representing a class List. The Java model has two remove methods, one with a parameter of type Object and the other one with a parameter of type int. The UML model has only one remove method (with a parameter of type Object) and, additionally, another method named add (again with a parameter of type Object). Note that these models represent v.1 on both sides when transferred to our BX conception in Figure 1.3. List + remove(o: Object): boolean + add(o: Object): boolean public class List { public boolean remove(Object o){...} public Object remove (int i){...} } Figure 1.4: An exemplary pair of Java (left) and UML (right) models There are obvious inconsistencies between the twomodels in Figure 1.4. The Java model has an additional removemethodwhichmight have been introduced by Java programmers as a result of their experience. Furthermore, the Java model misses the add method as compared to the UML model, i.e., a feature planned on the higher abstraction level has not been reflected in the code level yet. Nevertheless, the models have also consistent parts, e.g., they agree on a List class and a remove method with an Object parameter. In order to tackle consistency issues in such a case, there are two basic steps that must be supported by a fully-fledged BX approach: (i) consistency must be checked between the two models to detect which parts of the models already correspond to each other (and which parts violate consistency), and subsequently, (ii) consistency must be restored depending on the decisions of stakeholders, e.g., by applying model transformations that propagate or simply delete inconsistent parts. Consistency checking and restoration between concurrently developed models are illustrated in Figure 1.5 based on this exemplary model pair. In addition to the models, the bidirectional dashed arrows in the middle indicate correspondences between model elements. New correspondences and model elements created in individual steps are depicted green with an additional «new» markup. In a first step in Figure 1.5, consistency checking is performed and the corre- spondences between conforming elements are created, i.e., between the List classes and their remove methods with Object parameter (we omit the correspondence arrows on the parameter level to avoid diagram clutter). As a consequence, the remaining elements can be regarded as violating consistency and must be handled to restore consistency. One of the possibilities would be to delete these elements and to reduce the models to their consistent parts which, however, is a very strict and unsatisfactory solution. In Figure 1.5, a strategy is chosen that transforms the 8 1 Overview and Motivation List + remove(o: Object): boolean + add(o: Object): boolean public class List { public boolean remove(Object o){...} public Object remove (int i){...} } List + remove(o: Object): boolean + remove(i: int): Object + add(o: Object): boolean public class List { public boolean remove(Object o){...} public Object remove (int i){...} } List + remove(o: Object): boolean + remove(i: int): Object + add(o: Object): boolean public class List { public boolean remove(Object o){...} public Object remove (int i){...} public boolean add(Object o){...} } Consistency checking Consistency restoration (Java to UML) <> <> <> <> <> <> Consistency restoration (UML to Java) Figure 1.5: Consistency checking and restoration on the exemplary model pair remaining Java elements to the UML model and vice versa. Note that, from a con- sistency restoration point of view, we consider the remaining elements as additions to the consistent portions of themodels (detected in the initial consistency checking run) and propagate these additions to the other side. In the final state, both models are consistent and contain all three methods. This state of the models represents their v.2 when transferred to the BX conception in Figure 1.3. Our understanding of consistency restoration, moreover, is not only limited to the remaining elements after a consistency checking run. We also handle user changes on the models in the same manner whereas user changes possibly do not only involve additions but also deletions. In Figure 1.6, a further run of consistency 1.2 Challenges of BX in an MDE context 9 restoration is demonstrated starting with the consistent state from Figure 1.5. This time, the user deletes one of the removemethods and adds a sizemethod in the Java model. The propagation of these changes with consistency restoration accordingly deletes the corresponding remove method in the UML model (the deleted method and the correspondence are grayed-out). Furthermore, a size method is created in the UML model (together with its correspondence). List + remove(o: Object): boolean + remove(i: int): Object + add(o: Object): boolean public class List { public boolean remove(Object o){...} public Object remove (int i){...} public boolean add(Object o){...} public int size(){...} } Consistency restoration (Java to UML) List + remove(o: Object): boolean + remove(i: int): Object + add(o: Object): boolean + size(): int public class List { public boolean remove(Object o){...} public Object remove (int i){...} public boolean add(Object o){...} public int size(){...} } <> <> Figure 1.6: A further consistency restoration run upon user changes To sum up, this thesis represents the following two building blocks for consis- tency management in MDE: • Given two concurrently developedmodels (possiblywithout any prior consis- tency management support until the models reach an advanced state), a con- sistency checking run indicates which parts of the models are in conformance and which parts require action for consistency restoration. Furthermore, cor- respondences are created representing the relationships between individual elements of the two models. • Given two consistent models (together with correspondences), a consistency restoration run propagates additions and/or deletions in one of the models to the other. Correspondences are updated in the process as well. 10 1 Overview and Motivation 1.3 Stakeholders and Requirements Throughout this thesis, consistency checking and restoration in an MDE context are discussed and formal as well as practical results based on TGGs are proposed. After having illustrated the targeted BX vision in an abstract manner (Figure 1.3) and via small concrete examples (Figure 1.5 and 1.6), we are now ready to identify stakeholders in BX and their requirements. In Figure 1.7, the functional requirements are summarized via main use cases of different stakeholders. We distinguish between the following three different environments (and three groups of stakeholders): • Consistency tool, usedbymodel owners to check and restore consistencybetween their related models • Meta-tool, i.e., a tool for building (consistency) tools, used by consistency tool developers • BX approach, used by meta-tool developers (e.g., BX researchers), providing the foundations for the meta-tool. BX Approach Meta-Tool Consistency Tool Check consistency between models Restore consistency between models Model Owner Develop consistency tool Consistency Tool Developer Develop meta- tool Meta-Tool Developer Figure 1.7: Stakeholders and their functional requirements in a BX landscape Obviously, model owners are clients of consistency tool developers while con- sistency tool developers are clients of meta-tool developers. Although we consider three different environments with clear boundaries here, some of the stakeholder roles can possibly belong to the same person, e.g., when model owners develop a consistency tool for their own needs, or when meta-tool developers provide indi- vidual consistency tools using their own techniques. In all cases, solid theoretical foundations and available tool support for BX are important for all stakeholders, and define the goals of BX-related research as well as this thesis. Besides the functional requirements, we demand the following non-functional requirements that are crucial for a scientific underpinning of a BX approach: • Scalability: We have a pragmatic scalability demand from a tooling point of view and define it as the capability of dealing with industry-sized and re- 1.4 Contributions and the Structure of the Thesis 11 al-world scenarios of consistency checking and restoration. In this regard, the proposed (and implemented) procedures for consistency checking and restorationmust consumeacceptable runtimewith typical hardware resources. • Formal properties:A consistency checking and restoration approachmust guar- antee correctness. This means that, given a consistency specification and two models (that are not necessarily consistent), the outcomemust be a consistent pair of models. While guaranteeing this for all kinds of consistency require- ments cannot be feasible in general (e.g., due to scalability issues), the lim- itations and compromises for scalability must be defined clearly. Moreover, it must be possible to examine at specification time, whether a correct result can be guaranteed for a given consistency specification. • Expressiveness: Consistency checking and restoration in BX relies on a con- sistency description written in the respective BX language. Hence, the BX language must be expressive enough to describe consistency for exactly those model pairs that are meant to be consistent. Otherwise, consistency tool de- velopersmust resort to hand-crafted pre- and post-processing ofmodels to re- duce the complexity of BX. This, however, distributes the focus of consistency tools over different components and reduces the scope of formal guarantees. • Usability: The ease of use of a BX meta-tool must be provided in at least two different aspects: First, appropriate language editors must facilitate produc- tivity when specifying consistency with a BX language. Second, execution of consistency checking and restoration based on a consistency specification must be possiblewith little effort (e.g., via included librarieswith entry points or graphical user interface). Ease of execution has also an arguable impact on the usability of consistency tools developed with the meta-tool. • Validation: The provided BX theory and tool support for consistency checking and restoration must be used and validated in different application scenarios. On the one hand, academic “toy examples” that are compact but exhibit non-trivial consistency challenges must be used to validate capabilities of a BX approach. On the other hand, this must be complemented via real-world and industrial case studies where involved models as well as consistency specifications are larger. 1.4 Contributions and the Structure of the Thesis This thesis is mainly concerned with consistency checking and restoration based on TGGs and its respective tool support. We first discuss TGGs as a set of grammar rules that build up consistent pairs of models together with correspondences. The concrete contributions based on this formalism are: • Contribution I:We formalize consistency checking with TGGs. We first oper- ationalize TGG rules for consistency checking, i.e., consistency checking rules are derived that do not build up models but take existing ones as input and detect their consistent parts. Subsequently,we identify challengeswith regard 12 1 Overview and Motivation to search space involved in using these rules to detect consistent parts, and in- corporate optimization techniques into TGGs to tackle these challenges. Finally, we exploit these optimization techniques (i) to define sufficient and necessary conditions for consistency between two models and, in case of inconsistency, (ii) to detect the largest consistent portions of models. • Contribution II:We formalize consistency restorationwith TGGs that can op- erate upon consistency checking results.Analogously to consistency checking, we operationalize TGG rules to consistency restoration rules that take one of the models as input and update the other one consistently. We again identify and clear search space problems (resulting from wrong choices of rules that can lead consistency restoration to a dead end before consistency is restored). We discuss sufficient conditions (i) to avoiddead ends at rule application time and (ii) to state clearly under which circumstances a correct result of consistency restoration can be guaranteed. • Contribution III: We consider the realization of consistency checking and restoration from a tool architecture point of view and provide procedures that operate with the aforementioned operational rules. Having a formalism based on graphs and graph patterns in case of TGGs, we propose to use incremental pattern matching techniques in these procedures. An incremental pattern matcher maintains and reports patterns in a host graph where our procedures are solely reactions to these reports. This yields a noticeable sim- plification for a newgeneration of TGG tools as additional analyses at runtime are outsourced to calculate necessary steps when checking and restoring con- sistency. Furthermore, this enables TGGs to profit directly from current and future progress in the scalability of incremental pattern matchers. • Contribution IV: We present a meta-tool whose implementation is based on the formal results and tool architecture from the first three contributions. Besides investigating our running example, we report on another industrial consistency project in the context of computer-aided engineering where our meta-tool is used to develop a tool for consistency checking and restoration between technical drawings and their respective mechatronic simulations. We furthermore conduct experiments on different consistency scenarios to evaluate the scalability of our approach, and provide tool comparisons in some cases where solutions with other tools are available. Table 1.1 relates these contributions to the functional and non-functional require- ments. It can be observed that the contributions I-III are mainly foundational and address developing a meta-tool with formal properties in the first place. This is complemented with the provided meta-tool based on the results and its evalua- tion (contribution IV) allowing users to develop their own consistency tools with a validated and usable approach. In all of the contributions, consistency checking and/or restoration are of the uttermost importance. Finally, expressiveness of TGGs is out-of-scope for this thesis and is a current research topic with open questions concerning new language constructs and their operationalization. 1.4 Contributions and the Structure of the Thesis 13 Functional Requirements Non-functional Requirements Check Cons. Restore Cons. Develop Cons. Tool Develop Meta-tool Scalability Formal Properties Expr. Usability Validation Contr. I X X X Contr. II X X X Contr. III X X X X X Contr. IV X X X X X Abbreviations: Contr. = Contribution | Cons. = Consistency | Expr. = Expressiveness Table 1.1: Relation between contributions and requirements Having so far introduced and motivated our BX vision consisting of consistency checking and restoration, the rest of the thesis is organized as follows: • The following Section 2 provides the necessary formal background for TGGs to understand the subsequent sections. Furthermore, our running example is introduced focusing on the consistency between Java code and UML class diagrams. The subsequent two sections form our formal contributions: • Section 3 represents our consistency checking approach where we combine TGGs with linear optimization techniques. Our main formal results are Theo- rem1andTheorem2 stating a sufficient conditionor a sufficient andnecessary condition, respectively, to conclude consistency of two models. • Section 4 represents our consistency restoration approach where we this time combine TGGs with incremental pattern matching techniques. Our main for- mal results are givenbyAlgorithm2,whichprovides a consistency restoration procedure, and its formal guarantees including termination and correctness stated in Theorem 3 and Theorem 4, respectively. Both Section 3 and Section 4, furthermore, conclude with their own discussions of related work. Our practical contributions are discussed subsequently: • Section 5 represents our meta-tool implementing the formal results, reports on an industrial project where we have applied TGGs for consistency man- agement in the computer-aided engineering domain, and, finally, provides an experimental and quantitative evaluation of our meta-tool. Section 6, finally, concludes the thesis and discusses future work. 2 FUNDAMENTALS AND RUNNING EXAMPLE This section introduces foundational aspects which are necessary to understand how models are structurally organized and discusses what consistency means in a graphgrammar-basedBXapproach.We strive to impart intuitionswith examples as well as formalizations to a sufficient extent. In our formal statements, we highly use the category theoretical foundations of graphgrammars (inparticular as introduced in [42]) and make our own adoptions and simplifications to capture all what we need on the way of formalizing consistency of models. Before delving into the formal part of our fundamentals, it is important to under- stand what MDE (and in particular the MDA strategy) prescribes from a technical point of view to structure models. While models are the centre of attention in an MDE context, the structure of a model is defined by ameta-model. The prefix “meta” here (in analogy to its previous usage in meta-tool) is used for self-referencing of a term and thus leads us to the notion of a model for models in this concrete case. A meta-model basically resorts to basics of object-oriented analysis and defines concepts with their allowed relations to construct a valid model. Considering our running example, a Java meta-model should adapt the Java language specification [75] to the technical space of MDE and define concepts such as classes, methods, parameters, and their relationships (e.g., classes can have methods and methods can have parameters). Analogously, a UML meta-model should do the same for UMLmodels. In this thesis, we get our inspiration from the MDE tool MoDisco [25] for the Java meta-model and from the OMG standard [141] for the UMLmeta-model.We also introduce a correspondencemetamodel between these two as our ultimate goal is to deal with consistency of two models together with their correspondences. Figure 2.1 depicts excerpts from the Java (left) and UML (right) meta-model connected by a correspondence meta-model (middle). In the Java andUMLmeta-model in Figure 2.1,we only focus on classifiers (which can either be classes or interfaces) and their methods with parameters. While all of these concepts have a name property represented by a string value, parameters (in Java as well as in UML) additionally have a pos property represented by an integer value indicating the position of a parameter in the parameter list of the containing method. For brevity, we omit the typing information of methods and parameters. Concepts such as packages or class fields, furthermore, are omitted as well. The cor- respondencemeta-model in themiddle, moreover, maps the concepts from the Java and UML meta-model (C2C for classifier-to-classifier, M2M for method-to-method, and P2P for parameter-to-parameter correspondences). While the Java and UML meta-model seem identical on the depicted concepts, they differ in representing inheritance relations between classifiers: In the Java 16 2 Fundamentals and Running Example JClassifier JInterface JClass JNamedElement name: String JMethod JParameter method 0..* parameter 0..* superInterface 0..* superClass 0..1 UMLClassifier UMLInterface UMLClass UMLNamedElement name: String UMLMethod method 0..* parameter 0..* UMLParameter general  0..* contract 0..* P2P M2M C2C pos: int pos: int Figure 2.1: Excerpts from the Java metamodel (left), the UML metamodel (right), and the correspondence metamodel (middle) meta-model, a classmay have atmost one super class (note the superClass reference with 0..1 multiplicity) while classes as well as interfaces may have an arbitrary number of super interfaces (note the superInterface reference with 0..* multiplicity). This reflects the restrictions in Java with respect to multiple inheritance (inheriting frommultiple classes is not allowed). TheUMLmetamodel, on the contrary, ismore relaxed regarding inheritance relations and allows arbitrarily many inheritance relations between classifiers (be it a class or an interface) via the general reference with 0..* multiplicity. A second kind of inheritance relation from a class to an interface is represented by the contract reference (again with 0..* multiplicity). In general, meta-models focus on defining concepts formodels but not how these concepts are encoded in an actual representation of models. As already shown in Figure 1.4 and 1.5, Java models are encoded in the textual Java syntax while UML models have a visual notation. This is referred to as concrete syntaxwhile the entirety of concepts defined in a meta-model describes an abstract syntax (which “abstracts” from the concrete representation). Abstract syntax, moreover, is generally visual- ized in the well-known notation of object diagrams [141]. As from now on, we normalize most of the future figures using abstract syntax when depicting models (and go back to the concrete syntax in certain cases where this increases the read- ability of illustrations). Figure 2.2 depicts an excerpt of the consistent model pair in Figure 1.5 and their correspondences in the abstract syntax. The Java model (left), the UML model (right) as well as the correspondence model (middle) instantiate the respective meta-models in Figure 2.1. In the rest of this section, we formalize triples of (meta-)models as triples of graphs and subsequently define consistency as a grammar over such triples. We first focus on rather simple examples and then extend our consistency specification. Finally, we summarize this section and discuss open issues as well as possible extensions for future with regard to the consistency specification. 2.1 Graphs and Triple Graphs 17 : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 : JMethod name = "remove" : JParameter name = "i" pos = 0 method method parameter parameter : UMLClass name = "List" : UMLMethod name = "remove" : UMLParameter name = "o" pos = 0 method :P2P : UMLMethod name = "remove" : UMLParameter name = "i" pos = 0 :M2M :M2M :P2P :C2C method parameter parameter Figure 2.2: A triple of Java model (left), UML model (right), and correspondence model (middle) conforming to the triple of meta-models in Figure 2.1 2.1 Graphs and Triple Graphs The following definitions and statements are adopted from the category theoretical foundations of graph grammars [42]. To formalize (triples of) models and their consistency, we shall start with defining the most basic mathematical structure that comprises “objects” and “mappings” (morphisms) between them, namely a category. Definition 1 (Category). [42] A category C � (ObC, MorC, ◦) consists of: • a class ObC of objects • for each pair of objects X, Y ∈ ObC, a set MorC(X, Y) of morphisms, where each m ∈ MorC(X, Y) is denoted as m : X → Y • for all objects X, Y, Z ∈ ObC, a composition operation ◦ : MorC(Y, Z) ×MorC(X, Y) → MorC(X, Z) such that the following properties are fulfilled: 1. Associativity. For all objects X, Y, Z, W ∈ ObC, and all morphisms f : X → Y, g : Y → Z, h : Z→W , it holds that (h ◦ g) ◦ f � h ◦ (g ◦ f ) 2. Identity. For all objects X, Y ∈ ObC and morphisms f : X → Y, there exists a morphism idX : X → X and a morphism idY : Y → Y such that f ◦ idX � f and idY ◦ f � f . Hence, a category is an abstract representation of “structured things” and their morphisms that serve as mappings. This applies independently of the concrete structure of the objects and forms a means for using and stating mathematical results for different kinds of objects in a common way. 18 2 Fundamentals and Running Example Throughout this thesis, the category Sets whose objects are sets and whose morphisms are total functions will be the most basic category that shapes the underlying structure for further categories. When considering sets as categorical objects, the probablymost noticeable difference to the usual definition of sets is how functions are defined (cf. Example 2 in [118]): In a categorical setting, the output set of a function is not defined by its range (i.e., not by the elements that are a value for some elements in the input set). Instead, a function is a mapping from an input set to an output set (and the range of the function does not necessarily cover all elements in the output set). Definition 2 (Sets). Sets � (ObSets, MorSets, ◦) is defined as: • a class ObSets of finite sets • total functions MorSets • for all sets X, Y, Z ∈ ObSets, the function composition operation ◦ : MorSets(Y, Z) ×MorSets(X, Y) → MorSets(X, Z). Fact 1 (The category Sets). Sets forms a category where the associativity property is induced by the asso- ciativity of function composition, and identities are the identity functions. As the definition of a category (Definition 1) and its instantiation (Definition 2) might imply, categories focus on morphisms and their properties rather than on objects. We shall exploit this “mapping-oriented” flavor of categories in at least two critical ways in this thesis: First, if our goal is to set two models in relation for consistency purposes, relations can be captured via morphisms. As exemplified in Figure 2.1 on the meta-model level and in Figure 2.2 on the model level, corre- spondence structures set the other two structures in relation in our case. Second, existence or absence of morphisms allows us to draw conclusions about the consis- tency of models (indicating what actions to take when dealing with consistency). Furthermore, the following types of morphisms are of special interest: Definition 3 (Monomorphism, Epimorphism, Isomorphism). [42] Given a category C, 1. A morphism m : Y → Z ∈ MorC is called a monomorphism, if for all morphisms f , g : X → Y ∈ MorC, it holds that m ◦ f � m ◦ g implies f � g 2. A morphism e : X → Y ∈ MorC is called an epimorphism, if for all morphisms f , g : Y → Z ∈ MorC, it holds that f ◦ e � g ◦ e implies f � g 3. A morphism i : X → Y ∈ MorC is called an isomorphism if there exists a morphism i−1 : Y → X such that i ◦ i−1 � idY and i−1 ◦ i � idX . 2.1 Graphs and Triple Graphs 19 Example 1. An intuition in Sets already suffices in this thesis to understand these special types of morphisms. In Sets, injective functions are monomor- phisms. As an injective function m maps all elements in its input set distinctly to the elements of the output set, m ◦ f � m ◦ g can only hold if f � g. Further- more, surjective functions are epimorphisms. As a surjective function e covers all elements in the output set, f ◦ e � g ◦ e can only hold if f � g. Bijective functions, finally, are isomorphisms. Remark 1. In Sets, bijective functions are those functions which are injective and surjective. Hence, a morphism which is a monomorphism as well as an epimorphism is an isomorphism. This, however, does not necessarily hold in the most general setting of the category theory as i−1 might not exist for a morphism i in a category (cf. Remark 2.14 in [42]). Hence, we do not state in Definition 3 isomorphism as being monomorphism and epimorphism at the same time. As our structures shall entirely be derived from Sets throughout this thesis, nevertheless, this intuition is indeed valid for our purposes. Inour context, graphs are the fundamental structures representing (meta-)models. A graph is typically defined as a quadruple (E, V , s, t) consisting of • a set E of edges, • a set V of vertices, • a function s : E→ V assigning each edge a source vertex, • a function t : E→ V assigning each edge a target vertex. While this is themost common definition of a graph that can be found in the liter- ature, an alternative definition is to define graphs as functors. A functor is basically a mapping between two categories remaining compatible with compositions and identities. Functors allow us to build new (and, roughly spoken, more complex) cat- egories from existing ones, e.g., graphs from sets and triple graphs from graphs. To capture all relevant structures in a unified manner, we use functors consequently from the very beginning (starting with the definition of a graph) and exemplify them for the reader who is not familiar with this concept. Definition 4 (Functor). [42] Let C and D be two categories. A functor F : C→ D is given by two mappings FOb : ObC → ObD and FMor : MorC(X, Y) → MorD(FOb(X), FOb(Y)) with X, Y ∈ ObC, such that the following properties are fulfilled: 1. For all morphisms f : X → Y and g : Y → Z ∈ MorC, it holds that FMor(g ◦ f ) � FMor(g) ◦ FMor( f ) 2. For each object X ∈ ObC, it holds that FMor(idX) � idFOb(X). 20 2 Fundamentals and Running Example Example 2. The classical definition of a graph, i.e., the quadruple (E, V , s, t) mentioned above, states nothing but two sets and two total functions in the same direction between these sets. Such a structure can be constructed as a functor which maps a “small” category O1 ⇒ O2, i.e., a category consisting of the two objects O1 and O2 and two morphisms from O1 to O2, to the category Sets. The result of this mapping consists of two sets and two functions in the same direction exemplified in Figure 2.3 using a Java model. At the top, the small category O1 ⇒ O2 is depicted, and at the bottom two sets (E and V) with two functions (s, t : E→ V). The functor (represented with bold and gray arrows) maps O1 and O2 to E and V , respectively, while the two morphisms O1 ⇒ O2 are mapped to the two functions s and t. Though not depicted explicitly, the functor furthermore maps the id morphisms idO1 and idO2 to the id functions idE and idV , respectively, fulfilling thus the two conditions stated over morphisms in Definition 4. The set E consists of two elements, namely the method and parameter edges, where the set V consists of three elements representing a class List, a method remove, and a parameter o. The function s (represented with dashed arrows) assigns the class List to themethod edge and themethod remove to the parameter edge as their sources. Furthermore, the function t (represented with dotted arrows) assigns the method remove to the method edge and the parameter o to the parameter edge as their targets. O1 O2 method parameter : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 E V s t Figure 2.3: A graph considered as a functor from O1 ⇒ O2 to Sets To sum up, a functor (O1 ⇒ O2)→ Sets captures all what a quadruple (E, V , s, t) with the required properties stated before intends to describe for defining a graph. While a functor is one single mapping between two categories (that leads to, e.g., one single graph as depicted in Figure 2.3), all functors together with their 2.1 Graphs and Triple Graphs 21 compatiblemorphisms yield a functor category. This leads us to the categoryGraphs and subsequently TripleGraphs. Definition 5 (Functor Category). [42] F(X) F(Y) G(X) G(Y) F(f) G(f) αX αY Given two functors F, G : C → D, a natural transformation α : F ⇒ G is a family of morphisms α � (αX)X∈ObC with αX : F(X) → G(X) ∈ MorD, such that for all morphisms f : X → Y ∈ MorC the diagram to the right commutes, i.e., αY ◦ F( f ) � G( f ) ◦ αX . A functor category [C,D] is given by all functors F : C → D as the objects and by the natural transformations as the morphisms. Composition of two natural transformations α : F ⇒ G and β : G ⇒ H is the componentwise composition in D, i.e., β ◦ α � (βX ◦ αX)X∈ObC . Identities are identical natural transformations given by componentwise identities in D. Definition 6 (The Category Graphs). The category Graphs of graphs is the functor category [O1 ⇒ O2,Sets]. Given a graph G, the sets G(E) and G(V) are referred to as edges and vertices of G, respectively. We refer to the union set G(E) ∪G(V) as elements of G, denoted as elements(G). Example 3. Each of the Java, UML, and correspondence meta-models in Fig- ure 2.1 and models in Figure 2.2 represents an individual graph. The concepts (e.g., JClass and JMethod) and their references (e.g., the method reference) in the meta-models as well as their instantiations in the models represent vertices and edges, respectively. Note that we further on use the visualization in these figures (referred to as abstract syntax at the beginning of this section) to rep- resent graphs and do not explicitly depict the small category O1 ⇒ O2 or its mappings induced by the functor (which we have done once in Figure 2.3 to understand functors). Additionally, Figure 2.4 shows two exemplarymorphisms α and β inGraphs, depicted via dashed arrows between their respectively mapped graphs. Both α and β are natural transformations, i.e., a family of morphisms in Sets (in particular a pair of morphisms in Sets as the small category O1 ⇒ O2 consists of two objects). Considering the individual components of α and β according to Definition 5, αO1 and βO1 refer to those parts of morphisms that map edges while αO2 and βO2 map vertices. Most importantly, a morphism in Graphs commutes with the s and t functions of the individual graphs. For example, when αO1 (or βO1) maps an edge of a graph to an edge of another graph, the source and target vertices of these edges are also mapped accordingly by αO2 (or βO2). Note that this is inherently required in Definition 5 which states that natural transformations commute with the morphisms in the “host” category (Sets in the case ofGraphs). Furthermore, the composition of natural 22 2 Fundamentals and Running Example transformations as stated in Definition 5 can be exemplified by β ◦ α which is given by the componentwise compositions βO1 ◦ αO1 and βO2 ◦ αO2 . Finally, α and β are monomorphisms as they map vertices and edges injec- tivelywhile β, being surjective at the same time, is additionally an isomorphism. : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 method parameter : JClass name = "List" : JMethod name = "remove" method : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 method parameter α β Figure 2.4: Two morphisms in Graphs Remark 2. For brevity, we restrict our formalization to graphs with vertices and edges but without attributes (which are actually used on the meta-model and model level in Figure 2.1 and 2.2, respectively) and vertex type inheritance (used on the meta-model level in Figure 2.1). Although we implicitly use at- tributes and inheritance in our examples, formalizations reduced to vertices and edges suffice to represent the contributions of this thesis with respect to consistency checking and restoration (and help us to avoid notational inflation). The formalization, nevertheless, can compatibly be extended to attributed graphs with vertex type inheritance by constructing new kinds of categories which still retain all desired properties. We refer the interested reader to [42] for the re- spective extensions in the general theory of graph grammars, and to [8, 92] for attribute handling dedicated to the case of TGGs.While there is no doubt on the solid formal foundation of these extensions, an interesting discussion that also merits mentioning here is that of “the awkwardness of attributes” [124] as ma- nipulating attributes requires different techniques than manipulating vertices and edges. To this end, there have been recent efforts to come up with more lightweight approaches to integrating attributes into graph grammars [73]. Analogously to graphs, triple graphs are also constructed as functors whichmap a small category S← C→ T toGraphs (the letters S, C, and T indicating the source, correspondence, and target domain, respectively). Definition 7 (The Category TripleGraphs). The category TripleGraphs of triple graphs is the functor category [S← C → T,Graphs]. Given a triple graph G, the graphs G(OS), G(OC), G(OT) are re- ferred to as the source graph, the correspondence graph, and the target graph of G, respectively. G is denoted by GS ← GC → GT where GX � G(X), X ∈ {S, C, T}. 2.1 Graphs and Triple Graphs 23 Example 4. The triple of meta-models in Figure 2.1 and the triple of models in Figure 2.2 are exemplary triple graphs where the individual components (Java, UML, and correspondences) represent single graphs. The connections from correspondences to the other two graphs are captured as morphisms in Graphs as Definition 7 implies.We further on depict thesemorphisms via solid (and horizontal) lines with open arrows. In Figure 2.5, moreover, a morphism (in fact, a monomorphism) α in Triple- Graphs is depicted via vertical dashed arrows. Basically, α is a natural transfor- mation consisting of threemorphisms αS, αC, and αT inGraphs (represented in Figure 2.5 by the arrows placed at the left, middle, and right part, respectively). Most importantly, as a result of Definition 5 and as is the case in Figure 2.5, αS, αC, and αT commute with the morphisms from the correspondence graphs to the source and target graphs of the respective triple graphs. That is, for each correspondence mapped by αC, its source and target vertex are mapped accordingly by αS and αT , respectively. : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 method parameter : UMLClass name = "List" : UMLMethod name = "remove" : UMLParameter name = "o" pos = 0 method :P2P :M2M :C2C parameter : JClass : JMethodmethod : UMLClass : UMLMethod method:M2M :C2C α Figure 2.5: A morphism in TripleGraphs While meta-models and models are graphs, meta-models are in fact “distin- guished” graphs that represent typing information for their instantiating models. This also applies analogously to triples. For consistency checking and restoration purposes, we only deal with triples of models whose types conform to a given triple of meta-models (e.g., the one in Figure 2.1 for our running example). Typing, moreover, is captured as a morphism, commonly referred to as a type morphism. Intuitively, a type morphism maps vertices and edges in models to vertices and edges in meta-models and induces this way a typing information. In a categorical setting, typing can be captured via slice categories, i.e., a “slice” of a category that solely consists of objects that have a type morphism to a dis- tinguished object and morphisms that are compatible with these type morphisms. In our concrete case, triple graphs representing Java and UML models with corre- spondences (i.e., triples of models typed over the triple of meta-models given in Figure 2.1) form a “slice” of TripleGraphs. 24 2 Fundamentals and Running Example Definition 8 (Slice Category). [42] Let C be a category and X an object in C. The slice category CX is defined as follows: F(X) F(Y) G(X) G(Y) F(f) G(f) αX αY X A Bm f g 1. ObCX � { f : A→ X | f ∈ MorC} 2. MorCX ( f : A→ X, g : B → X) � {m : A→ B | g ◦m � f } (i.e., the diagram to the right commutes for each m ∈ MorcCX ) 3. composition operation ◦ as defined in C 4. id f :A→X � idA ∈ MorC. Example 5. Althoughwe ultimately deal with a slice of TripleGraphs through- out this thesis, we first exemplify objects andmorphisms from a slice ofGraphs to impart an intuition for Definiton 8. At the top of Figure 2.6, an excerpt from the Java meta-model is given which is now considered to be a distinguished graph and thus represents the object X in Definition 8. Note that, as already stated in Remark 2, we provide our formalization without vertex type inheritance on the meta-model level. Hence, the inheritance hierarchy shown in Figure 2.1 is now “flattened” from a formal point of view and represented this way in our excerpt in Figure 2.6 (e.g., JClass has now a direct edge to JMethodwhich was initially inherited from JClassifier in Figure 2.1). At the bottom of Figure 2.6, two graphs and a morphism are given which actually repeat Figure 2.4. In addition to Figure 2.4, the vertices and edges in both graphs are now mapped to the distinguished graph by the type mor- phisms (depicted via vertical and grayed out arrows). In all cases, the type morphisms commute with the morphism between these graphs (horizontal dashed arrows) and induce typing information for the individual vertices and edges. The entirety of such graphs (for which the type morphisms are given) and their morphisms (which commute with the type morphisms) forms a slice ofGraphs. In the case of Figure 2.6, this slice represents models typed over the Java meta-model (and excludes all other graphs representing other models). Example 5 discusses a slice of Graphs that leads us to the notion of typed graphs. This rather serves as an introductory example to understand slice categories. As mentioned before, we deal with a slice of TripleGraphs in our context. Hence, although exemplified, we skip the definition of typed graphs and apply the concept of slice categories directly to TripleGraphs in the following definition. Lifting Example 5 to triples is left to the reader as exercise. Definition 9 (Typed Triple Graph). Let TG be a distinguished triple graph, referred to as type triple graph. The category of triple graphs typed over TG is the slice category TripleGraphsTG. 2.2 Triple Graph Grammars (TGGs) 25 : JClass name = "List" : JMethod name = "remove" : JParameter name = "o" pos = 0 method parameter : JClass name = "List" : JMethod name = "remove" method JClass JMethod JParameter method pos: int parameter name : String name : String name : String Figure 2.6: A distinguished graph (top) and two graphs together with a morphism from a slice of Graphs derived via this distinguished graph (bottom) Example 6. The triple of meta-models in Figure 2.1 is the type triple graph TG for our running example, and the triple of models in Figure 2.2 represents a triple graph typed over TG. We further on use the visual notation of Figure 2.2 and do not explicitly depict TG or type morphisms when discussing triple graphs typed over TG. Type information, nevertheless, is explicitly given via the labels on the vertices and edges (e.g., :JClass for a vertex which is mapped to the JClass vertex in the Java meta-model). Remark 3. In our definitions and statements from now on, we assume a type triple graph TG is always given (and thus do not require it explicitly). In the absence of an explicit TG symbol, the term triple graph refers to a triple graph typed over TG, and TripleGraphs denotes the slice category TripleGraphsTG. 2.2 Triple Graph Grammars (TGGs) The term grammar has its roots in linguistics where it refers to a set of rules that describe how to construct validwords, phrases, and sentences in a natural language. In the 1950s, formal grammars have been introduced (e.g., [30, 31]) as an approach to studying the structure of a language. It then came as no surprise that formal grammars became highly relevant for “computational” linguistics, e.g., describing what is syntactically possible in a programming language by using string grammars. In the 1970s, graph grammars have followed evolving into a community with dedi- cated scientific events [33] as graphs are (just like strings) ubiquitous in computer 26 2 Fundamentals and Running Example science to represent data. Around that time, pair grammars [119] have shown that grammars are also useful to describe consistency between two structures. The main idea of a pair grammar is to provide a set of rule pairs where each rule pair constructs the two structures simultaneously. Though only illustrated on string-to-graph translators, possible use cases include graph-to-string, string-- to-string, and graph-to-graph translators. This is the source of inspiration for TGGs, introduced in the 1990s in [128], where pairs are lifted to triples by introducing correspondences in the middle. All in all, a TGG describes how triple graphs are constructed. If a grammar is a set of rules, our first step is to definewhat is a rule in the context of TGGs.A rule describes howa triple graph canbe extended to another triple graph by creating new vertices and edges on the individual graphs of the triple. Although it might seem to be a severe limitation at a first glance that the rules can only create something but never delete (which is generally allowed in graph grammars), it should be noted that our goal is not to capture all possible modifications to triple graphs but only to describewhich triple graphs are consistent. Indeed, a consistency restorer derived from a TGG is able to handle creations as well as deletions when propagating changes as we shall see later (though the consistency description itself is only constructive). Definition 10 (Rule). A TGG rule, short rule, is a monomorphism r : L → R in TripleGraphs. The triple graphs L and R are referred to as the left-hand side and the right-hand side, respectively, of r. Example 7. Figure 2.7 depicts a set of rules that construct triple graphs repre- senting Java and UML models together with their correspondences. As a rule, being a monomorphism, “embeds” its left-hand side L injectively into its right-hand side R, we use a compact syntaxmerging both L and R in the same diagram. Elements (vertices and edges) that are both in L and R are de- picted black. These elements represent the triple graphwhich is to be extended and are referred to as context elements required by the rule. Elements that are in R but not in L are depicted green and additionally with a (++)-markup for monochrome printing. These elements represent the extensions intended by the rule and are referred to as created elements. We also use attribute constraints in some rules (within a note icon) that require solely name equality between Java and UML elements (and position equality between parameters) that are created together. Our rules have the following purposes: • ClassRule (r1) does not require any context elements and creates a pair of Java and UML classes together with a correspondence in the middle. The names of the Java and UML class, furthermore, must be equal. • InterfaceRule (r2) is similar to the previous rule and creates a correspond- ing pair of Java and UML interfaces without requiring a context. 2.2 Triple Graph Grammars (TGGs) 27 (++) jc : JClass (++) uc : UMLClass (++) c2c :C2C jc.name == uc.name r5: SubClassRule jc' : JClass uc' : UMLClassc2c' :C2C (++) superClass (++) general (++) jc : JClass (++) uc : UMLClass (++) c2c :C2C jc.name == uc.name r1: ClassRule (++) ji : JInterface (++) ui : UMLInterface (++) c2c :C2C ji.name == ui.name r2: InterfaceRule ji : JInterface ui : UMLInterfacec2c :C2C r4: GeneralRule ji' : JInterface ui' : UMLInterfacec2c' :C2C (++) superInterface (++) general jc : JClass uc : UMLClassc2c :C2C r3: ContractRule ji : JInterface ui : UMLInterfacec2c' :C2C (++) superInterface (++) contract (++) jm : JMethod (++) um : UMLMethod (++) m2m :M2M jm.name == um.name r6: MethodRule jc : JClassifier uc : UMLClassifierc2c :C2C (++) method (++) method (++) jp : JParameter (++) up : UMLParameter (++) p2p :P2P jp.name == up.name jp.pos == up.pos r7: ParameterRule jm : JMethod um : UMLMethodm2m :M2M (++) parameter (++) parameter Figure 2.7: TGG rules for our running example • ContractRule (r3) requires a corresponding pair of Java and UML classes as well as a corresponding pair of Java and UML interfaces as context (which, e.g., can be created by the previous two rules), and creates a superInterface edge from the Java class to the Java interface as well as a contract edge from the UML class to the UML interface. • GeneralRule (r4) requires two corresponding pairs of Java and UML inter- faces as context, and creates a superInterface edge on the Java side as well as a general edge on the UML side. Both edges are created in the same direction. • SubClassRule (r5) requires a corresponding pair of Java and UML classes, and creates a corresponding pair of subclasses with equal names. Note that the rule crates a Java class together with its superClass edge. As we do not have any other rule creating a superClass edge for an existing Java class, it is ensured that a Java class has either one super class (when created with this rule) or no super class (when created with ClassRule) but never multiple super classes which is forbidden in Java. 28 2 Fundamentals and Running Example • MethodRule (r6) requires a corresponding pair of Java andUML classifiers (which can be classes or interfaces) as context and creates a corresponding pair of Java and UML methods with equal names. • ParameterRule (r7), finally, requires a corresponding pair of Java andUML methods as context and creates a corresponding pair of Java and UML parameters with equal names and positions. As the reader might have noted, our definition of a rule (Definiton 10) is nothing but a monomorphism in TripleGraphs but we discuss them with the intention of creating vertices and edges. In fact, creations are realized by applying a rule to a given triple graph. A rule application is basically gluing the right-hand side of a rule with the given triple graph along the left-hand side of the rule. Constructions over gluings is generalized as a pushout in the sense of category theory. Definition 11 (Pushout).F(X) F(Y) G(X) G(Y) F(f) G(f) αX αY X A Bm f g A Bf C g Df' g' X c x b L Gm R r G'm' g PO PO Given morphisms f : A → B and g : A → C in a category C, a pushout is defined by a pushout object D ∈ ObC and morphisms f ′ : C→ D and g′ : B→ D with f ′ ◦ g � g′ ◦ f , such that the following property, referred to as the universal property, is fulfilled: For all morphisms b : B → X and c : C → X, there is a unique morphism such that x ◦ g′ � b and x ◦ f ′ � c. Future diagrams representing a pushout are labeled with PO (as is the case in Definition 11). Example 8. As the category Sets is the underlying category for Graphs (and consequently TripleGraphs), we again start with an intuition on the level of sets to understand how a pushout is constructed in our future examples. The pushout object over two morphisms f : A → B and g : A → C in Sets (i.e., two total functions) can be constructed as the quotient B]C |≡where] denotes disjoint union of sets and≡ is the smallest equivalence relation such that∀a ∈ A, ( f (a), g(a)) ∈≡. In the exemplary diagram below, all functions map the elements in the input set to the elements with the same index in the output set, e.g., f (a1) � b1 and g(a1) � c1. The pushout object, finally, is the set of equivalence classes with respect to≡. Equivalence classes that contain two elements (i.e., {b1, c1}, {b2, c2}, and {b3, c3}) are collapsed to a single representative (i.e., to d1, d2, and d3, respectively) while all other equivalence classes (i.e., {b4}, {c4}, and {c5}) are represented with the symbol of their single element. {a1, a2, a3} {b1, b2, b3, b4} {c1, c2, c3, c4, c5} {d1, d2, d3, b4, c4, c5} f f' g g'PO A B C D 2.2 Triple Graph Grammars (TGGs) 29 Pushouts inGraphs and consequently TripleGraphs are induced by component- wise pushouts inSets.We refer to [42] (Fact 2.17 for graphs and Fact A.37 for functor categories in general) for a detailed proof of the following fact. Fact 2 (Pushouts in TripleGraphs). Pushouts exist in Graphs and TripleGraphs. In Graphs, pushouts can be con- structed componentwise for vertices and edges in Sets, where pushouts in TripleGraphs can be constructed componentwise for the source, correspon- dence, and target graph in Graphs. Definition 12 (Rule Application).F(X) F(Y) G(X) G(Y) F(f) G(f) αX αY X A Bm f g A Bf C g Df' g' X c x b L Gm R r G'm' g PO PO A rule applicationwith a rule r : L→ R and a triple graph G over amonomorphism m : L→ G, denoted as G r@m ���⇒ G′, is a pushout as depicted in the diagram to the right. For a rule application, we refer to m and m′ as match and comatch, respectively. A sequence d : G1 r1@m1 ����⇒ G2 r2@m2 ����⇒ . . . rn@mn ����⇒ Gn of rule applications is referred to as a derivation. Example 9. Figure 2.8 depicts a derivation consisting of two rule applications (represented as two pushout diagrams) with two of our exemplary rules from Figure 2.7. The vertical arrows on the left side represent the two rules, namely ClassRule (r1) and MethodRule (r6). Note that, for the first time, we explic- itly show the rules as a morphism and do not use the compact syntax with (++)-markup. Accordingly, L1 (R1) and L6 (R6) refer to the left-hand side (right-- hand side) of r1 and r6, respectively. The matches and comatches of the single rule applications, furthermore, are represented as m and m′, respectively. We start with the empty triple graph (G0) and find amatch of L1 which is also an empty triple graph, and create a pair of Java and UML classes by gluing R1 over the empty triple graph, resulting in the pushout object G1. We then find a match of L6 in G1, glue R6 over this match resulting in a corresponding pair of Java and UML methods in the pushout object G2. Finally, a TGG is a set of rules and its language consists of triple graphs that can be constructed via derivations with these rules startingwith the empty triple graph (as illustrated, e.g., in Figure 2.8). Definition 13 (Triple Graph Grammar). A triple graph grammar, denoted and abbreviated as TGG, is a set of rules. The language L(TGG) of a TGG is defined as: L(TGG) � {G∅} ∪ {G | ∃ d : G∅ r1@m1 ����⇒ G1 . . . rn@mn ����⇒ Gn with r1, . . . , rn ∈ TGG} where G∅ � ∅ ← ∅ → ∅ is the empty triple graph. The source language LS(TGG) and the target language LT(TGG) are defined as: LS(TGG) � {GS | ∃ GS ← GC → GT ∈ L(TGG)} LT(TGG) � {GT | ∃ GS ← GC → GT ∈ L(TGG)}. 30 2 Fundamentals and Running Example jc : JClass uc : UMLClassc2c :C2C jc.name == uc.name : JClass name = "List" : UMLClass name = "List" :C2C m1 r1 m'1 jc : JClassifier uc : UMLClassifierc2c :C2C jm : JMethod um : UMLMethodm2m :M2M jm.name == um.name jc : JClassifier uc : UMLClassifierc2c :C2C  method method : JClass name = "List" : JMethod name = "remove" : UMLClass name = "List" : UMLMethod name = "remove" method :M2M :C2C method m2 r6 g2 m'2 g1 L1 R1 L6 R6 PO PO G0 G1 G2 Figure 2.8: A derivation with two rule applications Example 10. The set of rules in Figure 2.7 forms a TGG for our running example. The language of a TGG induces a notion of consistency for source and target graphs which is central in the upcoming sections. Definition 14 (Inter-model Consistency). Given a TGG, two graphs GS ∈ LS(TGG) and GT ∈ LS(TGG) are said to be consistentwith respect to TGG if ∃ GS ← GC → GT ∈ L(TGG). 2.3 An Extended Consistency Specification for the Running Example So far, we have discussed the consistency of Java and UML models on minimal examples with a rather simple set of rules (Figure 2.7). Our rules seem symmetric and basically describe a one-to-one mapping: For each Java classifier, method, pa- rameter, and inheritance relation, there must be a counterpart in UML. Consistency can be considered as a bijective function in this case, i.e., for a given Java model in the source language of our TGG there is exactly one UML model in the target language simply consisting of the same classifiers, methods, parameters, and in- 2.3 An Extended Consistency Specification for the Running Example 31 heritance relations (this also applies reversely for a given UMLmodel). Experience over the years with industrial consistency projects, however, shows that a more flexible conception of consistency is needed such that consistency of two models is a relation (and not necessarily a function). That is, for a given a source model in the source language of a TGG, there might be several target models that are consistent to the given source model (and vice versa). In fact, consistency of Java and UMLmodels is also a typical scenario requiring a broad consistency understanding that goes beyond bijections. If UML models are intended to be rather high-level models abstracting from implementation details, the extent of abstraction can be considered as a design choice leading to a certain degree of freedom. For example, assume that some classifiers,methods, and further elements in a Java model are used for good Java programming practices but do not necessarily contribute to a platform-independent understanding of the developed system (and this platform-independent understanding is probably the main mo- tivation of using UML). In this case, a UML model representing “less” than the Java model can still be considered as consistent. In the following, we focus on one of these situations and extend our consistency specification which (together with our former set of rules in Figure 2.7) can construct several UML models (differ- ing in their extent of abstraction) for the same Java model. Reversely, several Java models can be constructed for the same UML model (differing in their usage of programming practices that are irrelevant to the UML model). A well-known practice in Java, which indeed is used by several tools such as [39, 140] when generating Java code from PIMs, is to represent each entity systematically as an interface together with an implementation class. A client appli- cationwhichmakes use of such Java code should only interact with interfaceswhile implementation classes exist only for hiding behavioral complexity. In fact, there is usually technological support to enforce this by making only interfaces available to clients. Generally, the implementation classes adopt a naming convention (e.g., an “Impl”-suffix to the name of the respective interface) to easily assign them to their interfaces when reading code. It is then a reasonable abstraction for the UML model to allow that pairs of interfaces and implementation classes (adhering to the naming convention) in Java are represented as single UML classes. We consider this as a decision for each individual case (e.g., more abstraction in UML for a cer- tain group of Java elements and less abstraction for others depending on what the stakeholders expect from a UML model). In Figure 2.9, two Java models and two UML models are depicted in concrete syntax, all representing a Rectangle entity once as an interface together with an implementation class and once solely as a class. We use an Impl-suffix for naming an implementation class, e.g., RectangleImpl, and refer to such classes as Impl-class (the naming convention generally can be matter of a predefined configuration when generating code or matter of taste when programming). According to our extended, and actually relaxed, understanding of consistency, the following three pairs of models must be regarded as consistent (these pairs are also indicated by the bidirectional arrows in Figure 2.9): 1. The Java model containing only one Rectangle class is consistent to the UML model containing only one Rectangle class, i.e., not using the aforementioned 32 2 Fundamentals and Running Example public class Rectangle{ } public interface Rectangle{ } public class RectangleImpl implements Rectangle{ } Rectangle «interface» Rectangle RectangleImpl Figure 2.9: Two Java models and two UML models where three pairs are consistent (indi- cated with⇔ between models) programming practice in Java models is allowed (this case is already sup- ported with our rules so far). 2. Also the Javamodel containing aRectangle interface and aRectangleImpl class is consistent to the UML model containing only one Rectangle class, i.e., the UML model may abstract from Impl-classes (and this is exactly the extension we are intending in the following). 3. The Java model containing a Rectangle interface and a RectangleImpl class is also consistent to the UML model containing both, i.e., the UML model does not necessarily have to abstract from Impl-classes (this case is already supported with our rules so far). Moreover, systematic use of interfaces and Impl-classes in Java is not only a good practice but also a necessity if a corresponding UML model describes multiple inheritance between classes (which is not allowed in Java as already taken into account by our rules). Consider, for example, the UML model in Figure 2.10 con- sisting of Rectangle, Clickable, and Button classes where the latter has inheritance relations to the first two. In this case, demanding for each UML class a correspond- ing Java class (as we have done so far) is not appropriate anymore. The Java model in Figure 2.10, therefore, represents all these classes as an interface and an Impl-class where multiple inheritance is represented via inheritance relations between inter- faces. While such a strategy might seem to be an unsatisfactory solution in many cases leading to redundantmethod implementations and attributes as these are not inherited from interfaces, it is also the most common means for dealing with the discrepancy between Java and UML. The strategy is especially useful for entities that are rather meant for maintaining and querying data but not for complex com- 2.3 An Extended Consistency Specification for the Running Example 33 putations. Default method implementations in interfaces since the introduction of Java 8, nevertheless, mitigates the problem of redundant method implementations. public class Rectangle{ } public interface Rectangle{ } public class RectangleImpl implements Rectangle{ } Rectangle «interface» Rectangle RectangleImpl Rectangle Clickable Button public interface Rectangle{ } public class ClickableImpl implements Clickable{ } public interface Clickable{ } public class RectangleImpl implements Rectangle{ } public interface Button extends Rectangle, Clickable{ } public class ButtonImpl implements Button{ } Figure 2.10: A model pair representing multiple inheritance Having discussed some examples with additional Impl-classes in Java without explicit UML counterparts, we are now ready to extend our consistency specifica- tion to capture this. While we need new rules that create a Java interface and a corresponding UML class, the additional Java Impl-class will be connected to the same UML class. To this end, we first extend our correspondence meta-model and introduce a new type of correspondence, namely C2C*, between Java and UML classes as depicted in Figure 2.11 (all other meta-model parts that have already been shown in Figure 2.1 are grayed out). This new correspondence type will be used to connect Impl-classes with the corresponding UML class of their interfaces and, furthermore, allow us to distinguish Impl-classes from other Java classes. JClassifier JInterface JClass superInterface 0..* superClass 0..1 UMLClassifier UMLInterface UMLClass general  0..* contract 0..* C2C C2C* Figure 2.11: New correspondence type C2C* for Impl-classes 34 2 Fundamentals and Running Example Finally, Figure 2.12 depicts further rules that describe how consistent Java and UML models are constructed where Java models may have additional Impl-classes. The individual rules in Figure 2.12 have the following purposes: ji: JInterface uc : UMLClassc2c :C2C r9: InterfaceClassGeneralRule ji' : JInterface uc' : UMLClassifierc2c' :C2C (++) superInterface (++) general (++) ji : JInterface (++) uc : UMLClass (++) c2c :C2C ji.name == uc.name jc.name == ji.name + “Impl“  r8: ImplClassRule (++) jc : JClass (++) superInterface (++) c2c* :C2C* jc' : JClass uc' : UMLClassc2c'* :C2C* r11: SubClassOfImplClassRule (++) jc : JClass (++) superClass (++) c2c* :C2C* jc : JClass uc : UMLClassc2c* :C2C* r12: SuperInterfaceOfImplClassRule ji : JInterface (++) superInterface jc : JClass uc : UMLClassc2c* :C2C* r13: MethodOfImplClassRule (++) jm : JMethod (++) method jc : JClass uc : UMLClassc2c* :C2C* r14: ParameterOfImplClassRule jm : JMethod method (++) jp : JParameter (++) parameter (++) ji : JInterface (++) uc : UMLClass (++) c2c :C2C ji.name == uc.name jc.name == ji.name + “Impl“  r10: ImplClassWithSuperClassRule (++) jc : JClass (++) superInterface (++) c2c* :C2C* jc' : JClass (++) superClass Figure 2.12: New rules to capture additional Impl-classes in Java models • ImplClassRule (r8) creates a Java interface and a corresponding UML class with the same name. Additionally, an Impl-class is created on the Java side and connected to the UML class with a correspondence of type C2C*. The name of the Impl-class has a suffix “Impl”. • InterfaceClassGeneralRule (r9) is similar to GeneralRule (r4) as it creates a superInterface on the Java side and a general edge on the UML side. The Java interfaces, however, can now have correspondences to UML classes (and not necessarily to UML interfaces) as allowed by the previous rule. Note that this rule enables, for the first time, creating multiple inheritance among UML classes (while still only interfaces might have multiple inheritance in Java). 2.4 Summary, Open Issues, and Existing Extensions to TGGs 35 The remaining rules in Figure 2.12 are provided for the sake of completeness and play rather a minor role in our future examples and discussions. These rules basically describe how further Java elements concerning Impl-classes in Java are created without having a counterpart in UML: • ImplClassWithSuperClassRule (r10) creates all elements that can also be cre- ated by the previously mentioned rule ImplClassRule. The only difference as compared to ImplClassRule is that the Impl-class is created as a subclass of an existing Java class. This inheritance relation is not reflected on the UML side (as the Impl-class itself is not explicitly represented on the UML side). • SubClassOfImplClassRule (r11) creates a subclass of an Impl-class on the Java side (note thatwe indicate an Impl-class by requiring a correspondence of type C2C*). The new subclass in Java does not have any counterpart on the UML side but is also connected to the UML class corresponding to its super class. We again use a correspondence of type C2C* for this connection. That is, we handle subclasses of Impl-classes in the same manner and consider them as further Impl-classes that can be omitted on the UML side. • SuperInterfaceOfImplClassRule (r12) creates a superInterface edge from an ex- isting Impl-class to an existing interface on the Java side. While Impl-classes inherently have at least one superInterface edge as they are created together with their interfaces, this rule creates further superInterface edges. There is no UML counterpart for these edges as we abstract from Impl-classes in UML. • MethodOfImplClassRule (r13) creates a method for an Impl-class without any UML counterpart. • ParameterOfImplClassRule (r14) creates parameters for Impl-class methods, again without any UML counterpart. Overall, the two sets of rules from Figure 2.7 and 2.12 constitute together a consistency specification that allowsdifferent degrees of abstraction inUMLmodels for the same Java model (depending on whether a Java class with an Impl-suffix is created as an Impl-class or as a normal class). While such extensions can admittedly blow up the size of a consistency specification, e.g., doubling the number of rules in our concrete case, modularity and reuse concepts for rules as proposed in [11] provide viablemeans to reduce specification andmaintenance effort in practice.We shall exploit the diversity of our rules to exemplify different situations concerning consistency checking and restoration in the upcoming sections. In each individual case, the examples will be reduced to a minimal subset of the rules. 2.4 Summary, Open Issues, and Existing Extensions to TGGs In this section we have • discussedmeta-models that define the structure ofmodels in anMDE context, 36 2 Fundamentals and Running Example • formalized (meta-)models and triples of (meta-)models as graphs and triple graphs, respectively, using functor categories consequently to derive both structures basically from the well-known category Sets of sets, • captured typing information in triple graphs using slice categories, • introduced TGGs as a set of rules that describe how to construct triple graphs yielding a language of consistent models. In all cases, we have exemplified the concepts based on our running example concerning the consistency between Java and UML models. Overall, a consistency specification is provided first as a one-to-one mapping between Java and UML elements and later extended to cater for more complex situations w