Impressive Progress: WP2
Results and Evolution
During the very exciting year 2023, the partners UL, VUA, CUNI, TUD, ICOM and UPC made major contributions to Work Package 2 (WP2), “Conceptual foundations for experiment-driven analytics over extreme data”. The first deliverable, D2.1, describes our initial iteration of the architecture and its rationale.
The goal of our activities in the first project year has been to provide the initial modelling and language foundations for experiment-driven analytics, focusing on software variability, experiment specification, and context. Our work also provides the foundations for the trustworthiness and traceability layer and for the software architecture of the ExtremeXP framework. This work package is designed to feed all the other technical work packages of the ExtremeXP project, i.e., WP3-5.
Initially, the partner CUNI led the activities on experimentation modelling with domain-specific modelling languages. They focused on developing novel foundational meta-models and the associated domain-specific modelling language (DSML), which describes the experimentation process for optimizing complex analytics.
The DSML under development (defined with Eclipse EMF) allows different variants of a complex analytics process to be specified and mapped to a user-specified intent and context. Such variants include different datasets/data sources, flows of complex analytics tasks (visualization, simulation, ML), deployment options, algorithms, models, and hyper-parameters. It also allows the specification of key metrics of experiment properties (e.g., time, resources, model accuracy) and their hypothesized differences, which supports the evaluation, comparison, planning and re-planning of experiments at runtime. The language developed in this task will be used by the experimentation engine and the monitoring facilities, to be further developed under WP5.
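To make the notion of variants more concrete, the following is a minimal Python sketch of how variability points could be expanded into concrete experiment variants. It is an illustration only and does not reproduce the ExtremeXP DSML or its EMF meta-model; the `Variant` class, the `enumerate_variants` function, and all dataset, algorithm, and parameter names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Variant:
    """One concrete configuration of an analytics process (hypothetical)."""
    dataset: str
    algorithm: str
    max_depth: int


def enumerate_variants(datasets, algorithms, max_depths):
    """Return every combination of the given variability points."""
    return [
        Variant(dataset, algorithm, depth)
        for dataset, algorithm, depth in product(datasets, algorithms, max_depths)
    ]


# Illustrative values only; none of these names come from the project.
variants = enumerate_variants(
    datasets=["sensor_logs", "sensor_logs_sampled"],
    algorithms=["random_forest", "gradient_boosting"],
    max_depths=[4, 8],
)

for variant in variants:
    print(variant)
```

With two options for each of the three variability points, the sketch yields eight variants. A real experiment specification would additionally attach metrics and hypotheses to these variants, which is omitted here.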
The partner UL led the activity on semantic context representation and probabilistic meta-analysis. A paper was presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases in Turin, Italy. In essence, we developed initial probabilistic models, based on Markov Decision Processes and Bayesian Networks, that are designed to complement the meta-model. The focus was on providing assurances, ranking options, and verifying ExtremeXP configurations. In this way, we establish association rules between users and configurations. This work will be used further by the Adaptive Experiment Planning to be developed under WP5, by the formalism for specifying user intents, and by the models to be used under WP4.
The partner TUD led the activity on traceability and trustworthiness for experiment-driven analytics. Here, the aim was to record accesses to data together with the contextual circumstances that drive the decision to permit or deny access requests to data and knowledge artefacts, even across organizational boundaries. This work builds on the attribute-based access control (ABAC) model: it designs and extends an ABAC mechanism based on smart contracts provisioned over a private, permissioned blockchain. Moreover, a Non-Fungible Token (NFT)-based dataset provenance mechanism is under development, which will be used to ‘tag’ the datasets used for experiment-driven analytics with metadata in a transparent, immutable, and unique way. The outcome of this work will be integrated into the framework’s holistic data and knowledge management and lifecycle support, to be developed under WP5.
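As a rough illustration of the attribute-based idea, the Python sketch below evaluates an access request against attribute rules and records every decision. It runs in plain memory and is not the project's smart-contract implementation; the `AccessRequest` class, the two rules, the attribute names, and the in-memory `audit_log` (standing in for an immutable ledger) are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessRequest:
    """Attributes of the subject, the resource, and the request context."""
    subject: dict
    resource: dict
    context: dict = field(default_factory=dict)


def same_organization(request):
    """Permit when the requester belongs to the organization owning the data."""
    owner = request.resource.get("owner_organization")
    return owner is not None and request.subject.get("organization") == owner


def shared_for_purpose(request):
    """Permit when the stated purpose is one the owner has shared the data for."""
    allowed = request.resource.get("shared_purposes", [])
    return request.context.get("purpose") in allowed


def decide(request, rules, audit_log):
    """Return "permit" if any rule matches, else "deny", and log the decision."""
    decision = "permit" if any(rule(request) for rule in rules) else "deny"
    audit_log.append((request, decision))
    return decision


# Hypothetical requests: one internal, one crossing an organizational boundary.
audit_log = []
rules = [same_organization, shared_for_purpose]
internal = AccessRequest(
    subject={"organization": "org_a"},
    resource={"owner_organization": "org_a"},
)
external = AccessRequest(
    subject={"organization": "org_b"},
    resource={"owner_organization": "org_a", "shared_purposes": ["benchmarking"]},
    context={"purpose": "marketing"},
)

print(decide(internal, rules, audit_log))
print(decide(external, rules, audit_log))
```

The point of the sketch is that the decision depends on attributes and context rather than on a fixed list of users, and that each decision leaves a record behind.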
Finally, VUA led the activity on the software and reference architecture for complex experiment-driven analytics. Our initial architecture design contains structural and behavioural diagrams. It comprises several independent, self-contained, and elastic components that can be used to store knowledge assets from experiments, collect evaluation data including user feedback, plan experiments, and enact them (either locally or on remote systems) using virtualized resources and serverless functions. It supports scheduling and running several experiment variants (alternative configurations of a complex analytics process) in parallel.
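Running several variants in parallel can be pictured with the small Python sketch below, which uses a local thread pool from the standard library. It is only an analogy for the scheduling behaviour described above and not the ExtremeXP execution engine, which targets virtualized resources and serverless functions; `run_variant`, `run_variants`, and the variant dictionaries are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_variant(variant):
    """Placeholder for executing one variant; here it only counts the steps."""
    return {"name": variant["name"], "num_steps": len(variant["steps"])}


def run_variants(variants, max_workers=4):
    """Run all variants concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_variant, variants))


# Two alternative configurations of the same (hypothetical) analytics process.
variants = [
    {"name": "baseline", "steps": ["load", "train", "evaluate"]},
    {"name": "with_visualization", "steps": ["load", "train", "evaluate", "plot"]},
]

results = run_variants(variants)
for result in results:
    print(result)
```

Because `pool.map` returns results in the order of its inputs, each result can be matched back to the variant that produced it, which is what a later comparison of variants would need.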
The core services of the framework will be implemented in WP5. The architecture provides interfaces between the Adaptive Experiment Planning module and the different support services, i.e., the data, algorithm, model, and feature selection services of WP3 and the explainability, visualization, and human-in-the-loop services of WP4. All in all, the work under WP2 is intended to provide a clear focus and vision to all the other project work packages.