Challenge 8: Visualization and Explainability
In this challenge, you will explore the Visualization and Explainability capabilities of the ExtremeXP framework. Building on a previously executed experiment, you will analyse workflow executions, compare alternative model configurations, inspect data and metrics, and generate explainability insights to understand model behaviour and parameter effects.
This challenge focuses on post-execution analysis, interactive visualization, and explainability-driven interpretation, rather than experiment specification and execution.
Prerequisites
Before starting this challenge, you must have successfully completed Challenge 1: Core functionalities of the ExtremeXP framework, including:
- defining and executing a multi-workflow experiment using the ExtremeXP DSL,
- running the experiment on ProActive,
- producing metrics and ML artifacts (trained models, evaluation results),
- verifying that workflows completed successfully.
No additional challenges are required prior to this one.
Background
The Experimentation Engine orchestrates the execution of experiments defined using the ExtremeXP DSL. It interprets workflows, expands parameter spaces, schedules workflow instances, and collects execution metadata and results. Once execution completes, all experiment-related information is made available to downstream components.
The Data Abstraction Layer (DAL) stores structured experiment metadata, including workflow parameters, execution states, and recorded metrics. The Visualization UI queries the DAL to retrieve metrics, parameter values, execution timelines, and user feedback associated with experiments and workflows.
The DDM stores datasets and data artifacts produced during execution, such as training and test datasets, model outputs, predictions, and exported artifacts. These datasets are dynamically loaded by the Visualization UI for interactive exploration and explainability analysis.
The executionware (e.g. ProActive) is responsible for running workflows and tasks defined by the Experimentation Engine. While execution is not the focus of this challenge, execution status and runtime behaviour are visualized and analysed through the UI.
The Visualization Dashboard is a web-based user interface that enables interactive monitoring, exploration, comparison, and explainability analysis of experiments and workflows. It integrates experiment monitoring, workflow analysis, model performance diagnostics, data exploration, and explainability views into a unified environment that supports human-in-the-loop experimentation.
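To make this data flow concrete, the purely illustrative snippet below sketches how a client might pull workflow metrics from a REST-style DAL endpoint; in this challenge you will access the same information through the Visualization Dashboard instead. The base URL, endpoint path, experiment identifier, and field names are assumptions, not the actual DAL API.

```python
# Hypothetical sketch only: the real DAL API may differ. Endpoint paths,
# the experiment id, and field names below are assumptions.
import requests

DAL_BASE_URL = "http://localhost:8080/api"   # hypothetical base URL
EXPERIMENT_ID = "churn-prediction-exp"       # hypothetical experiment id

# Hypothetical call: fetch all workflow records attached to an experiment.
resp = requests.get(f"{DAL_BASE_URL}/experiments/{EXPERIMENT_ID}/workflows",
                    timeout=10)
resp.raise_for_status()

for wf in resp.json():
    # Assumed shape: each workflow record carries its variability-point
    # parameters and recorded metrics.
    print(wf.get("name"), wf.get("parameters"), wf.get("metrics"))
```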
User guides and tutorials
- ExtremeXP Experimentation Engine User Guide
- Visualization and Experiment Monitoring tutorial
- DSL documentation (for reference only)
Tasks
You should first understand the provided code and what it aims to accomplish.
- Download the provided Python script and DSL files from the shared folder.
- Read them and try to understand their purpose: they canonicalize the data and model artifacts of the experiment so that these can subsequently be consumed by the more advanced algorithms of the visualization and explainability components.
- Identify the parameters each script expects and where they come from (the sketch after this list illustrates the kind of interface to expect).
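As an illustration only, the sketch below shows the kind of command-line interface and canonicalization step such a script might implement. The argument names, file names, and output layout are assumptions; the actual script in the shared folder defines its own.

```python
# A minimal sketch of a canonicalization script's interface, under assumed
# argument and file names; inspect the provided script for the real ones.
import argparse
import json
from pathlib import Path

import pandas as pd


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Canonicalize experiment data and model artifacts "
                    "for the visualization/explainability components."
    )
    # Hypothetical parameters: typically they point at the raw artifacts
    # produced by the executed workflows and at an output location.
    parser.add_argument("--experiment-id", required=True,
                        help="Identifier of the executed experiment")
    parser.add_argument("--input-dir", type=Path, required=True,
                        help="Directory with raw datasets/model outputs")
    parser.add_argument("--output-dir", type=Path, required=True,
                        help="Where canonicalized artifacts are written")
    args = parser.parse_args()

    args.output_dir.mkdir(parents=True, exist_ok=True)

    # Example canonicalization step: load a raw predictions file and rewrite
    # it with normalized column names so downstream components can rely on a
    # fixed schema. The file name "predictions.csv" is an assumption.
    raw = pd.read_csv(args.input_dir / "predictions.csv")
    raw = raw.rename(columns=str.lower)
    raw.to_csv(args.output_dir / "predictions_canonical.csv", index=False)

    # Record minimal metadata describing what was produced.
    meta = {"experiment_id": args.experiment_id,
            "artifacts": ["predictions_canonical.csv"]}
    (args.output_dir / "manifest.json").write_text(json.dumps(meta, indent=2))


if __name__ == "__main__":
    main()
```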
- Open the Visualization Dashboard.
- Navigate to the Experiment Monitoring Page.
- Locate the churn prediction experiment executed in the prerequisite challenge.
- Inspect the progress summary bar and verify that all workflows have completed successfully.
Goal: Familiarize yourself with the overall experiment status and identify the different workflow instances.
- Examine the Workflow Execution Table.
- Review the variability-point parameters (e.g. model type, hyperparameters) for each workflow.
- Inspect the recorded metrics (e.g. accuracy, F1 score, ROC AUC, execution time).
- Sort and filter workflows based on performance metrics (a pandas-based analogue is sketched after this list).
Goal: Identify high-performing and low-performing workflow configurations.
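If you want to cross-check the UI findings offline, the following sketch shows how the same sorting and filtering could be done with pandas, assuming the workflow table has been exported to a hypothetical workflow_executions.csv; the column names (model_type, accuracy, f1, exec_time_s) and thresholds are assumptions.

```python
# A minimal offline analogue of sorting/filtering the Workflow Execution
# Table, assuming a hypothetical CSV export with one row per workflow.
import pandas as pd

workflows = pd.read_csv("workflow_executions.csv")  # hypothetical export

# Sort workflows by F1 score, best first.
ranked = workflows.sort_values("f1", ascending=False)

# Filter: keep configurations above an accuracy threshold and reasonably
# fast (both thresholds are arbitrary examples).
good = ranked[(ranked["accuracy"] >= 0.85) & (ranked["exec_time_s"] < 120)]

print(ranked.head(5)[["model_type", "accuracy", "f1", "exec_time_s"]])
print(f"{len(good)} high-performing workflows out of {len(workflows)}")
```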
- Enable the Parallel Coordinates Plot at the bottom of the Experiment Monitoring Page.
- Select accuracy (or another performance metric) as the colour encoding.
- Explore how different parameter values and model variants relate to performance (a comparable offline plot is sketched after this list).
Goal: Visually detect parameter sensitivities, trade-offs, and promising regions in the parameter space.
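For reference, the sketch below builds a comparable parallel-coordinates view with plotly, reusing the hypothetical workflow_executions.csv export. The dimension names (n_estimators, max_depth, learning_rate) are placeholders for whatever numeric variability points your experiment actually exposes.

```python
# A rough stand-alone analogue of the dashboard's Parallel Coordinates Plot,
# assuming a hypothetical CSV export with numeric hyperparameters and metrics.
import pandas as pd
import plotly.express as px

workflows = pd.read_csv("workflow_executions.csv")  # hypothetical export

# Numeric variability-point parameters plus the metric used for colouring.
fig = px.parallel_coordinates(
    workflows,
    dimensions=["n_estimators", "max_depth", "learning_rate", "accuracy"],
    color="accuracy",  # colour encoding, as in the UI
    color_continuous_scale=px.colors.sequential.Viridis,
)
fig.show()
```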
- Select a single workflow and navigate to its Workflow Analysis Page.
- Examine the workflow structure and task-level execution details.
- Inspect parameters, metrics, and input/output artifacts associated with each task.
- Explore dataset previews and generate basic visualizations (e.g. distributions, scatter plots); a local analogue is sketched after this list.
Goal: Understand how a specific workflow was executed and how its results were produced.
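The sketch below performs similar previews and basic plots locally with pandas and matplotlib, assuming a hypothetical train_dataset.csv artifact with numeric columns tenure and monthly_charges and a 0/1 churn label; substitute the actual artifact and column names from your workflow.

```python
# A minimal local sketch of dataset previews and basic visualizations.
# File and column names are assumptions about the churn dataset.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("train_dataset.csv")   # hypothetical task input artifact

print(df.head())          # dataset preview
print(df.describe())      # summary statistics

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
df["tenure"].plot.hist(bins=30, ax=ax1, title="Distribution of tenure")
df.plot.scatter(x="tenure", y="monthly_charges", c="churn",  # assumed 0/1 label
                colormap="coolwarm", ax=ax2,
                title="tenure vs. monthly_charges by churn")
plt.tight_layout()
plt.show()
```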
- Open the Model Insights section for a workflow that includes a trained model.
- Inspect model performance visualizations such as:
- confusion matrix,
- ROC curve,
- precision–recall curve.
- Explore instance-level predictions using the Instance View.
- Identify misclassified instances and inspect their feature values.
- Generate feature-level explainability results (e.g. feature importance, PDP, ALE); a scikit-learn-based sketch of these views follows this list.
- Explore hyperparameter explainability views to understand how training configurations influence performance.
Goal: Interpret model behaviour, feature effects, and the influence of hyperparameters on outcomes.
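The sketch below shows how comparable performance and explainability views could be produced locally with scikit-learn, assuming the workflow exported a classifier as model.joblib and a held-out test set as test_dataset.csv with a binary churn target; the feature names used for partial dependence are placeholders. ALE plots are omitted because they require an additional library.

```python
# A minimal local sketch, not the dashboard's own implementation.
# Assumptions: "model.joblib", "test_dataset.csv", a binary "churn" target,
# and placeholder feature names "tenure" and "monthly_charges".
import joblib
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.metrics import (ConfusionMatrixDisplay, PrecisionRecallDisplay,
                             RocCurveDisplay)

model = joblib.load("model.joblib")
test = pd.read_csv("test_dataset.csv")
X_test, y_test = test.drop(columns=["churn"]), test["churn"]

# Performance views: confusion matrix, ROC curve, precision-recall curve.
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
ConfusionMatrixDisplay.from_estimator(model, X_test, y_test, ax=axes[0])
RocCurveDisplay.from_estimator(model, X_test, y_test, ax=axes[1])
PrecisionRecallDisplay.from_estimator(model, X_test, y_test, ax=axes[2])
plt.tight_layout()
plt.show()

# Instance-level view: inspect misclassified test instances.
predictions = model.predict(X_test)
misclassified = test[predictions != y_test]
print(misclassified.head())

# Feature-level explainability: permutation importance and partial dependence.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
importance = pd.Series(result.importances_mean,
                       index=X_test.columns).sort_values()
print(importance)

PartialDependenceDisplay.from_estimator(
    model, X_test, features=["tenure", "monthly_charges"]  # placeholder features
)
plt.show()
```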
- Select multiple completed workflows.
- Navigate to the Comparative Analysis Page.
- Compare workflows using:
- Metrics Tab (bar charts and line charts),
- Model Insights Tab (confusion matrices, ROC curves, instance views),
- Data Tab (dataset distribution comparisons).
- Identify systematic differences between model types or hyperparameter configurations (a simple offline metrics comparison is sketched after this list).
- Assign qualitative ratings to selected workflows using the rating mechanism.
- Reflect on whether the best-performing models align with domain expectations.
- (Optional) If enabled, interact with any available human-in-the-loop tasks or explanation-driven workflow suggestions.
Goal: Experience how human feedback and interpretability integrate into the experimentation lifecycle.
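As a rough offline analogue of the Metrics Tab, the sketch below compares evaluation metrics across completed workflows, reusing the hypothetical workflow_executions.csv export; the status, workflow_id, and metric column names are assumptions.

```python
# A minimal sketch of a metrics comparison across selected workflows,
# using the same hypothetical CSV export as in the earlier sketches.
import pandas as pd
import matplotlib.pyplot as plt

workflows = pd.read_csv("workflow_executions.csv")        # hypothetical export
selected = workflows[workflows["status"] == "completed"]  # assumed column

# Side-by-side bars per workflow for the main evaluation metrics,
# mirroring the Metrics Tab of the Comparative Analysis Page.
metrics = ["accuracy", "f1", "roc_auc"]                   # assumed metric columns
selected.set_index("workflow_id")[metrics].plot.bar(figsize=(8, 4), rot=45)
plt.ylabel("score")
plt.title("Metric comparison across completed workflows")
plt.tight_layout()
plt.show()
```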
Success criteria
Screenshots showing:
- The Experiment Monitoring Page with completed workflows
- At least one Workflow Analysis Page
- At least one explainability visualization (feature or hyperparameter explainability)
- The Comparative Analysis Page comparing multiple workflows
A short written summary describing:
- Which workflow configuration performed best and why
- Key insights obtained from visualization and explainability
- How explainability influenced your understanding of model behaviour