Challenge 5: Knowledge Management and Experiment Documentation
This challenge guides participants through end-to-end testing of the Experiment Cards component: ingesting experiment metadata from the Data Abstraction Layer (DAL), browsing and filtering documented experiments, opening experiment-level card views, and submitting human feedback (lessons learnt and ratings) linked to experiments.
Estimated Time: 20 minutes
Background
Overview
Experiment Cards are a knowledge management and documentation component designed to support reproducible AI experimentation. By integrating automatically collected metadata with human-in-the-loop input, Experiment Cards document essential experimentation aspects such as intent, constraints, evaluation metrics, outcomes, and lessons learned. They address the limitations of existing documentation approaches, such as Model Cards and Data Cards, which primarily describe static AI artifacts and fail to represent the dynamic and iterative nature of experimentation. Moreover, current practices often overlook the importance of systematically reusing knowledge generated across experimental cycles, leading to fragmented or lost insights.
Functionally, Experiment Cards provide a unified view of experiments and a querying mechanism over executed experiments. They capture experimentation throughout its lifecycle both automatically, via technical components of the ExtremeXP platform when available, and through user interaction.
Technical Specifications
The Experiment Cards component is a Flask application that ingests experiment, workflow and metric payloads from the DAL and stores them. It serves HTML views (an experiment cards listing with queryable fields, an experiment-level card view, and a lessons-learnt submission form) as well as REST endpoints for creating/updating cards and submitting feedback and ratings. The application runs under Docker Compose and keeps its data up to date via a background sync loop that polls the DAL.
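As an optional orientation step, you can list the routes the running Flask app registers, which quickly shows which HTML views and REST endpoints are available. The service name below and the assumption that the flask CLI can discover the app inside the container are guesses; adapt the command to your Compose setup.
> # Optional: list the registered Flask routes (HTML views and REST endpoints).
> # "experiment-cards" is an assumed service name; check `docker compose ps` for yours.
> docker compose exec experiment-cards flask routes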
The main functionalities include the following:
- Cards can be created/updated from DAL data (semi-automated documentation).
- Users can browse, search and filter across experiments (query-based navigation).
- Users can add qualitative and quantitative feedback (lessons learnt, overall rating, per-run ratings, optional attachment). Feedback is persisted, linked to the correct experiment, and visible in the card view.
Prerequisites
Follow the user guide below to install the Experiment Cards module. Do not start the challenge exercises before completing the installation.
https://github.com/extremexp-HORIZON/extremeXP_experimentCards/tree/code-cleanup
- Before running the application, ensure you have Docker installed.
- Access to the DAL component is required, either running locally or via the online ExtremeXP platform.
Start by cloning the repo,
> git clone https://github.com/extremexp-HORIZON/extremeXP_experimentCards.git
> cd extremeXP_experimentCards
then set up the environment variables (the DAL access token is required at this stage), build the Docker image, and run the application as described in the user guide above.
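A minimal sketch of that sequence is shown below, assuming the Compose setup exposes the application on port 5002 (as used in the exercises). The environment variable name is illustrative only; use the name(s) the repository's user guide specifies.
> # Illustrative only: the actual variable name(s) are defined in the user guide.
> export DAL_ACCESS_TOKEN=<your-DAL-token>
> # Build the image and start the stack in the background.
> docker compose up --build -d
> # Confirm the app answers on port 5002 (used throughout the exercises).
> curl -I http://localhost:5002/query_experiments_page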
Exercises
Exercise 1: Browse the experiment cards listing
1. Open the cards listing page:
- http://<host>:5002/query_experiments_page
2. Confirm you can load the page and see either:
- existing experiments (if already ingested), or
- an empty state (if no experiments have been stored yet).
What to screenshot:
- The listing page.
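If the page loads but remains empty for longer than expected, the container logs are a quick way to check whether the background sync loop is actually reaching the DAL. The service name below is an assumption; use `docker compose ps` to find the real one.
> # Follow the application logs to watch the DAL sync loop (service name is an assumption).
> docker compose logs -f experiment-cards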
Exercise 2: Query and filter experiments
On /query_experiments_page, test filters relevant to your data model (examples below; adapt to what your UI exposes):
- intent (e.g., classification / regression / optimization)
- algorithm/model type (e.g., LR / RF / etc.)
- dataset name/version
- metric thresholds (if supported)
- date ranges / owner / tags (if supported)
What to screenshot:
- One filtered view (show the applied filters and the results).
Exercise 3: Inspect an experiment card
1. Open:
- http://<host>:5002/experiment_details_realData/<experiment_id>, or click any “Experiment Cards” button on the main screen.
2. Verify that the card consolidates:
- automatically populated metadata (related to the experiment, workflow and metrics),
- any computed or derived values,
- and placeholders in the sections for human feedback.
What to screenshot:
- The experiment card view.
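If you want to confirm the route resolves before taking the screenshot, a direct request works too (host and port assumed to be localhost:5002, as in the rest of the challenge).
> # Replace <experiment_id> with an ID visible on the listing page.
> curl -I http://localhost:5002/experiment_details_realData/<experiment_id>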
Exercise 4: Submit lessons learnt and ratings
1. Open:
- http://<host>:5002/form_lessons_learnt/<experiment_id>, or click any “Submit lessons learnt” button on the main screen.
2. Fill in:
- Lessons Learnt (free text)
- Experiment Rating (1–7)
- Run Ratings (1–7 per run, if multiple runs exist)
- Upload a supporting PDF file (optional)
3. Submit the form (this calls the /submit/<experiment_id> API endpoint).
What to screenshot:
- The filled form (before submitting), the result after submission, and the experiment card view showing the newly added feedback.
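The same endpoint can also be exercised from the command line if the UI submission needs debugging. The form field names below are assumptions made for illustration; inspect the form (e.g. in the browser's developer tools) or the Flask route to see the names /submit/<experiment_id> actually expects.
> # Illustrative multipart POST; field names are assumptions, not the documented API contract.
> curl -X POST http://localhost:5002/submit/<experiment_id> \
>   -F "lessons_learnt=Preprocessing dominated runtime; cache intermediate features." \
>   -F "experiment_rating=6" \
>   -F "attachment=@notes.pdf"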
Learning Resources
- Experiment Cards paper presented at the ICTAI 2025 conference: https://www.computer.org/csdl/proceedings-article/ictai/2025/491900a433/2ct0EJ3US5y
- DSL Task Definitions: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/tasks/
- DSL Workflows: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/workflows/
- DSL Experiments: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/experiments/
Success criteria
- Experiments are visible in the listing page
- Experiment Card view loads and displays experiment, workflow and metric data
- Queries and filters are executed successfully on the main page
- Lessons Learnt + ratings submitted via the form
- Feedback visible in card view and correctly linked to the experiment
Provide screenshots that showcase all of the above.
Deliverables
Submit the following artifacts:
- Screenshots showing what is listed in the “Success Criteria” above;
- A short written summary, including:
- Any issues encountered
- Suggestions for improvement.
If you’d like to report issues or bugs, please include the following:
- Endpoint/UI page
- Steps to reproduce
- Expected vs. actual behavior
- Screenshot(s)
- Any relevant IDs (e.g. experiment_id)