Challenge 5: Knowledge Management and Experiment Documentation
This challenge guides participants through end-to-end testing of the Experiment Cards component: ingesting experiment metadata from the Data Abstraction Layer (DAL), browsing and filtering documented experiments, opening experiment-level card views, and submitting human feedback (lessons learnt and ratings) linked to experiments.
Estimated Time: 20 minutes
Background
Overview
Experiment Cards are a knowledge management and documentation component designed to support reproducible AI experimentation. By integrating automatically collected metadata with human-in-the-loop input, Experiment Cards document essential experimentation aspects such as intent, constraints, evaluation metrics, outcomes, and lessons learned. They address the limitations of existing documentation approaches, such as Model Cards and Data Cards, which primarily describe static AI artifacts and fail to represent the dynamic and iterative nature of experimentation. Moreover, current practices often overlook the importance of systematically reusing knowledge generated across experimental cycles, leading to fragmented or lost insights.
Functionally, Experiment Cards provide a unified view of experiments and a querying mechanism over executed experiments. They capture experimentation throughout its lifecycle both automatically, via technical components of the ExtremeXP platform when available, and through user interaction.
Technical Specifications
The Experiment Cards component is a Flask application that ingests experiment, workflow and metric payloads from the DAL and stores them. It serves HTML views (an experiment cards listing with queryable fields, an experiment-level card view, and a lessons-learnt submission form) as well as REST endpoints for creating/updating cards and submitting feedback and ratings. The application runs under Docker Compose and keeps its data up to date via a background sync loop that polls the DAL.
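As an optional orientation step, you can list the routes the running Flask app registers, which quickly shows which HTML views and REST endpoints are available. The service name below and the assumption that the flask CLI can discover the app inside the container are guesses; adapt the command to your Compose setup.
> # Optional: list the registered Flask routes (HTML views and REST endpoints).
> # "experiment-cards" is an assumed service name; check `docker compose ps` for yours.
> docker compose exec experiment-cards flask routes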
The main functionalities include the following:
- Cards can be created/updated from DAL data (semi-automated documentation).
- Users can browse, search and filter across experiments (query-based navigation).
- Users can add qualitative and quantitative feedback (lessons learnt, overall rating, per-run ratings, optional attachment). Feedback is persisted, linked to the correct experiment, and visible in the card view.
Prerequisites
Follow the user guide below to install the Experiment Cards module. Do not start the challenge exercises before completing the installation.
https://github.com/extremexp-HORIZON/extremeXP_experimentCards/tree/code-cleanup
- Before running the application, ensure you have Docker installed.
- Access to the DAL component is required, either running locally or via the online ExtremeXP platform.
Start by cloning the repo,
> git clone https://github.com/extremexp-HORIZON/extremeXP_experimentCards.git
> cd extremeXP_experimentCards
then set up the environment variables (the DAL access token is required at this stage), build the Docker image, and run the application as described in the user guide above.
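A minimal sketch of that sequence is shown below, assuming the Compose setup exposes the application on port 5002 (as used in the exercises). The environment variable name is illustrative only; use the name(s) the repository's user guide specifies.
> # Illustrative only: the actual variable name(s) are defined in the user guide.
> export DAL_ACCESS_TOKEN=<your-DAL-token>
> # Build the image and start the stack in the background.
> docker compose up --build -d
> # Confirm the app answers on port 5002 (used throughout the exercises).
> curl -I http://localhost:5002/query_experiments_page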
Exercises
Exercise 1: Browse the experiment cards listing
1. Open the cards listing page:
- http://<host>:5002/query_experiments_page
2. Confirm you can load the page and see either:
- existing experiments (if already ingested), or
- an empty state (if no experiments have been stored yet).
What to screenshot:
- The listing page.
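If the page loads but remains empty for longer than expected, the container logs are a quick way to check whether the background sync loop is actually reaching the DAL. The service name below is an assumption; use `docker compose ps` to find the real one.
> # Follow the application logs to watch the DAL sync loop (service name is an assumption).
> docker compose logs -f experiment-cards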
Exercise 2: Query and filter experiments
On /query_experiments_page, test filters relevant to your data model (examples below; adapt to what your UI exposes):
- intent (e.g., classification / regression / optimization)
- algorithm/model type (e.g., LR / RF / etc.)
- dataset name/version
- metric thresholds (if supported)
- date ranges / owner / tags (if supported)
What to screenshot:
- One filtered view (show the applied filters and the results).
Exercise 3: Inspect an experiment card
1. Open:
- http://<host>:5002/experiment_details_realData/<experiment_id>, or click any “Experiment Cards” button on the main screen.
2. Verify that the card consolidates:
- automatically populated metadata (related to the experiment, workflow and metrics),
- any computed or derived values,
- and placeholders in the sections for human feedback.
What to screenshot:
- The experiment card view.
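If you want to confirm the route resolves before taking the screenshot, a direct request works too (host and port assumed to be localhost:5002, as in the rest of the challenge).
> # Replace <experiment_id> with an ID visible on the listing page.
> curl -I http://localhost:5002/experiment_details_realData/<experiment_id>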
Exercise 4: Submit lessons learnt and ratings
1. Open:
- http://<host>:5002/form_lessons_learnt/<experiment_id>, or click any “Submit lessons learnt” button on the main screen.
2. Fill in:
- Lessons Learnt (free text)
- Experiment Rating (1–7)
- Run Ratings (1–7 per run, if multiple runs exist)
- Upload a supporting PDF file (optional)
3. Submit the form (this calls the /submit/<experiment_id> API endpoint).
What to screenshot:
- The filled form (before submitting), the result after submission, and the experiment card view showing the newly added feedback.
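The same endpoint can also be exercised from the command line if the UI submission needs debugging. The form field names below are assumptions made for illustration; inspect the form (e.g. in the browser's developer tools) or the Flask route to see the names /submit/<experiment_id> actually expects.
> # Illustrative multipart POST; field names are assumptions, not the documented API contract.
> curl -X POST http://localhost:5002/submit/<experiment_id> \
>   -F "lessons_learnt=Preprocessing dominated runtime; cache intermediate features." \
>   -F "experiment_rating=6" \
>   -F "attachment=@notes.pdf"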
Learning Resources
- Experiment Cards paper presented at the ICTAI 2025 conference: https://www.computer.org/csdl/proceedings-article/ictai/2025/491900a433/2ct0EJ3US5y
- DSL Task Definitions: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/tasks/
- DSL Workflows: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/workflows/
- DSL Experiments: https://extremexp-horizon.github.io/extremexp-experimentation-engine/dsl/experiments/
Success criteria
- Experiments are visible in the listing page
- Experiment Card view loads and displays experiment, workflow and metric data
- Queries and filters are executed successfully on the main page
- Lessons Learnt + ratings submitted via the form
- Feedback visible in card view and correctly linked to the experiment
Provide screenshots that showcase all of the above.
Deliverables
Submit the following artifacts:
- Screenshots showing what is listed in the “Success Criteria” above;
- A short written summary, including:
- Any issues encountered
- Suggestions for improvement.
If you’d like to report issues or bugs, please include the following:
- Endpoint/UI page
- Steps to reproduce
- Expected vs. actual behavior
- Screenshot(s)
- Any relevant IDs (e.g. experiment_id)