Early Workflow Design: From Collaborative Scientific Problem-Solving to DAW Specifications

Period: 2024-2028
Funder: German Research Foundation (DFG) as part of CRC 1404 FONDA (Foundations of Workflows for Large-Scale Scientific Data Analysis)
Principal Investigators: Anna-Lena Lamprecht (University of Potsdam), Jan Mendling (HU Berlin), Matthias Weidlich (HU Berlin)

Summary

This project addresses challenges in the early phases of the workflow life cycle, focusing on the conception of data analysis workflows (DAWs). During this phase, scientists collaborate to move from a scientific question to an abstract workflow: a methodical sketch in domain terms that represents abstract functionalities and their interdependencies. Currently, this early workflow concept often remains implicit and becomes explicit only when a concrete DAW is implemented in a given workflow language or system. While various abstract workflow representations and methods for their development have been proposed, dedicated methods and support for the conception phase are missing. This gap puts the scientific quality of workflows at risk, as scientists often resort to "next best" workflows rather than following a systematic, grounded approach to their design.

The goal of this subproject is to understand the principles and processes that experts follow in the workflow conception phase and, based on this understanding, to devise DAW conception methods and support techniques. We employ a multi-method strategy that combines empirical and design science methods. Qualitative studies, including interviews with scientists and workflow developers from fields such as bioinformatics, geosciences, and materials science, will investigate the challenges faced during the initial phases. Design science methods will then be used to develop techniques that support the collaborative conception of workflows and their representation and explication as abstract workflows. We will contribute to the FONDA principles SUM and PAD by enhancing the usability of abstract workflows and by improving the dependability of DAWs through better alignment with the scientific question.