Semantic Composition and Validation of Interacting DAWs in Computational Materials Science
Period: 2024-2028
Funder: German Research Foundation (DFG) as part of CRC 1404 FONDA (Foundations of Workflows for Large-Scale Scientific Data Analysis)
Principal Investigators: Lars Grunske (HU Berlin), Tilmann Hickel (BAM), Anna-Lena Lamprecht (University of Potsdam)
Summary
Data analysis workflows (DAWs) are becoming increasingly popular for implementing large-scale simulations in computational materials science. These are often multiscale problems that involve simulating materials at different levels of granularity (e.g., spatial or temporal). Traditionally, the workflows addressing the different scales are executed sequentially, with results transferred at the end of each run. However, it is more efficient to implement such multiscale simulations as interacting data analysis workflows (IDAWs). Here, the individual workflows define the simulations for the different levels of granularity. They are self-contained and can be executed independently, producing meaningful results on their own. In addition, they can exchange data with other workflows at defined points during runtime, and so become interacting DAWs.
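The distinction between independent and interacting execution can be illustrated with a minimal Python sketch. It is not the project's framework: the names `workflow`, `run_alone`, and `run_interacting`, the numeric values, and the weighted-average update are all hypothetical placeholders, and the sketch assumes that both workflows have the same number of exchange points.

```python
from typing import Generator, Optional, Tuple

# Hypothetical type: a workflow yields its current value at each exchange
# point, may receive a partner's value there, and returns a final result.
Workflow = Generator[float, Optional[float], float]


def workflow(initial: float, steps: int) -> Workflow:
    """Toy stand-in for one scale of a multiscale simulation."""
    value = initial
    for _ in range(steps):
        received = yield value  # defined exchange point
        if received is not None:
            # Placeholder coupling: blend in the partner's value.
            value = 0.75 * value + 0.25 * received
    return value


def run_alone(wf: Workflow) -> float:
    """Run a workflow independently; nothing is received at exchange points."""
    try:
        next(wf)
        while True:
            wf.send(None)
    except StopIteration as stop:
        return stop.value


def run_interacting(a: Workflow, b: Workflow) -> Tuple[float, float]:
    """Run two workflows in lockstep, swapping values at each exchange point.

    Assumes both workflows have the same, non-zero number of exchange points.
    """
    out_a, out_b = next(a), next(b)
    while True:
        try:
            next_a = a.send(out_b)
        except StopIteration as stop_a:
            try:
                b.send(out_a)
            except StopIteration as stop_b:
                return stop_a.value, stop_b.value
            raise RuntimeError("workflows differ in number of exchange points")
        next_b = b.send(out_a)
        out_a, out_b = next_a, next_b


if __name__ == "__main__":
    print(run_alone(workflow(1.0, 2)))
    print(run_interacting(workflow(1.0, 2), workflow(3.0, 2)))
```

The same `workflow` definition serves both modes: run alone it still produces a result, and run together with a partner it incorporates the partner's data at each exchange point.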
However, open challenges currently hamper the efficient implementation of IDAWs in computational materials science: the interaction patterns are not well classified or understood, concepts for ensuring the correctness of IDAW implementations are missing, and there is a lack of specific support for IDAW development. To address these challenges, we will develop a theory and architectural framework for interacting workflows as a foundation for much-needed IDAW development support. In particular, we will use this framework to devise a concept for incorporating domain-level semantic annotations into IDAWs. These annotations will facilitate both the validation of IDAW correctness and correct-by-construction automated composition. This will significantly reduce the time needed to develop IDAWs, improve their usability (FONDA II principles SUM) and dependability (FONDA I principles PAD), and make them accessible to the wider research community. We will apply the new IDAW framework to the intricate multiscale problem of hydrogen diffusion in metallic materials. Since these IDAWs will be efficient and will represent physical interactions realistically, novel insights into this long-standing materials topic can be expected.