FONDA: Foundations of Workflows for Large-Scale Scientific Data Analysis

FONDA (Foundations of Workflows for Large-Scale Scientific Data Analysis) is a new Collaborative Research Center (CRC) funded by the DFG. FONDA will investigate methods to support scientists who work with cluster infrastructures to analyze very large datasets.

Today, large-scale scientific data analysis is complicated by the need to choose among the available computational resources and to hand-tune distributed processing jobs. These settings are rarely straightforward and are often platform-specific, yet they have a significant impact on runtime and efficiency, leading to either platform lock-in or performance losses. In FONDA, we will develop new methods for profiling, performance modeling, and task placement that enable resource management systems to use the available cluster resources efficiently, allowing scientists to focus on the domain-specific challenges of their work.

We are part of projects B1 and S1. In B1, our focus is on infrastructure discovery and description and on creating infrastructure-aware task execution profiles. In S1, we provide computational infrastructures, data analysis workflows (DAWs) and their associated data, and development tools to support the research.

Participating Organizations

Selected Publications

Further project information