Resource Management for Adaptive Memory Requirements of Scientific Workflows

Assignment:

As the amount of data available to researchers in fields ranging from bioinformatics to physics to remote sensing continues to grow, scientific workflow systems have become increasingly important. These systems play a critical role in creating and executing scalable data analysis pipelines. When designing a workflow, users must specify the resources each task requires and ensure that sufficient resources are available on the target cluster infrastructure. A critical problem is underestimating a task’s memory requirements, which can lead to task failures. To avoid such failures, users often over-allocate memory, which wastes resources and reduces overall throughput.

The challenge is to extend an existing resource manager such as Slurm, OpenPBS, or Kubernetes to support dynamic memory requirements. To this end, machine learning methods or heuristics must be developed and implemented that handle such dynamic memory assignments when scheduling tasks.
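
To give a rough idea of what such a heuristic could look like, the following minimal sketch predicts a task's peak memory from its input size with a simple least-squares fit and enlarges the request after an out-of-memory failure. It is only an illustration under stated assumptions, not a prescribed approach: the names MemoryPredictor and next_request_after_failure, the 10% safety margin, the 4096 MB default, and the doubling factor are all hypothetical, and the code does not call any Slurm, OpenPBS, or Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryPredictor:
    """Hypothetical helper: fits peak memory (MB) against input size (MB)."""

    safety_margin: float = 0.10  # assumed headroom added on top of each prediction
    default_mb: float = 4096.0   # assumed fallback request while history is too short
    history: list = field(default_factory=list)  # (input_mb, peak_mb) pairs

    def observe(self, input_mb: float, peak_mb: float) -> None:
        """Record the measured peak memory of a finished task."""
        self.history.append((input_mb, peak_mb))

    def predict(self, input_mb: float) -> float:
        """Return a memory request in MB for a task with the given input size."""
        if len(self.history) < 2:
            return self.default_mb
        xs = [x for x, _ in self.history]
        ys = [y for _, y in self.history]
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        var_x = sum((x - mean_x) ** 2 for x in xs)
        if var_x == 0:
            # All observed inputs have the same size: fall back to the largest peak.
            estimate = max(ys)
        else:
            # Ordinary least-squares line through the observed points.
            cov = sum((x - mean_x) * (y - mean_y) for x, y in self.history)
            slope = cov / var_x
            estimate = mean_y + slope * (input_mb - mean_x)
        return max(estimate, 0.0) * (1.0 + self.safety_margin)


def next_request_after_failure(failed_request_mb: float,
                               node_limit_mb: float,
                               factor: float = 2.0) -> float:
    """Enlarge a request that ran out of memory, capped at the node's capacity."""
    return min(failed_request_mb * factor, node_limit_mb)


if __name__ == "__main__":
    predictor = MemoryPredictor()
    predictor.observe(input_mb=100, peak_mb=1000)
    predictor.observe(input_mb=200, peak_mb=2000)
    print(predictor.predict(300))
    print(next_request_after_failure(3000, node_limit_mb=8000))
```

In an actual extension, the prediction would be handed to the resource manager as the task's memory request, and the retry rule would apply when the manager reports that a task was killed for exceeding its limit.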

The quality of the developed methods should be evaluated using resource management systems and real-world workflows.

Requirements:

Knowledge of machine learning techniques and advanced Python skills.

Start: Immediately

Contact: Jonathan Bader (jonathan.bader ∂ tu-berlin.de)