On this page, we describe the software prototypes and releases that came out of our research, as we are eager to share our results not only in the form of publications, but also, wherever possible, as open source software.
LEAF is a simulator for Large Energy-Aware Fog computing environments. It enables the modeling of complex application graphs in distributed, heterogeneous, and resource-constrained infrastructures, and supports research on scheduling and placement algorithms in such environments. A special emphasis was put on the modeling of energy consumption (and soon carbon emissions).
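To illustrate the kind of energy modeling LEAF emphasizes, the following is a minimal, hypothetical sketch of a linear power model for infrastructure nodes. The class and attribute names are our own illustration, not LEAF's actual API:

```python
# Illustrative sketch of energy-aware infrastructure modeling in the spirit
# of LEAF; class and attribute names are hypothetical, not LEAF's API.

class ComputeNode:
    """A node with a linear power model: static (idle) power plus a
    utilization-proportional share up to the node's maximum power."""

    def __init__(self, name, max_usage, static_power, max_power):
        self.name = name
        self.max_usage = max_usage        # available capacity, e.g. MIPS
        self.used = 0.0                   # currently allocated capacity
        self.static_power = static_power  # watts when idle
        self.max_power = max_power        # watts at full utilization

    def place(self, demand):
        """Allocate part of this node's capacity to an application task."""
        if self.used + demand > self.max_usage:
            raise ValueError(f"{self.name} cannot host demand {demand}")
        self.used += demand

    def power(self):
        """Current power draw under the linear model."""
        utilization = self.used / self.max_usage
        return self.static_power + utilization * (self.max_power - self.static_power)


# A tiny fog infrastructure: one edge device, one cloud node.
edge = ComputeNode("edge", max_usage=100, static_power=2.0, max_power=6.0)
cloud = ComputeNode("cloud", max_usage=10_000, static_power=30.0, max_power=200.0)

edge.place(50)     # edge node half utilized
cloud.place(1000)  # cloud node 10 % utilized

total_power = edge.power() + cloud.power()  # watts drawn by the infrastructure
```

A simulator like LEAF tracks such power values over time and across application graph placements; this sketch only shows the per-node model at a single point in time.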
Links: Github repository, documentation, and conference presentation.
Publication: LEAF: Simulating Large Energy-Aware Fog Computing Environments
Main contact: Philipp
C3O is a cluster configuration system for distributed dataflow jobs running on public clouds. It chooses a machine type and a scale-out with the goal of reaching the user’s runtime target with a certain confidence and in the most cost-efficient manner.
It contains specialized runtime models that take the execution context (i.e., runtime-influencing job parameters and dataset characteristics) into account. This allows runtime data to be shared among many users with different execution contexts.
The C3O cluster configurator is implemented as a Python-based command line tool. Its repository also includes example training data, gathered from 930 distinct Spark job executions.
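The core selection idea can be sketched as follows: among candidate (machine type, scale-out) pairs with predicted runtimes, choose the cheapest one that still meets the runtime target with some headroom. All names, prices, and the safety-margin heuristic below are illustrative assumptions, not C3O's actual model:

```python
# Hedged sketch of cost-efficient cluster configuration: pick the cheapest
# (machine type, scale-out) pair whose predicted runtime meets the target.
# Candidate data, prices, and the safety margin are hypothetical.

CANDIDATES = [
    # (machine_type, scale_out, price_per_node_hour_usd, predicted_runtime_hours)
    ("m5.xlarge", 4, 0.192, 2.1),
    ("m5.xlarge", 8, 0.192, 1.2),
    ("c5.xlarge", 8, 0.170, 1.4),
    ("c5.xlarge", 12, 0.170, 0.9),
]

def choose_config(candidates, runtime_target_hours, safety_margin=0.9):
    """Return the cheapest candidate whose predicted runtime stays within the
    target after applying a confidence headroom (safety_margin < 1 tightens
    the target, standing in for a proper confidence bound)."""
    feasible = [
        c for c in candidates
        if c[3] <= runtime_target_hours * safety_margin
    ]
    if not feasible:
        return None  # no candidate is expected to meet the target
    # Total cost = number of nodes * price per node-hour * predicted runtime.
    return min(feasible, key=lambda c: c[1] * c[2] * c[3])

best = choose_config(CANDIDATES, runtime_target_hours=1.5)
```

In C3O itself, the predicted runtimes come from context-aware runtime models trained on shared execution data rather than from a fixed table.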
Links: Github repository with documentation
Publication: C3O: Collaborative Cluster Configuration Optimization in Public Clouds
Main contact: Jonathan Will
We release the complete deployment code necessary to instantiate a cloud-based platform for processing data from large-scale water infrastructure monitoring campaigns. This research was conducted as part of the ongoing WaterGridSense 4.0 project, and more results will be added to our repositories as the project progresses.

The release includes parametrized Helm charts from which a Kubernetes cluster of arbitrary scale can be launched, as well as the code for the Apache Flink job that runs inside the cluster and enriches incoming sensor data with the required auxiliary information. The results of the data processing are published through a dedicated Apache Kafka topic, from which they can be retrieved for further processing or live visualization.
Repositories: Platform deployment charts and data enrichment job
Publication: A Scalable and Dependable Data Analytics Platform for Water Infrastructure Monitoring
Main contact: Morgan
Questions, Access, and Maintenance
If you have questions about the software, or cannot access any of the repositories linked on this page, please feel free to get in touch. We will try to maintain our code and respond to issues and pull requests as best we can.