QoS-aware Orchestration Across Cloud-to-Edge

Overview:

Deploying applications across cloud, fog, and edge resources is now routine in distributed computing, but ensuring Quality-of-Service (QoS) across these tiers remains challenging. While orchestration frameworks like Kubernetes and K3s simplify the mechanics of deployment, their schedulers remain largely reactive and rule-based, with no built-in support for learned decision-making based on historical or real-time metrics.

This thesis focuses on enhancing QoS-aware orchestration using machine learning (ML) techniques. Building on previous work that laid the architectural foundations (e.g., decentralized coordination, Raft consensus, Borda count ranking), this project shifts the focus toward predictive and adaptive orchestration strategies. The central goal is to develop and evaluate ML models that learn from system behavior (e.g., past deployments, resource usage, network latency) and can proactively steer placement and migration decisions for microservices.
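For readers unfamiliar with the Borda count ranking mentioned as part of the previous work, the idea can be sketched as follows. This is only an illustration under assumptions: the node names, the QoS criteria, and the function `borda_rank` are hypothetical, and the scoring used in the earlier work may differ.

```python
def borda_rank(rankings):
    """Aggregate per-criterion rankings into one ordering via Borda count.

    rankings: dict mapping a criterion name to a list of node names,
              ordered from best to worst for that criterion.
    Returns node names sorted by total Borda score (highest first).
    """
    scores = {}
    for ordered in rankings.values():
        n = len(ordered)
        for position, node in enumerate(ordered):
            # Best node gets n-1 points, worst gets 0.
            scores[node] = scores.get(node, 0) + (n - 1 - position)
    # Ties are broken alphabetically to keep the result deterministic.
    return sorted(scores, key=lambda node: (-scores[node], node))


# Hypothetical per-criterion rankings of three candidate nodes.
rankings = {
    "latency": ["edge-1", "fog-1", "cloud-1"],
    "cost": ["cloud-1", "fog-1", "edge-1"],
    "energy": ["fog-1", "edge-1", "cloud-1"],
}

print(borda_rank(rankings))  # ['fog-1', 'edge-1', 'cloud-1']
```

Such a ranking is static with respect to the inputs it is given; the ML models targeted by this thesis would instead supply predicted, rather than merely observed, values for criteria of this kind.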

Possible approaches include reinforcement learning for scheduling, supervised learning for performance prediction, and clustering models that identify suitable workload groupings under varying QoS constraints such as energy efficiency, cost, and latency.
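To make the supervised performance-prediction idea more concrete, here is a minimal sketch that fits one linear latency model per node from past observations and places a service on the node with the lowest predicted latency. Everything in it is an assumption for illustration: the node names, the sample data, the single CPU-utilization feature, and the helper names `fit_line` and `choose_node`. A real model would use richer features and a proper ML library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def choose_node(history, current_load):
    """Predict latency per node at its current load and pick the minimum.

    history: node -> (list of CPU utilizations, list of latencies in ms)
    current_load: node -> current CPU utilization (0.0 to 1.0)
    """
    predictions = {}
    for node, (cpu, latency) in history.items():
        slope, intercept = fit_line(cpu, latency)
        predictions[node] = slope * current_load[node] + intercept
    best = min(predictions, key=predictions.get)
    return best, predictions


# Hypothetical historical measurements (made-up numbers).
history = {
    "edge-1": ([0.2, 0.4, 0.6, 0.8], [10.0, 14.0, 18.0, 22.0]),
    "cloud-1": ([0.2, 0.4, 0.6, 0.8], [31.0, 32.0, 33.0, 34.0]),
}
current_load = {"edge-1": 0.5, "cloud-1": 0.3}

best, predictions = choose_node(history, current_load)
print(best, predictions)
```

The point of the sketch is the decision flow (learn from history, predict, then place), not the model itself; a reinforcement learning or clustering approach would replace the prediction step with a different learned component.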

Research Questions:

This thesis lies at the intersection of ML, distributed systems, and edge-cloud orchestration.

Requirements:

Solid understanding of Kubernetes and container orchestration; strong programming skills (preferably in Go or Python); knowledge of machine learning techniques. Familiarity with YAML and CI/CD pipelines is a plus.

Start: Immediately

Contact: Ismail Aslan (aslan@tu-berlin.de)