AI-Ops
AI-Ops, short for Artificial Intelligence for IT Operations, represents a transformative approach to managing and optimizing complex IT environments.
Organizations face the challenge of keeping increasingly complex IT systems reliable and efficient.
AI-Ops applies artificial intelligence and machine learning to autonomously monitor, analyze, and optimize IT infrastructure, from servers and networks to applications and data.
This makes IT operations more agile and responsive, and it allows issues to be anticipated and addressed proactively, which reduces downtime and improves performance.
Ongoing Research
We currently work on multiple topics in this area:
People
Publications
- Failure Identification from Unstable Log Data using Deep Learning. Jasmin Bogatinovski, Sasho Nedelkoski, Li Wu, Jorge Cardoso, Odej Kao. In 2022 20th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), To Appear. IEEE/ACM, May 2022.
- A2Log: Attentive Augmented Log Anomaly Detection. Thorsten Wittkopp, Alexander Acker, Sasho Nedelkoski, Jasmin Bogatinovski, Dominik Scheinert, Wu Fan, Odej Kao. In Proceedings of the 55th Hawaii International Conference on System Sciences. 2022.
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models. Harald Ott, Jasmin Bogatinovski, Alexander Acker, Sasho Nedelkoski, and Odej Kao. In 43rd International Conference on Software Engineering, To Appear. ACM, 2021. [arXiv preprint]
- LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision. Thorsten Wittkopp, Philipp Wiesner, Dominik Scheinert, Alexander Acker. In International Conference on Service-Oriented Computing, 700-707. 2021
- A Taxonomy of Anomalies in Log Data. Thorsten Wittkopp, Philipp Wiesner, Dominik Scheinert, Odej Kao. In International Conference on Service-Oriented Computing. 2021
- Self-supervised Log Parsing. Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, and Odej Kao. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2020, pages 1–742, 2020. [arXiv preprint]
- Self-attentive Classification-based Anomaly Detection in Unstructured Logs. Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, and Odej Kao. In ICDM 2020: 20th IEEE International Conference on Data Mining, pages 1196–1201. IEEE, 2020. [arXiv preprint]
- Learning dependencies in distributed cloud applications to identify and localize anomalies. Dominik Scheinert, Alexander Acker, Lauritz Thamsen, Morgan K. Geldenhuys, and Odej Kao. In 43rd International Conference on Software Engineering, To appear. ACM, 2021.
- Self-Supervised Anomaly Detection from Distributed Traces. Jasmin Bogatinovski, Sasho Nedelkoski, Jorge Cardoso and Odej Kao, 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), Leicester, UK, 2020, pp. 342-347.
- Multi-Source Anomaly Detection in Distributed IT Systems. Jasmin Bogatinovski, Sasho Nedelkoski. In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020. Springer. [arXiv preprint]
- Autoencoder-based condition monitoring and anomaly detection method for rotating machines. Sabtain Ahmad, Kevin Styp-Rekowski, Sasho Nedelkoski, and Odej Kao. In 2020 IEEE International Conference on Big Data, To appear. IEEE, 2020.
- Telesto: A graph neural network model for anomaly classification in cloud services. Dominik Scheinert and Alexander Acker. In 18th International Conference on Service-Oriented Computing, To appear. Springer, 2020
- Anomaly detection and levels of automation for AI-supported system administration. Anton Gulenko, Odej Kao, and Florian Schmidt. In Annual International Symposium on Information Management and Big Data, pages 1–7. Springer, 2019.
- Anomaly detection and classification using distributed tracing and deep learning. Sasho Nedelkoski, Jorge Cardoso, and Odej Kao. In 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pages 241–250. IEEE/ACM, May 2019.
- Anomaly detection from system tracing data using multimodal deep learning. Sasho Nedelkoski, Jorge Cardoso, and Odej Kao. In 2019 IEEE 12th International Conference on Cloud Computing (CLOUD), pages 179–186. IEEE, July 2019.
- Unsupervised anomaly alerting for IOT-gateway monitoring using adaptive thresholds and half-space trees. René Wetzig, Anton Gulenko and Florian Schmidt. In 2019 Sixth International Conference on Internet of Things: Systems, Management and Security (IOTSMS), pages 161–168. IEEE, October 2019.
- Detecting anomalous behavior of black-box services modeled with distance-based online clustering. Anton Gulenko, Florian Schmidt, Alexander Acker, Marcel Wallschläger, Odej Kao, and Feng Liu. In 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pages 912–915. IEEE, 2018.
- Unsupervised anomaly event detection for cloud monitoring using online ARIMA. Florian Schmidt, Florian Suri-Payer, Anton Gulenko, Marcel Wallschläger, Alexander Acker, and Odej Kao. In 2018 IEEE/ACM International Conference on Utility and Cloud Computing (UCC), pages 71–76. IEEE, December 2018.
- IFTM-unsupervised anomaly detection for virtualized network function services. Florian Schmidt, Anton Gulenko, Marcel Wallschläger, Alexander Acker, Vincent Hennig, Feng Liu, and Odej Kao. In 2018 IEEE International Conference on Web Services (ICWS), pages 187–194. IEEE, 2018.
- Unsupervised anomaly event detection for VNF service monitoring using multivariate online ARIMA. Florian Schmidt, Florian Suri-Payer, Anton Gulenko, Marcel Wallschläger, Alexander Acker, and Odej Kao. In 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), CloudCom 2018, pages 278–283. IEEE, December 2018.
- Anomaly detection for black box services in edge clouds using packet size distribution. Marcel Wallschläger, Anton Gulenko, Florian Schmidt, Alexander Acker, and Odej Kao. In 2018 7th IEEE International Conference on Cloud Networking (CloudNet), CloudNet 2018, pages 1–6. IEEE, October 2018.
- Patient-individual morphological anomaly detection in multi-lead electrocardiography data streams. Alexander Acker, Florian Schmidt, Anton Gulenko, Reinhard Kietzmann, and Odej Kao. In Big Data (Big Data), 2017 IEEE International Conference on, pages 3841–3846. IEEE, 2017.
- Automated anomaly detection in virtualized services using deep packet inspection. Marcel Wallschläger, Anton Gulenko, Florian Schmidt, Odej Kao, and Feng Liu. Procedia Computer Science, pages 510–515, 2017.
- Evaluating machine learning algorithms for anomaly detection in clouds. Anton Gulenko, Marcel Wallschläger, Florian Schmidt, Odej Kao, and Feng Liu. In Big Data (Big Data), 2016 IEEE International Conference, pages 2716–2721. IEEE, 2016.
- A system architecture for real-time anomaly detection in large-scale NFV systems. Anton Gulenko, Marcel Wallschläger, Florian Schmidt, Odej Kao, and Feng Liu. Procedia Computer Science, 94:491–496, 2016.
- Performance diagnosis in cloud microservices using deep learning. Li Wu, Jasmin Bogatinovski, Sasho Nedelkoski, Johan Tordsson, and Odej Kao. In 18th International Conference on Service-Oriented Computing, To appear, Dubai, United Arab Emirates, December 2020. Springer.
- MicroRAS: Automatic recovery in the absence of historical failure data for microservice systems. Li Wu, Johan Tordsson, Alexander Acker, and Odej Kao. In 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), pages 227–236. IEEE, 2020.
- MicroRCA: Root cause localization of performance issues in microservices. Li Wu, Johan Tordsson, Erik Elmroth, and Odej Kao. In NOMS 2020 IEEE/IFIP Network Operations and Management Symposium, pages 1–9. IEEE, 2020.
- Towards a Cognitive Compute Continuum: An Architecture for Ad-Hoc Self-Managed Swarms. Ana Juan Ferrer, Soeren Becker, Florian Schmidt, Lauritz Thamsen, and Odej Kao. CCGrid, 2021.
- Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper. Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Florian Schmidt, Thorsten Wittkopp, Soeren Becker, Jorge Cardoso, and Odej Kao. arXiv:2101.06054, 2021.
- Towards AIOps in Edge Computing Environments. Soeren Becker, Florian Schmidt, Anton Gulenko, Alexander Acker, and Odej Kao. International Conference on Big Data, 2020.
- AI-governance and levels of automation for AIOps-supported system administration. Anton Gulenko, Alexander Acker, Odej Kao, and Feng Liu. In The 29th International Conference on Computer Communications and Networks, pages 1–6. IEEE, 2020.
- Multi-source distributed system data for AI-powered analytics. Sasho Nedelkoski, Jasmin Bogatinovski, Ajay Mandapati, Jorge Cardoso, and Odej Kao. In ESOCC 2020: European Conference On Service-Oriented And Cloud Computing, pages 161–176. Springer International Publishing, September 2020.
- Bitflow: An In Situ Stream Processing Framework. Anton Gulenko, Alexander Acker, Florian Schmidt, Soeren Becker, and Odej Kao. International Conference on Autonomic Computing and Self-Organizing Systems, 2020.
- Online density grid pattern analysis to classify anomalies in cloud and NFV systems. Alexander Acker, Florian Schmidt, Anton Gulenko, and Odej Kao. In 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), CloudCom 2018, pages 290–295. IEEE, December 2018.
- High available deployment of cloud-based virtualized network functions. Saeed Haddadi Makhsous, Anton Gulenko, Odej Kao, and Feng Liu. In High Performance Computing & Simulation (HPCS), 2016 International Conference on, pages 468–475. IEEE, 2016.