Paper |
Title |
Page |
TH2AO02 |
High Availability Alarm System Deployed with Kubernetes |
1134 |
|
- J.J. Bellister, T. Schwander, T. Summers
SLAC, Menlo Park, California, USA
|
|
|
To support multiple scientific facilities at SLAC, a modern alarm system designed for availability, integrability, and extensibility is required. The new alarm system deployed at SLAC fulfills these requirements by blending the Phoebus alarm server with existing open-source technologies for deployment, management, and visualization. To deliver a high-availability deployment, Kubernetes was chosen for orchestration of the system. By deploying all parts of the system as containers with Kubernetes, each component becomes robust to failures, self-healing, and readily recoverable. Well-supported Kubernetes Operators were selected to manage Kafka and Elasticsearch in accordance with current best practices, using high-level declarative deployment files to shift deployment details into the software itself and facilitate nearly seamless future upgrades. An automated process based on git-sync allows for automated restarts of the alarm server when configuration files change eliminating the need for sysadmin intervention. To encourage increased accelerator operator engagement, multiple interfaces are provided for interacting with alarms. Grafana dashboards offer a user-friendly way to build displays with minimal code, while a custom Python client allows for direct consumption from the Kafka message queue and access to any information logged by the system.
|
|
|
Slides TH2AO02 [0.798 MB]
|
|
DOI • |
reference for this paper
※ doi:10.18429/JACoW-ICALEPCS2023-TH2AO02
|
|
About • |
Received ※ 06 October 2023 — Revised ※ 09 October 2023 — Accepted ※ 14 December 2023 — Issued ※ 18 December 2023 |
Cite • |
reference for this paper using
※ BibTeX,
※ LaTeX,
※ Text/Word,
※ RIS,
※ EndNote (xml)
|
|
|